Sliding Window Informative Canonical Correlation Analysis
Abstract
Canonical correlation analysis (CCA) is a technique for finding correlated sets of features between two datasets. In this paper, we propose a novel extension of CCA to the online, streaming data setting: Sliding Window Informative Canonical Correlation Analysis (SWICCA). Our method uses a streaming principal component analysis (PCA) algorithm as a backend and uses these outputs combined with a small sliding window of samples to estimate the CCA components in real time. We motivate and describe our algorithm, provide numerical simulations to characterize its performance, and provide a theoretical performance guarantee. The SWICCA method is applicable and scalable to extremely high dimensions, and we provide a real-data example that demonstrates this capability.