70  Independent Component Analysis

Independent Component Analysis (ICA) is a method quite similar to Principal Component Analysis. PCA aims to create a transformation that maximizes the variance of the resulting variables, while making them uncorrelated. ICA, on the other hand, aims to create variables that are statistically independent. Note that the ICA components are not assumed to be uncorrelated or orthogonal.

This stronger requirement allows ICA to pull apart signals in your data that PCA would leave mixed together. ICA also does not assume that the data is Gaussian; in fact, it relies on the underlying sources being non-Gaussian.
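A quick base R sketch of why independence is a stronger requirement than uncorrelatedness: a variable and its square can be (nearly) uncorrelated while being completely dependent.

```r
# Zero correlation does not imply independence:
# y is a deterministic function of x, yet nearly uncorrelated with it.
set.seed(1234)
x <- runif(10000, min = -1, max = 1)
y <- x^2

cor(x, y)       # near 0: x and y are almost uncorrelated
cor(abs(x), y)  # near 1: y is in fact fully determined by x
```

PCA, which only looks at covariances, is blind to this kind of dependence; ICA's independence criterion is not.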

One way to think about the difference between PCA and ICA: PCA is most effective as a data compression technique, while ICA helps uncover and separate the structure in the data itself.

ICA is often treated as a dimensionality reduction method because fastICA, the most commonly used implementation, can extract components one at a time, so you can request fewer components than you have variables.

ICA, much like PCA, requires that your data be normalized before it is applied.
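As a sketch of that workflow outside of recipes, here is normalization followed by ICA via the fastICA package. This assumes fastICA is installed; the sources and mixing matrix below are made up for illustration.

```r
library(fastICA)

set.seed(1234)
# Two independent, non-Gaussian sources, mixed linearly
s <- cbind(runif(500), rbinom(500, size = 1, prob = 0.5))
x <- s %*% matrix(c(1, 1, -1, 2), nrow = 2)

# Normalize (center and scale) each column before applying ICA
x_scaled <- scale(x)

ica_fit <- fastICA(x_scaled, n.comp = 2)
dim(ica_fit$S)  # estimated sources: 500 rows, 2 components
```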

Below is an example of the components in action. We took the MNIST database and performed ICA with the pixel values as predictors. First, we apply it to the entire data set.

The training set contains 60,000 images, each 28 by 28 pixels.

[Figure: ICA applied to all of MNIST. Faceted tile chart, one panel per component; early components show shapes suggestive of digits such as 6 or 7, while later components show less and less identifiable patterns.]

We clearly see some effects here. Remember that the sign of a component is arbitrary; what matters is that one region differs from another. The signals are not especially strong, but it appears that we are capturing 7-ness in the first IC and 6-ness in the third IC.

70.2 Pros and Cons

70.2.1 Pros

  • Can identify signals that variance-based methods such as PCA leave mixed

70.2.2 Cons

  • Sensitive to noise and outliers
  • Computationally intensive

70.3 R Examples

We will be using the ames data set for these examples.

library(recipes)
library(modeldata)
library(dplyr)

ames_num <- ames |>
  select(where(is.numeric))

{recipes} provides step_ica(), which is the standard way to perform ICA.

ica_rec <- recipe(~ ., data = ames_num) |>
  step_normalize(all_numeric_predictors()) |>
  step_ica(all_numeric_predictors())

ica_rec |>
  prep() |>
  bake(new_data = NULL) |>
  glimpse()
Rows: 2,930
Columns: 5
$ IC1 <dbl> 1.98722112, -0.19544103, 1.52173527, 1.70182755, -0.41432529, -0.3…
$ IC2 <dbl> 0.7781170, 0.9250619, 1.3102098, -0.1816165, -0.6315028, -0.747662…
$ IC3 <dbl> 0.37976490, -0.51412721, 0.80773773, -0.11337596, 0.64724658, 0.54…
$ IC4 <dbl> -0.115604620, 0.924486173, -0.788614494, -0.441096097, 0.394126431…
$ IC5 <dbl> 0.82127411, 0.57740102, 0.43923180, 0.81572982, -1.01902720, -1.06…
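By default step_ica() extracts num_comp = 5 components, which is where the five IC columns above come from; pass num_comp to change it. A minimal sketch:

```r
library(recipes)
library(modeldata)
library(dplyr)

ames_num <- ames |>
  select(where(is.numeric))

# Same recipe as before, but asking for 3 components instead of 5
ica_3 <- recipe(~ ., data = ames_num) |>
  step_normalize(all_numeric_predictors()) |>
  step_ica(all_numeric_predictors(), num_comp = 3) |>
  prep() |>
  bake(new_data = NULL)

ncol(ica_3)  # 3
```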

70.4 Python Examples