library(recipes)
library(modeldata)
ames_num <- ames |>
  select(where(is.numeric))
70 Independent Component Analysis
70.1 Independent Component Analysis
Independent Component Analysis (ICA) is a method quite similar to Principal Component Analysis (PCA). PCA aims to create a transformation that maximizes the variance of the resulting variables while making them uncorrelated. ICA, on the other hand, aims to create variables that are statistically independent, which is a stronger requirement than being uncorrelated. Note that, unlike the principal components, the ICA directions are not constrained to be orthogonal.
This allows ICA to pull out stronger signals in your data. It also doesn't assume that the data is Gaussian.
One way to think about the difference between PCA and ICA is that PCA is more effective as a data compression technique, whereas ICA helps uncover and separate the structure in the data itself.
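To make the difference between "uncorrelated" and "independent" concrete, here is a minimal base R sketch on simulated data. The uniform sources and the mixing matrix `A` are made up purely for illustration. PCA scores come out uncorrelated by construction, yet their squares remain clearly correlated, which shows that the scores are still dependent on each other.

```r
set.seed(1)
n <- 5000

# Two independent, non-Gaussian (uniform) sources -- simulated, illustrative only
s <- cbind(runif(n, -1, 1), runif(n, -1, 1))

# Hypothetical mixing matrix: each observed variable is a blend of both sources
A <- matrix(c(1, 0.5, 0.3, 1), nrow = 2)
x <- s %*% A

# PCA on the normalized mixtures
pca <- prcomp(x, center = TRUE, scale. = TRUE)
z <- pca$x

# The PCA scores are uncorrelated (zero up to floating point error)...
cor_scores <- cor(z[, 1], z[, 2])

# ...but their squares are still correlated, so the scores are not independent
cor_squares <- cor(z[, 1]^2, z[, 2]^2)

cor_scores
cor_squares
```

ICA goes one step further than this and looks for a transformation under which such leftover dependence disappears as well.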
ICA is often regarded as a dimensionality reduction method because fastICA, the commonly used implementation, works incrementally, which lets you stop after extracting a chosen number of components.
ICA, much like PCA, requires that your data be normalized before it is applied.
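As a rough sketch of what this looks like outside of a recipe, the {fastICA} package can be called directly on a normalized matrix. This assumes {fastICA} is installed, uses the built-in `iris` data only as a small stand-in, and assumes that the incremental behaviour described above corresponds to the `"deflation"` algorithm type, which extracts components one at a time.

```r
# Assumes the fastICA package is installed
library(fastICA)

set.seed(1234)

# Normalize the numeric columns first (iris is only a stand-in data set)
x <- scale(as.matrix(iris[, 1:4]))

# Ask for 2 components; "deflation" extracts them one at a time
ica_fit <- fastICA(x, n.comp = 2, alg.typ = "deflation")

# S holds the estimated independent components, one column per component
dim(ica_fit$S)
```

The default `alg.typ` is `"parallel"`, which estimates all requested components simultaneously. Either way, `n.comp` controls how many components are returned.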
70.2 Pros and Cons
70.2.1 Pros
- Can identify stronger signals
70.2.2 Cons
- Sensitive to noise and outliers
- Computationally intensive
70.3 R Examples
We will be using the ames data set for these examples.
{recipes} provides step_ica(), which is the standard way to perform ICA.
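# Normalize the numeric predictors, then extract independent components
ica_rec <- recipe(~ ., data = ames_num) |>
  step_normalize(all_numeric_predictors()) |>
  step_ica(all_numeric_predictors())
# Estimate the recipe and inspect the transformed training data
ica_rec |>
  prep() |>
  bake(new_data = NULL) |>
  glimpse()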
Rows: 2,930
Columns: 5
$ IC1 <dbl> -0.37052169, 0.51413974, -0.80280637, 0.12280549, -0.65078105, -0.…
$ IC2 <dbl> -0.104340006, 0.924875720, -0.778555091, -0.433058679, 0.391229969…
$ IC3 <dbl> -1.99091152, 0.20060881, -1.53014169, -1.70455188, 0.41543549, 0.3…
$ IC4 <dbl> 0.7762583, 0.9242792, 1.3091516, -0.1819816, -0.6325621, -0.748529…
$ IC5 <dbl> 0.81982434, 0.57624587, 0.44016790, 0.81289679, -1.01678079, -1.06…
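The output above contains five components, which is the default for the `num_comp` argument of step_ica(). If a different number is wanted, a sketch along the following lines should work. The object names `ica_rec_3` and `ica_3` are made up for this example, and it assumes the packages step_ica() relies on are installed.

```r
library(recipes)
library(modeldata)

ames_num <- ames |>
  select(where(is.numeric))

# Same recipe as before, but requesting 3 components instead of the default 5
ica_rec_3 <- recipe(~ ., data = ames_num) |>
  step_normalize(all_numeric_predictors()) |>
  step_ica(all_numeric_predictors(), num_comp = 3)

ica_3 <- ica_rec_3 |>
  prep() |>
  bake(new_data = NULL)

names(ica_3)
```

Since `num_comp` is a regular step argument, it can also be treated as a tuning parameter when the recipe is part of a larger modeling workflow.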