25  Leave One Out Encoding

Leave One Out Encoding is a variation on target encoding (Chapter 23). Where target encoding takes the mean of the target over all rows within each categorical level, leave one out encoding excludes the value of the current row from that mean.

One of the main downsides to this approach is that it needs the target, which is most often the outcome and as such not available for the test data set. The row-wise adjustment therefore cannot be applied at prediction time, and leave one out encoding behaves exactly like target encoding on the test data set.
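This train/test asymmetry can be made concrete with a small sketch in plain Python. The function names and structure here are illustrative, not taken from any library: fitting stores per-level sums and counts, training rows get the row-excluded mean, and test rows fall back to the plain level mean.

```python
def fit_level_stats(categories, targets):
    """Collect per-level target sums and counts from the training data."""
    stats = {}
    for cat, y in zip(categories, targets):
        total, count = stats.get(cat, (0.0, 0))
        stats[cat] = (total + y, count + 1)
    return stats

def encode_train(categories, targets, stats):
    """Leave one out: exclude the current row from its level's mean."""
    return [
        (stats[cat][0] - y) / (stats[cat][1] - 1)
        for cat, y in zip(categories, targets)
    ]

def encode_test(categories, stats):
    """No target available: use the plain level mean,
    identical to target encoding."""
    return [stats[cat][0] / stats[cat][1] for cat in categories]

cats = ["a"] * 6
ys = [100, 10, 6, 5, 3, 8]
stats = fit_level_stats(cats, ys)
print(encode_train(cats, ys, stats))  # row-wise adjusted values
print(encode_test(["a"], stats))      # [22.0], the target-encoded mean
```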

What this does in practice is shift the influence of an outlier within each level away from the whole group and onto the outlier's own row. Consider a level with the target values 100, 10, 6, 5, 3, and 8. The target encoded value would be 22 for every row, while the leave one out values each differ slightly, with by far the largest change at the outlier 100, which receives the mean of the remaining values.

library(tidyverse)
tibble(
  values = c(100, 10, 6, 5, 3, 8)
) |>
  mutate(target = mean(values)) |>
  mutate(`leave one out` = (sum(values) - values) / (n() - 1)) |>
  knitr::kable()
 values   target   leave one out
    100       22             6.4
     10       22            24.4
      6       22            25.2
      5       22            25.4
      3       22            25.8
      8       22            24.8

Thus target encoding and leave one out encoding are influenced by outliers in different ways. Which type of influence is preferable is up to you, the practitioner, to determine based on your data and modeling problem.

25.2 Pros and Cons

25.2.1 Pros

  • Doesn’t hide the effect of outliers the way target encoding does.
  • Can deal with categorical variables with many levels.
  • Can deal with unseen levels in a sensible way.
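For unseen levels, a sensible default is to fall back to the overall training mean of the target, since no level-specific information exists. The sketch below is illustrative plain Python, not any library's implementation; the function name is made up for the example.

```python
def encode_unseen_safe(categories, level_means, global_mean):
    """Map each level to its learned mean; unseen levels
    get the global training mean as a fallback."""
    return [level_means.get(cat, global_mean) for cat in categories]

level_means = {"a": 22.0}  # learned from training data
print(encode_unseen_safe(["a", "new_level"], level_means, 50.0))
# [22.0, 50.0]
```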

25.2.2 Cons

  • Only differs meaningfully from target encoding on the training data set.
  • Can be prone to overfitting.
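One common mitigation for the overfitting is to perturb the training-time encodings with random noise so a downstream model cannot memorize them exactly (category_encoders' LeaveOneOutEncoder exposes a sigma argument along these lines). The noise scheme below is a minimal illustrative sketch, not any library's exact implementation:

```python
import random

def noisy_loo(values, sigma=0.05, seed=0):
    """Leave-one-out means for a single level, each multiplied by
    Gaussian noise centered at 1 (illustrative mitigation only)."""
    total, n = sum(values), len(values)
    rng = random.Random(seed)
    return [
        ((total - y) / (n - 1)) * rng.gauss(1.0, sigma)
        for y in values
    ]

print(noisy_loo([100, 10, 6, 5, 3, 8]))
```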

25.3 R Examples

Has not yet been implemented.

25.4 Python Examples

We are using the ames data set for examples. {category_encoders} provides the LeaveOneOutEncoder() method we can use. For this to work, we need to remember to specify an outcome when we call fit().

from feazdata import ames
from sklearn.compose import ColumnTransformer
from category_encoders.leave_one_out import LeaveOneOutEncoder

ct = ColumnTransformer(
    [('loo', LeaveOneOutEncoder(), ['Neighborhood'])], 
    remainder="passthrough")

ct.fit(ames, y=ames[["Sale_Price"]].values.flatten())
ColumnTransformer(remainder='passthrough',
                  transformers=[('loo', LeaveOneOutEncoder(),
                                 ['Neighborhood'])])
ct.transform(ames).filter(regex="loo.*")
      loo__Neighborhood
0            145097.350
1            145097.350
2            145097.350
3            145097.350
4            190646.576
...                 ...
2925         162226.632
2926         162226.632
2927         162226.632
2928         162226.632
2929         162226.632

[2930 rows x 1 columns]