113 🏗️ Log Interval
113.1 Log Interval
WIP
113.2 Pros and Cons
113.2.1 Pros
113.2.2 Cons
113.3 R Examples
113.4 Python Examples