Model Concepts

Bias-Variance Tradeoff

The best predictive model is one that generalizes well: it gives accurate predictions on new, previously unseen data. Two concepts frame this goal: bias and variance.

High bias results from underfitting the model. It usually stems from erroneous assumptions, which make the model too general (too simple) to capture the underlying pattern.

High variance results from overfitting the model: it predicts the training dataset very accurately, but performs poorly on new, unseen datasets. This is because it fits even the slightest noise in the data.

The best model, with the highest accuracy on unseen data, lies in the middle ground between the two.

(Figure from Andrew Ng's lecture)

Regularization

Regularization is an important concept for some supervised models. It prevents overfitting by constraining the model, thus lowering its complexity.

In regression, regularization is controlled by the hyperparameter alpha: a higher alpha means more regularization and a simpler model.
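The effect of alpha is easy to see with ridge regression, which has a closed-form solution. The sketch below uses a synthetic dataset (the coefficients and sizes are illustrative) and shows that the coefficient vector shrinks as alpha grows:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression data (true coefficients chosen for illustration).
X = rng.normal(size=(50, 5))
y = X @ np.array([3.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(scale=0.1, size=50)

def ridge_coefficients(X, y, alpha):
    # Closed-form ridge solution: w = (X^T X + alpha * I)^{-1} X^T y
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

norms = {a: np.linalg.norm(ridge_coefficients(X, y, a)) for a in (0.0, 1.0, 100.0)}
# Higher alpha shrinks the coefficient vector toward zero: a simpler model.
```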

Regularization  Model   Description
L1              LASSO   Penalizes the sum of the absolute values of the coefficients; can shrink unimportant features' regression coefficients to exactly 0
L2              Ridge   Penalizes the sum of the squared coefficients; shrinks all coefficients toward 0, but not exactly to 0
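The reason L1 can zero out coefficients while L2 only shrinks them comes down to the soft-thresholding operator, which is the proximal step of the L1 penalty. A minimal sketch (the coefficient values are illustrative; with an orthonormal design, the LASSO solution is exactly the soft-thresholded least-squares solution):

```python
import numpy as np

def soft_threshold(w, alpha):
    # Proximal operator of the L1 penalty: values with magnitude below
    # alpha snap to exactly 0; larger values shrink by alpha.
    return np.sign(w) * np.maximum(np.abs(w) - alpha, 0.0)

# Illustrative least-squares coefficients for four features.
ols_coefs = np.array([3.0, -0.2, 0.05, 1.5])
lasso_coefs = soft_threshold(ols_coefs, alpha=0.5)
# lasso_coefs == [2.5, 0.0, 0.0, 1.0]: the two small coefficients
# become exactly 0, so those features drop out of the model.
```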

In classification, regularization is controlled by the hyperparameter C: a lower C means more regularization and a simpler model.

Model                Regularization     Description
SVC                  L2                 -
LinearSVC            L1/L2              Regularization type can be chosen
Logistic Regression  L1/L2/ElasticNet   Regularization type can be chosen
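The C convention can be sketched from scratch: an L2-regularized logistic regression penalizes the loss with a term weighted by 1/C, so a lower C means a heavier penalty. This is a minimal numpy implementation on synthetic data (dataset, learning rate, and step count are all illustrative), not scikit-learn's actual solver:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy binary classification data: the label is a linear rule on two features.
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def fit_logistic(X, y, C, steps=1000, lr=0.1):
    # Gradient descent on: mean log-loss + (1 / C) * 0.5 * ||w||^2.
    # Smaller C -> heavier L2 penalty (the scikit-learn convention).
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
        grad = X.T @ (p - y) / n + w / C   # loss gradient + penalty gradient
        w -= lr * grad
    return w

w_weak = fit_logistic(X, y, C=10.0)   # mild regularization
w_strong = fit_logistic(X, y, C=0.1)  # strong regularization
# The strongly regularized model ends with a smaller-norm weight vector.
```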