Regression

Regression is used when the label (the target variable) is continuous.

Three regressors that add regularization, LASSO, Ridge, and Elastic Net, are elaborated here, alongside the ordinary least squares baseline.

OLS Regression

Ordinary Least Squares (OLS) Regression is the most basic and fundamental form of regression. A best-fit line ŷ = a + bx is drawn using the ordinary least squares method, i.e., by minimizing the sum of squared residuals, the squared vertical distances from each (x, y) point to the regression line.
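
As a minimal sketch on made-up points, the slope b and intercept a can be computed directly from the least-squares formulas:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# slope: covariance of x and y divided by the variance of x
b = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
# intercept: the best-fit line passes through the point of means
a = y.mean() - b * x.mean()
print(a, b)  # a ≈ 0.15, b ≈ 1.94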

OLS can be conducted using the statsmodels package.

import statsmodels.formula.api as smf

model = smf.ols(formula='diameter ~ depth', data=df3).fit()
print(model.summary())



OLS Regression Results
==============================================================================
Dep. Variable:               diameter   R-squared:                       0.512
Model:                            OLS   Adj. R-squared:                  0.512
Method:                 Least Squares   F-statistic:                 1.895e+04
Date:                Tue, 02 Aug 2016   Prob (F-statistic):               0.00
Time:                        17:10:34   Log-Likelihood:                -51812.
No. Observations:               18067   AIC:                         1.036e+05
Df Residuals:                   18065   BIC:                         1.036e+05
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept      2.2523      0.054     41.656      0.000         2.146     2.358
depth         11.5836      0.084    137.675      0.000        11.419    11.749
==============================================================================
Omnibus:                    12117.030   Durbin-Watson:                   0.673
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           391356.565
Skew:                           2.771   Prob(JB):                         0.00
Kurtosis:                      25.117   Cond. No.                         3.46
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Or using the scikit-learn package.

from sklearn import linear_model

reg = linear_model.LinearRegression()
model = reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])

print(model)
# LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
# (the repr varies by scikit-learn version; recent versions print LinearRegression())
print(reg.coef_)
# array([ 0.5,  0.5])

# R-squared on an existing train/test split (X_train, y_train, X_test, y_test)
r2_train = model.score(X_train, y_train)
r2_test = model.score(X_test, y_test)

LASSO Regression

LASSO stands for Least Absolute Shrinkage and Selection Operator. It adds an L1 penalty, alpha times the sum of absolute coefficient values, to the least-squares objective. When alpha = 0 it reduces to a normal OLS regression; as alpha grows, more coefficients are shrunk exactly to zero.
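
A quick sketch on made-up data illustrates the effect of alpha (scikit-learn advises LinearRegression over alpha=0 for numerical reasons, so a tiny alpha stands in for zero here):

import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.1, 1.9, 3.2])

ols = LinearRegression().fit(X, y)
lasso_tiny = Lasso(alpha=1e-6).fit(X, y)  # practically identical to OLS
lasso_big = Lasso(alpha=1.0).fit(X, y)    # heavy shrinkage toward zero

print(ols.coef_, lasso_tiny.coef_, lasso_big.coef_)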

import pandas as pd
import numpy as np
from sklearn import preprocessing
from sklearn.model_selection import train_test_split  # sklearn.cross_validation in older releases
from sklearn.linear_model import LassoLarsCV
from sklearn.datasets import load_boston
from sklearn.metrics import mean_squared_error


# Load the Boston housing data; the target goes in the last column
boston = load_boston()
df = pd.DataFrame(boston.data, columns=boston.feature_names)
df['MEDV'] = boston.target

# Standardize each predictor to mean 0 and standard deviation 1
for i in df.columns[:-1]:
    df[i] = preprocessing.scale(df[i].astype('float64'))
df.describe()

feature = df[df.columns[:-1]]
target = df[df.columns[-1]]

train_feature, test_feature, train_target, test_target = \
    train_test_split(feature, target, random_state=123, test_size=0.2)

# 10-fold cross-validation chooses the LASSO shrinkage parameter
model = LassoLarsCV(cv=10, precompute=False)
model = model.fit(train_feature, train_target)


# Compare the regression coefficients, and see which one LASSO removed.
# LSTAT is the most important predictor, 
# followed by RM, DIS, and RAD. AGE is removed by LASSO

df2 = pd.DataFrame(model.coef_, index=feature.columns)
df2.sort_values(by=0, ascending=False)
# RM       3.050843
# RAD      2.040252
# ZN       1.004318
# B        0.629933
# CHAS     0.317948
# INDUS    0.225688
# AGE      0.000000
# CRIM    -0.770291
# NOX     -1.617137
# TAX     -1.731576
# PTRATIO -1.923485
# DIS     -2.733660
# LSTAT   -3.878356

train_error = mean_squared_error(train_target, model.predict(train_feature))
test_error = mean_squared_error(test_target, model.predict(test_feature))

# MSE
print('training data MSE')
print(train_error)
print('test data MSE')
print(test_error)


# R-square
rsquared_train = model.score(train_feature, train_target)
rsquared_test = model.score(test_feature, test_target)
print('training data R-square')
print(rsquared_train)
print('test data R-square')
print(rsquared_test)

Ridge Regression

Ridge regression adds an L2 penalty, alpha times the sum of squared coefficients, which shrinks coefficients toward zero but, unlike LASSO, does not set them exactly to zero.

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.preprocessing import MinMaxScaler

# X_crime, y_crime: feature matrix and target from a crime dataset loaded beforehand
X_train, X_test, y_train, y_test = train_test_split(X_crime, y_crime,
                                                    random_state=0)


# Rescale each feature to [0, 1]; fit the scaler on training data only,
# then apply the same transform to the test data
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)


linridge = Ridge(alpha=20.0).fit(X_train_scaled, y_train)

print('ridge regression linear model intercept: {}'
   .format(linridge.intercept_))
print('ridge regression linear model coeff:\n{}'
   .format(linridge.coef_))
print('R-squared score (train): {:.3f}'
   .format(linridge.score(X_train_scaled, y_train)))
print('R-squared score (test): {:.3f}'
   .format(linridge.score(X_test_scaled, y_test)))
print('Number of non-zero features: {}'
   .format(np.sum(linridge.coef_ != 0)))
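
The alpha of 20 above is a fixed guess; as a sketch, RidgeCV can instead pick alpha by cross-validating over a candidate grid (the grid below is illustrative):

from sklearn.linear_model import RidgeCV

linridge_cv = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0, 20.0, 100.0])
linridge_cv.fit(X_train_scaled, y_train)
print('best alpha: {}'.format(linridge_cv.alpha_))
print('R-squared score (test): {:.3f}'
   .format(linridge_cv.score(X_test_scaled, y_test)))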

Elastic Net

Elastic Net combines the penalties of ridge regression and lasso to get the best of both worlds.
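
A minimal sketch, reusing the scaled crime-data split from the ridge example above; l1_ratio mixes the two penalties (1.0 is pure LASSO, 0.0 is pure ridge), and the values below are illustrative guesses:

from sklearn.linear_model import ElasticNet

enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X_train_scaled, y_train)
print('R-squared score (test): {:.3f}'
   .format(enet.score(X_test_scaled, y_test)))
print('Number of non-zero features: {}'
   .format(np.sum(enet.coef_ != 0)))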

Tree Regressors

For each of the tree classifiers, there exists a regressor.

from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
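
Continuing from these imports, a minimal sketch on made-up data; the fit/predict API mirrors the classifier versions:

import numpy as np

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1.2, 1.9, 3.1, 3.9, 5.2])

tree = DecisionTreeRegressor(max_depth=2).fit(X, y)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

print(tree.predict([[2.5]]))
print(forest.predict([[2.5]]))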