K-fold Cross-Validation
A traditional train-test split (a single fold) runs the risk that the split is unwittingly biased towards certain features or labels.
By repeating the model training k times, with each iteration using a different training & validation split, we avoid such bias, though it is k times more computationally expensive.

cross_val_score is a compact function that obtains the scores across all k folds in one line.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

X = df[df.columns[1:-1]]
y = df['Cover_Type']

# using 5-fold cross validation mean scores
model = RandomForestClassifier()
cv_scores = cross_val_score(model, X, y,
                            scoring='accuracy', cv=5, n_jobs=-1)
print(np.mean(cv_scores))
For greater control, such as defining our own evaluation metrics, we can use KFold to obtain the train & test indexes for each fold iteration. Sklearn's grid & random searches also perform cross-validation together with model tuning (see the sketch after the custom example below).
import numpy as np
from functools import partial
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

def kfold_custom(X, y, model, eval_metric, folds=4):
    kf = KFold(n_splits=folds)
    score_total = []
    for train_index, test_index in kf.split(X):
        # split the data by the positional indexes of each fold
        X_train, y_train = X.iloc[train_index], y.iloc[train_index]
        X_test, y_test = X.iloc[test_index], y.iloc[test_index]
        model.fit(X_train, y_train)
        y_predict = model.predict(X_test)
        score = eval_metric(y_test, y_predict)
        score_total.append(score)
    return np.mean(score_total)

model = RandomForestClassifier()
# macro-averaged f1 since Cover_Type has more than two classes
kfold_custom(X, y, model, partial(f1_score, average='macro'))
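As a minimal sketch of the grid-search route mentioned above, GridSearchCV accepts the same cv argument and runs k-fold cross-validation for every parameter combination; the parameter grid here is an assumed example, not tuned values for this dataset.

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# assumed example grid; the parameters worth tuning depend on the problem
param_grid = {'n_estimators': [100, 200],
              'max_depth': [None, 10]}

grid = GridSearchCV(RandomForestClassifier(), param_grid,
                    scoring='accuracy', cv=5, n_jobs=-1)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)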
There are many other variants of cross-validation available in sklearn, as shown below.
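As one example of such a variant, StratifiedKFold preserves the class proportions of y in every fold, which helps with imbalanced labels; a minimal sketch reusing the X and y defined above:

from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.ensemble import RandomForestClassifier

# each fold keeps roughly the same class distribution as the full dataset
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(RandomForestClassifier(), X, y,
                            scoring='accuracy', cv=skf, n_jobs=-1)
print(cv_scores)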