
Confusion Matrix And Classification Report Of StratifiedKFold

I am using StratifiedKFold to check the performance of my classifier. I have two classes and I am trying to build a Logistic Regression classifier. Here is my code: skf = StratifiedKFo

Solution 1:

Cross-validation is used to assess the performance of a particular model or set of hyperparameters across different splits of a dataset. At the end you don't have a single final performance per se; you have the individual performance on each split and the aggregated performance across splits. You could use the tn, fn, fp, and tp counts from each split to compute an aggregated precision, recall, sensitivity, etc., but you could also just use the predefined functions for those metrics in sklearn and aggregate the results at the end.

e.g.

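import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import StratifiedKFold

# x (raw text documents) and y (binary labels) are assumed to be NumPy arrays,
# so they can be indexed with the index arrays returned by split().
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
accs, precs, recs = [], [], []
for train_index, test_index in skf.split(x, y):
    x_train, x_test = x[train_index], x[test_index]
    y_train, y_test = y[train_index], y[test_index]

    # Fit the vectorizer on the training fold only, to avoid leaking test data.
    tfidf = TfidfVectorizer()
    x_train = tfidf.fit_transform(x_train)
    x_test = tfidf.transform(x_test)

    clf = LogisticRegression(class_weight='balanced')
    clf.fit(x_train, y_train)
    y_pred = clf.predict(x_test)

    # Per-fold metrics, stored so they can be averaged after the loop.
    acc = accuracy_score(y_test, y_pred)
    prec = precision_score(y_test, y_pred)
    rec = recall_score(y_test, y_pred)
    accs.append(acc)
    precs.append(prec)
    recs.append(rec)
    print(f'Accuracy: {acc}, Precision: {prec}, Recall: {rec}')

print(f'Mean Accuracy: {np.mean(accs)}, Mean Precision: {np.mean(precs)}, Mean Recall: {np.mean(recs)}')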
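
The other option mentioned above, pooling the tn, fp, fn, tp counts across splits, could be sketched roughly as follows. This is only an illustration: the tiny text corpus, the label encoding (0 and 1) and n_splits=3 are made up so that the snippet runs on its own, and are not taken from the question. It sums the per-fold confusion matrices and builds one classification report from the pooled out-of-fold predictions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy corpus standing in for the question's x and y.
x = np.array([
    "great film loved it", "wonderful acting great plot",
    "loved the great cast", "really wonderful movie",
    "great fun loved it", "wonderful and great",
    "terrible film hated it", "awful acting terrible plot",
    "hated the awful cast", "really terrible movie",
    "awful bore hated it", "terrible and awful",
])
y = np.array([1] * 6 + [0] * 6)

skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
total_cm = np.zeros((2, 2), dtype=int)  # running sum of per-fold matrices
y_true_all, y_pred_all = [], []         # pooled out-of-fold labels

for train_index, test_index in skf.split(x, y):
    # Fit the vectorizer on the training fold only.
    tfidf = TfidfVectorizer()
    x_train = tfidf.fit_transform(x[train_index])
    x_test = tfidf.transform(x[test_index])

    clf = LogisticRegression(class_weight='balanced')
    clf.fit(x_train, y[train_index])
    y_pred = clf.predict(x_test)

    # labels=[0, 1] keeps every fold's matrix 2x2 with the same ordering.
    total_cm += confusion_matrix(y[test_index], y_pred, labels=[0, 1])
    y_true_all.extend(y[test_index])
    y_pred_all.extend(y_pred)

# For binary labels ordered [0, 1], ravel() yields tn, fp, fn, tp.
tn, fp, fn, tp = total_cm.ravel()
print(total_cm)

report = classification_report(y_true_all, y_pred_all, labels=[0, 1], zero_division=0)
print(report)
```

Note that metrics computed from the pooled counts can differ slightly from the mean of the per-fold metrics, because the pooled version weights every sample equally rather than every fold.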
