The MNIST database is a large collection of handwritten digit images. In the CSV files used here, each row holds one image as a 784-element vector of gray-scale pixel values (a flattened 28x28 image of a digit), together with the digit's label.
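
As a minimal sketch of this layout, the snippet below builds a synthetic 784-element vector (not real MNIST data) and reshapes it into a 28x28 grid. It assumes row-major pixel order, which is what `np.reshape` uses by default; the names `flat` and `pixel_at` are illustrative only.

```python
import numpy as np

# Synthetic stand-in for one row of pixel values (not real MNIST data)
flat = np.arange(784, dtype=float)

# Row-major reshape: element k lands at row k // 28, column k % 28
image = flat.reshape(28, 28)

def pixel_at(vector, row, col, width=28):
    """Return the pixel at (row, col) straight from the flat vector."""
    return vector[row * width + col]

print(image.shape)                          # (28, 28)
print(pixel_at(flat, 3, 5))                 # 89.0
print(image[3, 5] == pixel_at(flat, 3, 5))  # True
```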

In [1]:
from sklearn.neural_network import MLPClassifier
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Load the MNIST training and testing data files

In [2]:
# Training set: 60,000 rows, each with a label column plus 784 (28x28) pixel columns
data_train = pd.read_csv("mnist-train.csv")
# Convert all values to float
data_train = data_train.astype(float)

# Test set: 10,000 rows with the same column layout
data_test = pd.read_csv("mnist-test.csv")
# Convert all values to float
data_test = data_test.astype(float)

Split the data into x_train, y_train, x_test and y_test

In [3]:
# Labels (the digit each image shows) and features (the 784 pixel columns)
y_train = data_train["label"]
x_train = data_train.drop("label", axis=1)

y_test = data_test["label"]
x_test = data_test.drop("label", axis=1)
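
Gray-scale pixels are commonly stored as values from 0 to 255, and rescaling them to the range 0 to 1 often helps a neural network converge. The notebook does not state the value range of these CSV files, so the sketch below treats 0 to 255 as an assumption and works on a tiny synthetic frame whose column names (`pixel0`, `pixel1`) are hypothetical.

```python
import pandas as pd

# Tiny synthetic frame standing in for the CSV (column names are hypothetical)
toy = pd.DataFrame({
    "label":  [5.0, 0.0],
    "pixel0": [0.0, 255.0],
    "pixel1": [128.0, 64.0],
})

# Same split as above: one label column, the rest are pixel features
y_toy = toy["label"]
x_toy = toy.drop("label", axis=1)

# Assumes raw pixel values lie in 0..255; rescale them to 0..1
x_toy_scaled = x_toy / 255.0

print(x_toy_scaled.shape)         # (2, 2)
print(x_toy_scaled.values.max())  # 1.0
```

If the assumption holds, the same division could be applied to `x_train` and `x_test` before fitting.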

Function to plot rows of MNIST data as images

In [4]:
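def show_img(img_data):
    """Plot each row of img_data (784 pixel values) as a 28x28 gray-scale image."""
    # atleast_2d lets a single row be passed as well as several rows
    img_data = np.atleast_2d(np.array(img_data))
    for single_img in img_data:
        plt.figure()
        single_img_reshaped = np.reshape(single_img, (28, 28))
        plt.imshow(single_img_reshaped, cmap='gray')
        plt.show()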

Plot a couple of images (rows 0 and 1) using the above function

In [5]:
show_img(x_train.iloc[[0,1],:])

Train a neural network with two hidden layers, each containing 10 neurons

In [11]:
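# Two hidden layers with 10 neurons each; all other hyperparameters keep their defaults
clf = MLPClassifier(hidden_layer_sizes=(10, 10))
clf.fit(X=x_train, y=y_train)

# For regression with a neural network, import MLPRegressor instead:
# from sklearn.neural_network import MLPRegressor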
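
To make the architecture concrete, here is a rough NumPy sketch of a forward pass through a 784-10-10-10 network. It assumes ReLU hidden activations and a softmax output, which are scikit-learn's defaults for a multiclass `MLPClassifier`. The random weights and the helper `forward` are illustrative and are not the fitted `clf`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer widths: 784 inputs, two hidden layers of 10, 10 output classes
sizes = [784, 10, 10, 10]

# Random weights and zero biases, for illustration only (not trained values)
weights = [rng.normal(scale=0.01, size=(m, n))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Return class probabilities for a batch x of shape (n_samples, 784)."""
    a = x
    for w, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(a @ w + b, 0.0)       # ReLU hidden layer
    z = a @ weights[-1] + biases[-1]
    z = z - z.max(axis=1, keepdims=True)     # stabilise the softmax
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

probs = forward(rng.random((3, 784)))
n_params = sum(w.size + b.size for w, b in zip(weights, biases))

print(probs.shape)  # (3, 10)
print(n_params)     # 8070
```

With only about eight thousand parameters this is a small model; wider hidden layers would add capacity at the cost of longer training.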

Check the training accuracy

In [7]:
# score() predicts on x_train and returns the mean accuracy against y_train
print("Training accuracy", clf.score(X=x_train, y=y_train))
Training accuracy 0.8898833333333334

Predict a few rows from the testing data file

In [18]:
indices = list(range(13))  # the first 13 test rows
c_predicted = clf.predict(x_test.iloc[indices, :])
c_true = y_test.iloc[indices]
print("True:", list(c_true))
print("Pred:", list(c_predicted))

# Show the corresponding images
show_img(x_test.iloc[indices, :])
True: [7.0, 2.0, 1.0, 0.0, 4.0, 1.0, 4.0, 9.0, 5.0, 9.0, 0.0, 6.0, 9.0]
Pred: [7.0, 2.0, 1.0, 0.0, 4.0, 1.0, 4.0, 9.0, 4.0, 9.0, 0.0, 6.0, 9.0]
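
The two lists printed above can be compared directly. This sketch copies those 13 values and computes the fraction of matches, which is the same mean-accuracy quantity that `clf.score` reports for a classifier, here restricted to this small sample.

```python
import numpy as np

# Values copied from the output printed above
c_true = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9])
c_pred = np.array([7, 2, 1, 0, 4, 1, 4, 9, 4, 9, 0, 6, 9])

matches = c_true == c_pred
accuracy = matches.mean()           # fraction of correct predictions
wrong = np.flatnonzero(~matches)    # positions of the misclassified rows

print(round(accuracy, 3))  # 0.923
print(wrong)               # [8]
```

Twelve of the 13 rows are classified correctly; the one miss is row 8, a 5 that the network labels as a 4.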

Compute the testing accuracy

In [19]:
print("Testing accuracy",clf.score(X=x_test,y=y_test))
Testing accuracy 0.8748
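
A single accuracy figure hides which digits get confused with each other. As a hedged sketch, the helper below (the name `confusion_counts` is hypothetical) tallies true-versus-predicted pairs with NumPy. It is demonstrated on a few made-up labels rather than on the real test set; scikit-learn's `sklearn.metrics.confusion_matrix` provides the same kind of tally.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes=10):
    """Return a matrix whose entry [i, j] counts true class i predicted as j."""
    counts = np.zeros((n_classes, n_classes), dtype=int)
    rows = np.asarray(y_true, dtype=int)
    cols = np.asarray(y_pred, dtype=int)
    # np.add.at accumulates correctly even when an (i, j) pair repeats
    np.add.at(counts, (rows, cols), 1)
    return counts

# Made-up labels for illustration only
demo_true = [5, 5, 9, 4, 4]
demo_pred = [4, 5, 9, 4, 4]

cm = confusion_counts(demo_true, demo_pred)
print(cm[5, 4])                 # 1
print(cm[4, 4])                 # 2
print(np.trace(cm) / cm.sum())  # 0.8
```

The diagonal holds the correct predictions, so the trace divided by the total reproduces the accuracy. On the notebook's data the call would be `confusion_counts(y_test, clf.predict(x_test))`.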