Training Process

Selecting a Model

The choice of a neural network depends heavily on the task and requires experience. Fortunately, the digits dataset is so easy to learn that even a very simple neural network can handle it without any problems.

We can use the “MLPClassifier” network from scikit-learn for this. The network automatically adapts the input layer to the problem size; in our case it consists of 8 x 8 = 64 neurons. The output layer is determined by the number of classes to be distinguished; for the digits 0 to 9 we need 10 neurons in the output layer.

The following program generates a simple feed-forward network with 50 neurons in the hidden layer and outputs the most important parameters of the network.

from sklearn.neural_network import MLPClassifier

# Create the model: one hidden layer with 50 neurons, at most 500 training epochs
model = MLPClassifier(hidden_layer_sizes=(50,), max_iter=500)

# Print all parameters of the network
params = model.get_params()
for key, value in params.items():
    print(key + " : " + str(value))
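
If you want to verify that the input and output layers really adapt as described, a small sketch (our own addition, using only the standard coefs_ attribute of a fitted MLPClassifier) can print the shapes of the weight matrices after a single fit:

from sklearn.neural_network import MLPClassifier
from sklearn import datasets

digits = datasets.load_digits()
model = MLPClassifier(hidden_layer_sizes=(50,), max_iter=500)
model.fit(digits.data, digits.target)

# The weight matrices reveal the layer sizes:
# (64, 50) between input and hidden layer,
# (50, 10) between hidden and output layer
for i, coef in enumerate(model.coefs_):
    print("Weight matrix", i, "has shape", coef.shape)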

Preparing the Dataset

The division into training and test data is a central component of machine learning. Here is a brief explanation with reasons:

Training data: This data is used by the model to learn patterns - the “learning phase”.
Test data: This data is used after the model has been trained to evaluate its performance - on data the model has never seen before.

Why is this important? To avoid overfitting: if you train the model on all available data, it could "memorize" the data instead of learning general patterns. This is called overfitting. The model would then perform well on the training data but poorly on new data.

Objective evaluation of model performance: test data is the only way to really assess how well the model works on unseen data - that is, in real use.

The usual split is 70 to 80% training data and 20 to 30% test data. We use a 70/30 split here. The following program splits the data and the labels 70/30 and prints the number of images and labels in each part.

from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the dataset
digits = datasets.load_digits()

# 70% training data, 30% test data
bilder_train, bilder_test, labels_train, labels_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=42)

print("Images for training: ", len(bilder_train))
print(bilder_train)
print("Labels for training: ", len(labels_train))
print(labels_train)
print("Images for testing:  ", len(bilder_test))
print(bilder_test)
print("Labels for testing: ", len(labels_test))
print(labels_test)
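
As a side note, a purely random split can leave the ten digits slightly unevenly distributed between the two parts. One possible refinement (using the standard stratify parameter of train_test_split; not required for this easy example) keeps the class proportions identical in both sets:

from sklearn import datasets
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()

# stratify=digits.target keeps the proportion of each digit
# the same in the training and test sets
bilder_train, bilder_test, labels_train, labels_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=42,
    stratify=digits.target)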

Training

The training is as simple as possible: you set the maximum number of epochs in the model, call the “fit” method and pass it the training data and labels. The parameter “verbose” prints the training progress (the loss per iteration).
For the evaluation there is also a very simple “score” method, which returns the mean accuracy on the given data.

from sklearn.neural_network import MLPClassifier
from sklearn import datasets
from sklearn.model_selection import train_test_split
import joblib

# Load the dataset
digits = datasets.load_digits()

# 70% training data, 30% test data
bilder_train, bilder_test, labels_train, labels_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=42)

# Create the model: 50 neurons in the hidden layer
model = MLPClassifier(hidden_layer_sizes=(50,), max_iter=500, verbose=True)

# Training
model.fit(bilder_train, labels_train)

# Evaluation
print("Accuracy on training data after training:", model.score(bilder_train, labels_train))
print("Accuracy on test data after training:", model.score(bilder_test, labels_test))

# Save the model
joblib.dump(model, 'digits_model.pkl')
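
The saved model can later be reloaded and used for predictions without retraining. Here is a minimal sketch of that round trip; the file name 'digits_model.pkl' comes from the program above, everything else is standard joblib and scikit-learn usage:

import joblib
from sklearn import datasets

# Load the trained model from disk
model = joblib.load('digits_model.pkl')

# Predict the digit for the first image in the dataset
digits = datasets.load_digits()
prediction = model.predict(digits.data[:1])
print("Predicted digit:", prediction[0], "- actual digit:", digits.target[0])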

The training program then produces the following output:

[Training output: loss per iteration, stopping after 168 iterations]

You can see that the optimizer stopped the training process after 168 iterations because the training loss had not improved over the last 10 epochs.
Looking at the accuracy achieved, you can see that 100% was reached on the training data and 97% on the test data.
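
To see where the remaining 3% of test errors occur, you can go one step beyond the plain accuracy score. The following sketch (our own addition, assuming the model and the data split from the training program are still in memory) prints a confusion matrix:

from sklearn.metrics import confusion_matrix

# Rows: true digits, columns: predicted digits;
# off-diagonal entries are misclassifications
predictions = model.predict(bilder_test)
print(confusion_matrix(labels_test, predictions))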
