Evaluation

Hardware

The question now is how well the trained network performs in practice. I built a very simple setup with an external USB camera: each time the space bar is pressed, the camera takes a photo of the object on the wooden plate and passes it to the network for analysis. The predicted object class (glass, metal, paper or plastic) and the predicted probabilities are displayed in the camera window.

Software

First, the required libraries are imported and the camera is opened via OpenCV.

import numpy as np
import torch
import cv2
import torchvision
from torchvision import transforms

# start camera (the index depends on the system; 0 is often the built-in camera, 1 the first external one)
cam = cv2.VideoCapture(1)

Loading The Model

ResNet18 is used as the model. Its final layer is replaced by a layer with four outputs (one per class), the trained weights are loaded from the file "models/ResNet18_trained.pth", and the model is set to evaluation mode:

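model = torchvision.models.resnet18()
# replace the final layer: 4 outputs, one per material class
model.fc = torch.nn.Linear(model.fc.in_features, 4)

# map_location="cpu" lets the weights load on machines without a GPU
model.load_state_dict(torch.load("models/ResNet18_trained.pth", map_location="cpu"))
model.eval()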

If you want, you can download the pretrained model from HERE.

Transformations

Only the essential transformations are required: conversion into a tensor data structure and normalization of the color channels.

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
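
To make the arithmetic concrete, here is a minimal pure-Python sketch of what the two transforms do to a single 8-bit RGB pixel: ToTensor scales the values to the range [0, 1], and Normalize subtracts the channel mean and divides by the channel standard deviation. The helper name normalize_pixel is hypothetical and only serves as an illustration; the real transforms operate on whole image tensors.

```python
# Channel statistics in RGB order (same values as in the transform above)
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Hypothetical helper: mimic ToTensor + Normalize for one 8-bit RGB pixel."""
    scaled = [v / 255.0 for v in rgb]                            # ToTensor: 0..255 -> 0..1
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]   # Normalize per channel

white = normalize_pixel([255, 255, 255])
black = normalize_pixel([0, 0, 0])
```

Because the statistics are listed in RGB order, an image in a different channel order (OpenCV delivers frames as BGR) has to be reordered before it is normalized.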

Classification

The program waits for you to press the space bar. When the space bar is pressed, the image from the camera is converted into a tensor and passed to the neural network for classification.

classes = ["Glass", "Metal", "Paper", "Plastic"]

classname = "Unknown"

probabilities = []

while True:
    ret, frame = cam.read()
    if not ret:
        print("Error accessing camera.")
        break
    # wait for key pressed ...
    key = cv2.waitKey(1)

    # ord returns the Unicode value of the character
    # SPACE key pressed -> capture and classify image
    if key == ord(' '):
        # OpenCV delivers BGR frames; the normalization values assume RGB order
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        pic_tensor = transform(rgb).unsqueeze(0)
        # no gradients are needed for inference
        with torch.no_grad():
            output = model(pic_tensor)
        probabilities = torch.nn.functional.softmax(output, dim=1)[0].tolist()
        index = np.argmax(probabilities)
        classname = classes[index]
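
The step from raw network outputs to a class name can be illustrated without PyTorch. The following is a minimal sketch, assuming made-up output values: softmax turns the four raw outputs (logits) into probabilities that sum to 1, and the index of the largest probability selects the class. The function and the example_* names are only for illustration.

```python
import math

classes = ["Glass", "Metal", "Paper", "Plastic"]

def softmax(logits):
    """Convert raw network outputs (logits) into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for illustration; real values come from model(pic_tensor)
example_logits = [0.2, 2.5, -1.0, 0.4]
example_probs = softmax(example_logits)
example_index = example_probs.index(max(example_probs))  # same role as np.argmax
example_class = classes[example_index]
```

With these example values, the second logit is the largest, so the selected class is "Metal".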

Press the q key to end the program:

    if key == ord('q'):
        break
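
The key handling can also be written as a small function. This is only a sketch under the assumption that cv2.waitKey returns -1 when no key was pressed and that some platforms set additional high bits in the return value, which is why the value is masked with 0xFF before comparing. The helper name action_for_key is hypothetical.

```python
SPACE = ord(" ")  # 32
QUIT = ord("q")   # 113

def action_for_key(raw_key):
    """Hypothetical helper: map a cv2.waitKey() return value to an action name."""
    if raw_key == -1:
        return "none"        # no key was pressed within the timeout
    code = raw_key & 0xFF    # keep only the low byte
    if code == SPACE:
        return "classify"
    if code == QUIT:
        return "quit"
    return "none"
```

Inside the loop, action_for_key(cv2.waitKey(1)) would then decide whether to classify the current frame or to leave the loop.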

Add text for the recognized class and the determined probabilities and display the image:

    cv2.putText(frame, classname, (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
    probText = ""
    for p in probabilities:
        probText += f"{p:.2f} "
    cv2.putText(frame, probText, (50, 100), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
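    cv2.imshow('Test', frame)

# after the loop ends: release the camera and close the window
cam.release()
cv2.destroyAllWindows()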
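
The probability line shows four bare numbers, so the viewer has to remember the class order. A possible variation, shown here as a sketch with a hypothetical helper format_probabilities, pairs each value with its class name before the text is drawn:

```python
classes = ["Glass", "Metal", "Paper", "Plastic"]

def format_probabilities(names, probs):
    """Hypothetical helper: pair each class name with its probability."""
    return "  ".join(f"{n}: {p:.2f}" for n, p in zip(names, probs))

# Made-up probabilities for illustration
label_text = format_probabilities(classes, [0.05, 0.80, 0.05, 0.10])
# Before the first classification the list is empty, so the text is empty too
empty_text = format_probabilities(classes, [])
```

The resulting string could be passed to cv2.putText in place of probText.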

Test

The screenshot shows the recognized material class and, below it, the predicted probabilities:

And here is a live video of a test: