YOLO (You Only Look Once) is a real-time object detection system. In this notebook, we will use a pre-trained YOLO model to detect objects in images. The sample images provided in the images folder that comes with the dataset are listed below.

  • apple.jpg
  • cat.jpg
  • street.jpg
  • dog.jpg
  • dog2.jpg
  • eagle.jpg
  • food.jpg
  • food2.jpg
  • giraffe.jpg
  • horses.jpg
  • manbike.jpg
  • manbike2.jpg
  • motorbike.jpg
  • newyork.jpg
  • newyork2.jpg
  • person.jpg
  • surf.jpg
  • wine.jpg

Import Resources

Before using this notebook, please ensure that the OpenCV and PyTorch packages are installed so that the Jupyter notebook can access them.

In [49]:
import cv2
import matplotlib.pyplot as plt

from utils import *
from darknet import Darknet

Load the pre-trained model

YOLOv3 is the third version of the YOLO system (YOLOv3 Paper). The neural network architecture is stored in the yolov3.cfg file, and the pre-trained weights of the network are stored in yolov3.weights. The coco.names file lists the 80 object classes that the model is able to detect; the model has been trained only on these 80 classes.
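coco.names is a plain text file with one class name per line. As a minimal sketch of how such a file can be parsed (independent of the load_class_names helper provided in utils — the parse_names function below is illustrative, not part of the notebook's utilities):

```python
def parse_names(path):
    """Read a class-name file with one name per line, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

# Example (assuming coco.names is in the working directory):
# class_names = parse_names('coco.names')
# print(len(class_names))  # expected to be 80 for the COCO class list
```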

In [50]:
# Path to the file containing the neural network architecture
cfg_file = 'yolov3.cfg'

# Path to the file containing the pre-trained weights
weight_file = 'yolov3.weights'

# Path to the file containing the names of the classes
namesfile = 'coco.names'

# Load the neural network architecture
m = Darknet(cfg_file)

# Load the pre-trained weights
m.load_weights(weight_file)

# Load the names of the classes
class_names = load_class_names(namesfile)

## You can visualize the neural network in YOLOv3 by uncommenting the following command
# m.print_network()
Loading weights. Please Wait...100.00% Complete

Let us load an image for detection. Note that the images are all of different sizes; however, the model takes images of a specific size as input, so we need to resize them. To run the model on your own image, place it in the images folder and change the file name below.

In [51]:
# Load the image
img = cv2.imread('./images/manbike.jpg')

# Convert the image from BGR (OpenCV's default channel order) to RGB
original_image = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Resize the image to the network's input size
resized_image = cv2.resize(original_image, (m.width, m.height))

# Display the images
plt.subplot(121)
plt.title('Input Image')
plt.imshow(original_image)
plt.subplot(122)
plt.title('Resized Input Image')
plt.imshow(resized_image)
plt.show()

Run the model on the resized image

In [52]:
# Non-Maximal Suppression (NMS) threshold: keeps only the most confident
# bounding box among overlapping detections of the same object.
# Bounding boxes with a detection probability below 0.6 are discarded.
nms_threshold_probability = 0.6

# Intersection Over Union (IOU) threshold: eliminates redundant boxes that
# overlap heavily with a better one.
# Bounding boxes with an intersection over union above 0.4 are discarded.
iou_threshold_probability = 0.4

# Detect the objects in the image
boxes = detect_objects(m, resized_image, iou_threshold_probability, nms_threshold_probability)

# Print the objects identified and their confidence levels
print_objects(boxes, class_names)

# Plot the image with bounding boxes and corresponding object class labels
plot_boxes(original_image, boxes, class_names, plot_labels = True)

It took 1.230 seconds to detect the objects in the image.

Number of Objects Detected: 2 

Objects Found and Confidence Level:

1. bicycle: 0.999869
2. person: 1.000000
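The IoU criterion behind the threshold used above can be made concrete with a short sketch. It assumes boxes given as (x1, y1, x2, y2) corner coordinates — an illustrative format, not necessarily the one detect_objects returns:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) corner form."""
    # Corners of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # Intersection area is zero when the boxes do not overlap
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

With an IoU threshold of 0.4, two boxes whose IoU exceeds 0.4 are treated as duplicate detections of the same object, and only the higher-confidence one is kept.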