Object Detection using YOLOv5: A Simple Guide

YOLOv5 is a popular version of the YOLO (You Only Look Once) family of object detection algorithms. It was released in June 2020 and was developed by Glenn Jocher and the team at Ultralytics.

YOLOv5

YOLOv5 is a single-stage object detection algorithm: it predicts bounding boxes and class labels in a single pass through the network. Two-stage algorithms, by contrast, first use a separate region proposal network to generate candidate regions and only then classify them. YOLOv5 builds on the previous versions of YOLO with a more efficient architecture, which allows for faster and more accurate object detection.

YOLOv5 is based on a convolutional neural network (CNN) and uses a feature pyramid network to extract features at multiple scales, so that both small and large objects can be detected. It also uses an efficient backbone architecture (based on CSPNet) to reduce the number of computations required during inference.
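
To make the multi-scale idea concrete, the sketch below counts how many candidate boxes a three-scale detection head produces for a given input size. The strides (8, 16, 32) and the three anchor boxes per grid cell are assumptions based on the default YOLOv5 configuration, not something stated in this guide, and the function names are hypothetical.

```python
# Assumed defaults (not stated in this guide): three output scales with
# strides 8, 16 and 32, and 3 anchor boxes per grid cell.
STRIDES = (8, 16, 32)
ANCHORS_PER_CELL = 3


def grid_sizes(height, width, strides=STRIDES):
    """Return the (rows, cols) of the prediction grid at each stride."""
    return [(height // s, width // s) for s in strides]


def total_candidates(height, width):
    """Count candidate boxes produced before non-maximum suppression."""
    return sum(
        rows * cols * ANCHORS_PER_CELL
        for rows, cols in grid_sizes(height, width)
    )


print(grid_sizes(640, 640))        # [(80, 80), (40, 40), (20, 20)]
print(total_candidates(640, 640))  # 25200
```

The fine grid (stride 8) is the one that mostly picks up small objects, while the coarse grid (stride 32) covers large ones. Most of these candidates are discarded later by confidence filtering and non-maximum suppression.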

YOLOv5 achieves strong accuracy on standard object detection benchmarks such as COCO, while also being faster and more computationally efficient than previous versions of YOLO. YOLOv5 also has a more user-friendly and modular codebase, which makes it easier to customize and deploy for specific use cases.

Object Detection using YOLOv5

To perform object detection with YOLOv5, we can load a pretrained model in Python through PyTorch Hub. The code below accepts an image file path from the user, runs YOLOv5 on the image, and draws a labeled box around each detected object.

# Set up the YOLOv5 Python environment

# Create a project folder and move there
mkdir yolov5
cd yolov5

# Create and activate a Python environment using venv
python3 -m venv venv
source venv/bin/activate

# Upgrade pip first: the version bundled with a new environment
# is often outdated and may resolve packages incorrectly
pip install --upgrade pip

# Install PyTorch and related libraries
pip install torch torchvision matplotlib

# Install the required libraries for YOLOv5
pip install -qr https://raw.githubusercontent.com/ultralytics/yolov5/master/requirements.txt

# Import libraries for object detection
import torch

# Load YOLOv5 model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

Here is an example image of a house with several objects that can be used for object detection:

[Image: House Sample Image]

# Ask the user for the image file path
image_file = input("Enter the path to the image file: ")

# Output: Enter the path to the image file: /content/house_objects.jpeg

# A batch of images (only one entry here)
imgs = [image_file]

# Inference
results = model(imgs)

# Display the results
results.show()

This code first loads the YOLOv5 model using torch.hub.load(), then asks the user for the path to an image file. Passing that path to the model loads the image and runs object detection on it. The returned results object holds the bounding box coordinates, confidence score, and class label of each detected object. Finally, results.show() draws a labeled box around each detected object and displays the annotated image.

Object Detection Results

You can also save the annotated image to your local drive:

# Save the results
results.save()

Saved 1 image to runs/detect/exp

In the context of object detection using YOLOv5, results is an instance of the Detections class (defined in models/common.py in the YOLOv5 repository), which holds the results of object detection on a single image or a batch of images.

The results.print() method prints a human-readable summary of the detections. For each image, it prints the image size and the number of detected objects per class, followed by timing information for the run. It does not print individual confidence scores. Here’s an example of what the output of results.print() might look like:

# Print the results
results.print()

image 1/1: 487x730 1 cup, 1 bowl, 1 chair, 1 couch, 4 potted plants, 2 books, 1 clock
Speed: 17.6ms pre-process, 16.3ms inference, 2.1ms NMS per image at shape (1, 3, 448, 640)
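
The shape (1, 3, 448, 640) in the Speed line differs from the original 487x730 image because the image is resized before inference. The sketch below approximates that step under two assumptions: the longer side is scaled to 640 pixels, and each side is then rounded up to a multiple of a stride of 32. Both values and the helper name inference_shape are illustrative assumptions, not code taken from the YOLOv5 source.

```python
def inference_shape(height, width, size=640, stride=32):
    """Approximate the (height, width) an image is resized and padded to.

    Assumptions: the longer side is scaled to `size`, and each side is then
    rounded up to the next multiple of `stride`.
    """
    longest = max(height, width)
    # Scale both sides by size / longest, using integer arithmetic.
    scaled = [dim * size // longest for dim in (height, width)]
    # Round each side up to a multiple of the stride (ceiling division).
    return tuple(-(-dim // stride) * stride for dim in scaled)


print(inference_shape(487, 730))  # (448, 640)
```

Under these assumptions the result matches the height and width reported above. The leading 1 and 3 in the shape are the batch size and the number of color channels.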

The results.xyxy attribute of the results object is a list with one tensor per input image. Each tensor has shape (num_detections, 6), where num_detections is the number of objects detected in that image, and each row contains the following values for one object:

  • The x-coordinate of the top-left corner of the bounding box
  • The y-coordinate of the top-left corner of the bounding box
  • The x-coordinate of the bottom-right corner of the bounding box
  • The y-coordinate of the bottom-right corner of the bounding box
  • The confidence score for the detection
  • The class index for the detection (a key in model.names)
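
As a small illustration of this layout, the helper below unpacks one six-value row into box size, center, confidence, and class index. The row is the first one from the sample output shown later in this guide, and the function name describe_box is hypothetical.

```python
def describe_box(row):
    """Unpack a [x1, y1, x2, y2, confidence, class_id] detection row."""
    x1, y1, x2, y2, confidence, class_id = row
    return {
        "width": x2 - x1,
        "height": y2 - y1,
        "center": ((x1 + x2) / 2, (y1 + y2) / 2),
        "confidence": confidence,
        "class_id": int(class_id),  # class ids are stored as floats
    }


# First row of the sample output, written as a plain Python list.
row = [297.351, 257.037, 716.655, 395.495, 0.764580, 57.0]
box = describe_box(row)
print(f"{box['width']:.1f} x {box['height']:.1f} px, class {box['class_id']}")
# 419.3 x 138.5 px, class 57
```

Coordinates are in pixels of the original image, with the origin at the top-left corner, so the width and height are simple differences of the corner values.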

The code print(results.xyxy[0]) prints the detections for the first image in the batch (here, the only image) to the console. Specifically, it prints a 2D tensor of shape (num_detections, 6), with one row per detected object containing the bounding box coordinates, confidence score, and class index.

For example, the output of print(results.xyxy[0]) might look like this:

print(results.xyxy[0])

tensor([[2.97351e+02, 2.57037e+02, 7.16655e+02, 3.95495e+02, 7.64580e-01, 5.70000e+01],
        [1.11764e+01, 3.06112e+01, 3.34656e+02, 3.87331e+02, 7.02564e-01, 5.80000e+01],
        [4.26627e+02, 9.65344e-01, 4.63796e+02, 6.02568e+01, 6.30188e-01, 5.80000e+01],
        [4.50530e+02, 3.84018e+02, 5.36325e+02, 4.10835e+02, 6.19044e-01, 7.30000e+01],
        [5.90081e+02, 3.36979e+02, 6.77699e+02, 4.12221e+02, 6.09831e-01, 4.50000e+01],
        [5.19352e-01, 3.08222e+02, 7.36477e+01, 4.53217e+02, 5.63769e-01, 5.60000e+01],
        [5.06017e+02, 3.42963e+02, 5.34100e+02, 3.74751e+02, 4.56984e-01, 4.10000e+01],
        [4.73861e+02, 7.29155e+01, 4.98293e+02, 1.14117e+02, 4.38774e-01, 5.80000e+01],
        [4.63090e+02, 0.00000e+00, 5.05947e+02, 2.49005e+01, 4.18405e-01, 7.40000e+01],
        [5.91055e+02, 9.34518e+00, 6.40597e+02, 5.83570e+01, 3.35947e-01, 5.80000e+01],
        [5.31882e+02, 1.95409e+01, 5.44199e+02, 5.88745e+01, 3.10988e-01, 7.30000e+01]], device='cuda:0')
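
The last column holds class indices rather than names, so a short post-processing step helps when reading this output. The sketch below assumes the rows have already been converted to plain Python lists (for example with results.xyxy[0].tolist()), with values rounded for readability. It uses only the subset of model.names needed for this image, and the helper count_classes and its min_confidence parameter are hypothetical.

```python
from collections import Counter

# Subset of model.names covering the class indices in the sample output.
NAMES = {41: "cup", 45: "bowl", 56: "chair", 57: "couch",
         58: "potted plant", 73: "book", 74: "clock"}

# Rows copied from the sample tensor above, rounded for readability.
ROWS = [
    [297.4, 257.0, 716.7, 395.5, 0.765, 57.0],
    [11.2, 30.6, 334.7, 387.3, 0.703, 58.0],
    [426.6, 1.0, 463.8, 60.3, 0.630, 58.0],
    [450.5, 384.0, 536.3, 410.8, 0.619, 73.0],
    [590.1, 337.0, 677.7, 412.2, 0.610, 45.0],
    [0.5, 308.2, 73.6, 453.2, 0.564, 56.0],
    [506.0, 343.0, 534.1, 374.8, 0.457, 41.0],
    [473.9, 72.9, 498.3, 114.1, 0.439, 58.0],
    [463.1, 0.0, 505.9, 24.9, 0.418, 74.0],
    [591.1, 9.3, 640.6, 58.4, 0.336, 58.0],
    [531.9, 19.5, 544.2, 58.9, 0.311, 73.0],
]


def count_classes(rows, names, min_confidence=0.0):
    """Count detections per class name, skipping low-confidence rows."""
    labels = [names[int(row[5])] for row in rows if row[4] >= min_confidence]
    return Counter(labels)


print(count_classes(ROWS, NAMES))
print(count_classes(ROWS, NAMES, min_confidence=0.5))
```

With no threshold, the counts agree with the results.print() summary (4 potted plants, 2 books, and one each of the other classes). Raising the threshold to 0.5 keeps only the six most confident detections.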

The pretrained YOLOv5 models are trained on the COCO dataset and can detect 80 object classes. We can print all supported classes as follows:

# Supported classes
print(model.names)

{0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
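
When you only care about a few categories, it can be handy to look up class indices by name. The sketch below inverts a names dictionary; it uses a three-entry subset of the output above so that it runs on its own, and the helper name ids_for is hypothetical. With a loaded model you would pass model.names instead.

```python
def ids_for(names, wanted):
    """Return the class indices whose names appear in `wanted`, sorted."""
    name_to_id = {name: class_id for class_id, name in names.items()}
    return sorted(name_to_id[name] for name in wanted)


# Three entries copied from the full model.names output.
names = {56: "chair", 57: "couch", 58: "potted plant"}
print(ids_for(names, ["potted plant", "chair"]))  # [56, 58]
```

In the PyTorch Hub usage documented by Ultralytics, a list like this can be assigned to model.classes before inference to restrict detections to those classes.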

Further Readings

If you’re interested in learning more about YOLOv5, here are some resources that you may find helpful:

  1. YOLOv5 GitHub Repository: This is the official repository for YOLOv5, which contains the source code, pre-trained models, and documentation.
  2. YOLOv5 Documentation: The documentation provides a detailed explanation of YOLOv5 architecture, configuration, training, evaluation, and deployment.
  3. YOLOv5 Performance on COCO Dataset: This resource reports the performance of YOLOv5 on the COCO dataset, which is a standard benchmark for object detection.
  4. YOLOv5 Google Colab Notebook: This Colab notebook provides a tutorial on how to use YOLOv5 for object detection, including installation, training, evaluation, and deployment.
  5. YOLOv5 Video Object Detection Tutorial: This is a video tutorial that demonstrates how to use YOLOv5 for object detection on videos.
  6. YOLOv5 for Custom Object Detection Tutorial: This tutorial shows how to use YOLOv5 for custom object detection on your own dataset.

I hope these resources are helpful in learning more about YOLOv5 and its applications in object detection!
