
YOLOv4: The Subtleties of High-Speed Object Detection

Bochkovskiy, Wang & Liao: balancing speed, accuracy and weight

8 min read · Jul 19, 2020

One of the more impressive things you can do is look at things and understand exactly what you’re seeing. Your brain is constantly receiving two feeds of photon interpretations and somehow you’re able to say “this is a pineapple and this is pizza” — and you then know never to combine the two.

Object recognition is impressive because it’s 1) abstract (babies have no idea what’s going on) and 2) damn hard to learn. As AI improves, we’re constantly grappling with this challenge, which consists of detection (something is here) and classification (what is this).

That being said, it’s a much easier task to identify open parking spots in a still image at a leisurely pace than it is to identify “CAR”, “BICYCLE” and “RED LIGHT” while moving at 40 mph.

One of the more promising object-detection CNNs is “You Only Look Once” (YOLO), a neat open-source model written mostly in C/C++ with Python wrappers.
Here’s a video of “YOLOv3” in action that speaks for itself:


Written by Mark Cleverley
