• Vipul Vaibhaw

Training YOLO with circles instead of bounding boxes

Plato said that "Geometry will draw the soul toward truth and create the spirit of philosophy". Intrigued by this observation, I asked myself a question: why do "bounding boxes" dominate the computer vision world so much? This article captures my thought process; feel free to critique it.

Triangles are the structurally strongest shape in geometry. Circles are the easiest to understand and represent: a center and a radius. Then why did researchers go with "bounding boxes"?

There are several reasons, I believe. Before tackling things fundamentally, we should note one fact: any rectangle can be enclosed in a circle. The circle is centered at the rectangle's center, and its radius is half the rectangle's diagonal.

Read this amazing article for more understanding - http://mathcentral.uregina.ca/QQ/database/QQ.09.06/s/benneth1.html

This suggests that it should be possible, in principle, to have a deep learning algorithm that encloses an object of interest in a circle rather than a rectangular bounding box.
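To make the rectangle-in-a-circle idea concrete, here is a minimal sketch that converts a box annotation into its enclosing circle. The function name and the (x_min, y_min, x_max, y_max) box format are my own assumptions for illustration, not part of any YOLO codebase.

```python
import math


def box_to_enclosing_circle(x_min, y_min, x_max, y_max):
    """Return (cx, cy, r) of the smallest circle enclosing an axis-aligned box.

    Hypothetical helper: the circle is centered at the box center and its
    radius is half the box diagonal.
    """
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0
    r = math.hypot(x_max - x_min, y_max - y_min) / 2.0
    return cx, cy, r


# A 6 x 8 box has a diagonal of 10, so the radius is 5.
print(box_to_enclosing_circle(0, 0, 6, 8))  # (3.0, 4.0, 5.0)
```

Note that the circle in this example has an area of about 78.5 against 48 for the box, so a circle label covers more background than the box it replaces, and the gap grows for long, thin objects.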

Let's look closely at the loss function designed by the authors of the YOLO architecture.
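For reference, and assuming the loss meant here is the one from the original YOLO (v1) paper, it can be written as below. The image is divided into an S x S grid with B boxes predicted per cell; (x, y, w, h) are the box center and size, C is the box confidence, p(c) are the class probabilities, hats mark predictions, and the indicator 1^obj_ij is 1 when predictor j in cell i is responsible for an object.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
  \mathbb{1}_{ij}^{\text{obj}}
  \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
  \mathbb{1}_{ij}^{\text{obj}}
  \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
       + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B}
  \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
  \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
  \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

Only the first two lines involve the box geometry; the confidence and class terms do not depend on the shape used to enclose the object.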

Looking at it, we would need to change the loss function so that the center and radius are taken as ground truth. Honestly, I think it would simplify the loss function. But will it leave the loss function Cauchy? If it isn't Cauchy, will it converge? (Topic of another blog, yay!)
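Here is a rough sketch of what the circle version of the geometry terms might look like. It is my own guess, not something from the YOLO authors: the two square-root terms for width and height collapse into one square-root term for the radius, and the IoU that YOLO uses as the confidence target becomes an IoU between two circles. The function names and the default lambda_coord = 5.0 (the value used in the YOLO v1 paper) are assumptions, and all radii are assumed to be positive.

```python
import math


def circle_localization_loss(pred, target, lambda_coord=5.0):
    """Squared-error localization loss for one responsible predictor.

    pred and target are (cx, cy, r) tuples. The square root on the radius
    mirrors the sqrt(w), sqrt(h) terms of the box loss.
    """
    (px, py, pr), (tx, ty, tr) = pred, target
    center_term = (px - tx) ** 2 + (py - ty) ** 2
    radius_term = (math.sqrt(pr) - math.sqrt(tr)) ** 2
    return lambda_coord * (center_term + radius_term)


def circle_iou(c1, c2):
    """Intersection over union of two circles given as (cx, cy, r), r > 0."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    area1, area2 = math.pi * r1 ** 2, math.pi * r2 ** 2
    if d >= r1 + r2:
        inter = 0.0  # circles are disjoint or only touch
    elif d <= abs(r1 - r2):
        inter = min(area1, area2)  # one circle lies inside the other
    else:
        # Lens-shaped overlap: sum of two circular segments.
        cos_a = (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)
        cos_b = (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)
        a = math.acos(max(-1.0, min(1.0, cos_a)))
        b = math.acos(max(-1.0, min(1.0, cos_b)))
        inter = (r1 * r1 * (a - math.sin(2 * a) / 2)
                 + r2 * r2 * (b - math.sin(2 * b) / 2))
    return inter / (area1 + area2 - inter)


print(circle_localization_loss((0.5, 0.5, 0.25), (0.5, 0.5, 0.25)))  # 0.0
print(circle_iou((0, 0, 1), (0, 0, 1)))  # 1.0
print(circle_iou((0, 0, 1), (3, 0, 1)))  # 0.0
```

The localization term drops from four regressed values to three, which is the simplification I had in mind. Whether training with it converges as well as the box loss is a separate question that this sketch does not answer.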

Is that it? Will changing the loss function of the architecture solve the problem?

Well, another thing to consider here is that images are inherently rectangular: they are stored as matrices, which are rectangular grids of pixels. So it is understandable why rectangular bounding boxes came to dominate the computer vision world.

However, my take is that we can 'probably' add padding to matrices and images and make them circular.
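One way to read this, and it is only my assumption, is to keep the usual rectangular array but fill everything outside an inscribed disc with a padding value. Below is a small pure-Python sketch of that idea; circular_mask and apply_mask are hypothetical helper names, and a real pipeline would more likely use an array library.

```python
def circular_mask(height, width):
    """Boolean grid that is True inside the largest disc inscribed in the grid."""
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    radius_sq = (min(height, width) / 2.0) ** 2
    return [
        [(y - cy) ** 2 + (x - cx) ** 2 <= radius_sq for x in range(width)]
        for y in range(height)
    ]


def apply_mask(image, mask, fill=0):
    """Replace every pixel outside the mask with the padding value `fill`."""
    return [
        [pixel if keep else fill for pixel, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


image = [[1] * 5 for _ in range(5)]
for row in apply_mask(image, circular_mask(5, 5)):
    print(row)
# [0, 1, 1, 1, 0]
# [1, 1, 1, 1, 1]
# [1, 1, 1, 1, 1]
# [1, 1, 1, 1, 1]
# [0, 1, 1, 1, 0]
```

The storage is still a rectangular matrix; only the region that carries information becomes roughly circular, and at small sizes the "circle" is a fairly coarse approximation.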

But! What about the filters? Kernels, filters and so on are all square! Is it possible to design circular filters? (A topic for another article.)
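As a teaser, one simple approximation is a square kernel whose weights outside an inscribed disc are forced to zero. The sketch below builds such a fixed averaging kernel and slides it over an image; disk_kernel and correlate_valid are hypothetical names of mine, and this is a hand-made filter rather than a learned one.

```python
def disk_kernel(size):
    """size x size averaging kernel with zero weights outside the inscribed disc."""
    c = (size - 1) / 2.0
    radius_sq = (size / 2.0) ** 2
    raw = [
        [1.0 if (y - c) ** 2 + (x - c) ** 2 <= radius_sq else 0.0
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, raw))
    return [[w / total for w in row] for row in raw]  # weights sum to 1


def correlate_valid(image, kernel):
    """Slide a square kernel over the image without padding (cross-correlation)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [sum(kernel[i][j] * image[y + i][x + j]
             for i in range(k) for j in range(k))
         for x in range(out_w)]
        for y in range(out_h)
    ]


kernel = disk_kernel(5)
print(kernel[0][0], kernel[2][2] > 0)  # 0.0 True: corners are cut off
```

With this construction a 3 x 3 kernel keeps all nine weights, so the circular shape only starts to show from 5 x 5 upward. In a CNN, the analogous trick would presumably be to multiply a learnable square kernel by a fixed binary mask like this one.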

All I can say is that the "bounding box" convention is deeply ingrained, and it makes little sense to reinvent the wheel, because I am yet to see a limitation of using bounding boxes. Anyway, I enjoyed thinking about the topic. I hope you enjoyed reading about it as well! See you in another post.



©2019 by Deeplearned education pvt ltd