Block Detection and Classification with YOLOv2

To detect and classify the blocks in the camera's field of view, we will use
the YOLOv2 deep neural network. The network takes RGB frames from the
RealSense camera as input and regresses the location and size of each
bounding box. It also predicts the class of each bounding box (in our case
one of 3 classes: Red, Blue, Green) and a confidence measure for the
detection. A big advantage is that it runs in real time while providing
competent prediction and classification accuracy.
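
To make the shape of the network output concrete, the following is a minimal
sketch of how the detections could be post-processed. It does not call any
real YOLOv2 or RealSense API. It assumes, purely for illustration, that each
raw detection arrives as a tuple (x, y, w, h, class_id, confidence) in pixel
units, that the class indices map to Red, Blue, Green in that order, and that
a confidence threshold of 0.5 is appropriate. The names Detection,
CLASS_NAMES, and parse_detections are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumed index-to-class mapping; the real order depends on how the
# network was trained.
CLASS_NAMES = ("Red", "Blue", "Green")

# Assumed raw format: (x, y, w, h, class_id, confidence)
RawDetection = Tuple[float, float, float, float, int, float]


@dataclass
class Detection:
    x: float           # bounding box center, x coordinate (pixels)
    y: float           # bounding box center, y coordinate (pixels)
    w: float           # bounding box width (pixels)
    h: float           # bounding box height (pixels)
    label: str         # one of CLASS_NAMES
    confidence: float  # detection confidence in [0, 1]


def parse_detections(
    raw: Sequence[RawDetection], conf_threshold: float = 0.5
) -> List[Detection]:
    """Drop low-confidence detections and attach class names.

    Returns the remaining detections sorted by descending confidence.
    """
    kept = []
    for x, y, w, h, class_id, conf in raw:
        if conf < conf_threshold:
            continue  # discard detections below the assumed threshold
        kept.append(Detection(x, y, w, h, CLASS_NAMES[int(class_id)], conf))
    return sorted(kept, key=lambda d: d.confidence, reverse=True)
```

With three hypothetical raw detections, one of which falls below the
threshold, parse_detections returns the two confident blocks, most confident
first. The threshold trades missed blocks against false detections and would
need to be tuned on real frames.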