Litter Localization – Team Cobot

Fall 2019 Implementation

The flight data (images and metadata: the UAV's RTK GPS position, attitude, and altitude) is stored on the UAV onboard computer in a time-stamped folder. Storage is triggered by a ROS service call made from the server to the UAV onboard computer. After the UAV scout operation, the rosbags are transferred to the server wirelessly via secure copy (scp).

The figure below shows the localization pipeline. The rosbags from the UAV are processed to extract the images and metadata. The images are passed to the computer vision subsystem, which returns the pixel coordinates of each detected litter item. Those pixel coordinates are then used to compute the world coordinates of the litter items using the formula in the second figure below.

K – Camera intrinsics matrix, obtained by calibrating the UAV onboard camera
R – Rotation matrix of the camera on the UAV
T – The x, y, z translation of the camera relative to the UAV center
u, v – Pixel coordinates of a litter item in an image
X, Y, Z – Litter position in world coordinates (m)
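
The formula from the figure is not reproduced in this text, so the following is only a minimal sketch of one common way to turn a pixel into a world position: back-project the pixel through a skew-free pinhole model and intersect the resulting ray with a flat ground plane. The function name pixel_to_ground, the camera-to-world rotation R_wc (the UAV attitude combined with the camera mounting rotation R), the camera world position cam_pos (the UAV position offset by T), and the flat-ground assumption are all illustrative assumptions, not the team's confirmed implementation.

```python
def pixel_to_ground(u, v, fx, fy, cx, cy, R_wc, cam_pos, ground_z=0.0):
    """Intersect the viewing ray of pixel (u, v) with the plane Z = ground_z.

    fx, fy, cx, cy : entries of the intrinsics matrix K (assumed skew-free).
    R_wc           : 3x3 camera-to-world rotation (hypothetical convention).
    cam_pos        : camera position in world coordinates, in metres.
    """
    # Ray direction in the camera frame, i.e. K^-1 [u, v, 1]^T.
    ray_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Rotate the ray into the world frame.
    ray_world = [sum(R_wc[i][j] * ray_cam[j] for j in range(3)) for i in range(3)]
    if abs(ray_world[2]) < 1e-9:
        raise ValueError("viewing ray is parallel to the ground plane")
    # Scale factor that brings the ray down to the ground plane.
    s = (ground_z - cam_pos[2]) / ray_world[2]
    if s <= 0:
        raise ValueError("ground plane is behind the camera")
    return tuple(cam_pos[i] + s * ray_world[i] for i in range(3))


# Example: a camera looking straight down from 10 m altitude.
# Camera x maps to world X, camera y to world -Y, optical axis to world -Z.
R_NADIR = [[1.0, 0.0, 0.0],
           [0.0, -1.0, 0.0],
           [0.0, 0.0, -1.0]]
litter_xyz = pixel_to_ground(740, 360, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0,
                             R_wc=R_NADIR, cam_pos=(5.0, 3.0, 10.0))
```

In this sketch the altitude from the flight metadata fixes the scale of the ray, which is why a single image is enough to obtain a 3D point under the flat-ground assumption.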

This process is repeated for every litter object detected in every image. This results in a series of localized points in the world coordinate frame (denoted by the gray ‘+’ marks in the figure below). Distinct clusters start to form because the same item, seen in multiple images, is localized to roughly the same real-world coordinate. Using this aggregated data, we apply K-means clustering to determine how many clusters there are in the environment and to compute their centers. Each cluster center is the mean of multiple observations of the same litter object, which gives an estimate of the object’s actual location with reduced error. The red marks in the figure denote the estimated litter locations; they lie roughly at the centers of the clusters.
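
The clustering step can be illustrated with a small, self-contained K-means sketch (Lloyd's algorithm). It assumes the number of clusters k is supplied by the caller, and it uses a deterministic farthest-point initialization to keep the example reproducible. Both choices, along with the function name kmeans and the sample points, are assumptions made for illustration; the text does not say how the team selected k or which library they used.

```python
import math


def kmeans(points, k, iters=50):
    """Cluster 2D or 3D points with Lloyd's algorithm and return k centers."""
    # Farthest-point initialization: start from the first point, then
    # repeatedly add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:  # assignments are stable, so stop
            break
        centers = new_centers
    return centers


# Hypothetical localized points (metres): two litter items, three sightings each.
observations = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
                (10.0, 10.0), (10.2, 10.0), (10.0, 10.2)]
estimated_locations = sorted(kmeans(observations, k=2))
```

Each returned center is the mean of the observations assigned to it, which corresponds to the averaging across multiple sightings described above.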