Task Allocation

The task allocation subsystem efficiently assigns each robot tasks such that all tasks are completed in minimal time.

InputsOutputs
Occupancy grid map, n Robots, m Tasks, Robot positionsNext assigned task to robot i

Now that we have two types of tasks, frontier and coverage tasks, we need a task allocator that can assign each robot to these tasks and make intelligent decisions on which task will reap the most reward each assignment period. We implemented two task allocators: a naive algorithm as a baseline, and a more optimal task allocator, HIGH:

  • Naive Allocation
    • Greedy based on distance to task and utility
  • HIGH (Hierarchical Information Gain Heuristic) Allocation
    • Information-theoretic approach: task weight is encoded by its information gain
    • Balances exploration vs exploitation with weights based on explored area

Each task allocator has the following two objectives: the A* path length to each task as the distance cost, and an additional utility penalty if if the task is within line-of-sight distance to an already assigned task. This prevents robots from being assigned tasks that are very close to each other in the same assignment round.

HIGH uses the same costs, but additionally considers exploration of the environment vs exploitation of the already mapped areas for victim search. We are using an information-theoretic approach that weighs the naive costs, information gains, and exploration-exploitation weights to assign each robot to the task with the highest reward. We introduce two new objectives:

  • Information gain
    • The information gain is [0., 1.]
    • For frontier tasks, information gain is defined as the percent of continuous unknown cells within a radius.
    • For coverage tasks, information gain is defined as the percent of free cells within a radius that have not yet been covered by the camera sensor.
    • Tasks with high information gain should be prioritized in assignment, and tasks with low information gain should be not be assigned.
  • Exploration-exploitation weight
    • When we first start the mission, we want to rapidly explore the environment. Conversely, when the environment is mostly mapped out, we want to prioritize completing coverage tasks.
    • We use a linearly decaying importance for exploration as a function of percent of area explored.
    • The total area is approximated by the user-inputted geofence area.
Information Gain Heatmap

The final HIGH objective we want to maximize is then

reward_fn = (2 * utility – dist_cost) * info_gain * e2_weights

After calculating the highest priority tasks to assign next, we then optimally assign the tasks to each robot based on distance cost using the Hungarian algorithm. The total distance needed to be traveled for all robots is then minimized.

Through both in real life and simulation testing, we found that:

  • HIGH on average obtains a 27% better SST than Naive
  • HIGH on average has a 35% lower standard deviation of SSTs

HIGH is significantly better than naive because:

  • Naive allocator does not explore the map fast enough (decides to do nearby coverage tasks)
  • Wastes too much time in low information areas
  • Lacks the “big picture” view

HIGH is able to model information gain in the environment and exploit that to achieve better performance. It takes into consideration the key points that: coverage tasks which have been crossed before have low information, bigger frontiers have more information than smaller frontiers, and as search progresses information reward shifts from frontiers to coverage tasks.

Additionally, the task allocation subsystem is also able to dynamically reallocate tasks. For example, if an agent runs out of battery or crashes during search, the system is able to take the unfinished tasks and reallocate them to the remaining agents such that all tasks are still completed.