Software Architecture

- Detection Container: Manages the drone’s perception (object detection, target GPS estimation) and related gimbal control (target tracking).
- Robot Container: Responsible for autonomous decision-making and execution, this container employs a custom behavior management system for high-level control.
- Uplink (GCS to Drone): Operator commands entered via the GCS UI trigger ROS 2 service calls or action goals. These are processed and forwarded via MAVROS to the autonomy system’s command interface (specifically, the `/behavior_tree_commands` topic consumed by the `BehaviorExecutive` node inside the Robot container). A minimal publishing sketch follows this list.
- Downlink (Drone to GCS):
  - Telemetry: Flight controller data streams via MAVLink are translated by MAVROS into ROS 2 topics. These topics are consumed by various processing nodes for analysis and visualization on the GCS interfaces.
  - Video: GStreamer pipelines manage incoming camera feeds for display to the operator.
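The following minimal sketch, assuming `rclpy` and a simple `std_msgs/String` payload (the actual message type carried on `/behavior_tree_commands` is not specified in this document), illustrates how a GCS-side helper could publish an operator command onto that topic:

```python
# Hedged sketch of the uplink path: publish an operator command onto the
# /behavior_tree_commands topic consumed by the BehaviorExecutive node.
# The std_msgs/String payload and the "AutoTakeoff" label are assumptions
# used purely for illustration.
import rclpy
from rclpy.node import Node
from std_msgs.msg import String


class GcsCommandPublisher(Node):
    def __init__(self):
        super().__init__('gcs_command_publisher')
        # Topic name taken from the architecture description above.
        self.pub = self.create_publisher(String, '/behavior_tree_commands', 10)

    def send(self, command: str):
        msg = String()
        msg.data = command  # e.g. "AutoTakeoff" -- hypothetical command label
        self.pub.publish(msg)
        self.get_logger().info(f'Sent command: {command}')


def main():
    rclpy.init()
    node = GcsCommandPublisher()
    node.send('AutoTakeoff')
    rclpy.spin_once(node, timeout_sec=0.1)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```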
- Base Image: NVIDIA L4T (Linux for Tegra) base images (e.g., `dustynv/ros:humble-desktop-l4t-r36.4.0`), ensuring compatibility with the Jetson architecture.
- ROS 2 Humble Backbone: Includes core ROS 2 Humble and packages essential for robotics tasks:
  - `mavros`: For communication with MAVLink-based flight controllers (e.g., Pixhawk).
  - `tf2`: For managing coordinate transformations.
  - `pcl-ros`: For Point Cloud Library integration.
  - Vision-related packages (`vision-msgs`, `image-view`, `stereo-image-proc`): For potential image data handling (though primary vision processing is in the Detection container).
- System Utilities: Installs essential system utilities and configures settings such as the timezone and locale (standardizing on `en_US.UTF-8`).
- Development and Debugging Tools: Includes `cmake`, compilers (`build-essential`), `python3-pip`, text editors (`vim`), terminal multiplexers (`tmux`), and debuggers (`gdb`).
- Software Dependencies: Installs critical libraries:
  - `libopencv-dev`: Core library for computer vision tasks.
  - Python libraries: `numpy`, `scipy`, `pymavlink`, `pyyaml`, etc.
- Remote Access Configuration (SSH): Installs and configures an `openssh-server` for remote access into the running container.
- Layer 1: Interface
  - Purpose: Acts as the crucial bridge between the ROS 2 autonomy software and the robot’s hardware (Pixhawk via MAVLink) or external communication links (like the GCS). It abstracts hardware specifics.
  - Implementation: Leverages a `pluginlib`-based architecture centered around the `robot_interface_node`.
    - `mavros_interface`: The primary plugin, handling communication with the MAVLink flight controller via MAVROS. It translates ROS commands to MAVLink and vice versa.
    - `robot_interface`: Potentially provides interfaces for other hardware (non-MAVLink sensors) or custom communication protocols.
    - `interface_bringup`: Contains launch files to start and configure the necessary interface nodes and load the appropriate plugin.
  - ROS API: Exposes services (e.g., `robot_command` for ARM, TAKEOFF, LAND) and subscribes to motion command topics (e.g., `velocity_command`, `pose_command`), delegating execution to the loaded interface plugin. Publishes status such as `is_armed`. A hedged usage sketch follows this list.
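As an illustration of this API, the sketch below publishes a velocity setpoint to `velocity_command`. The `geometry_msgs/TwistStamped` message type is an assumption, since the actual interface message definitions are not shown in this document.

```python
# Hedged sketch of a higher layer talking to the interface layer's ROS API.
# Topic name comes from the text above; the message type is assumed.
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import TwistStamped


class InterfaceClientExample(Node):
    """Publishes motion commands that the interface layer delegates to the
    loaded plugin (e.g. mavros_interface)."""

    def __init__(self):
        super().__init__('interface_client_example')
        self.vel_pub = self.create_publisher(TwistStamped, 'velocity_command', 10)

    def send_forward_velocity(self, speed_mps: float):
        msg = TwistStamped()
        msg.header.stamp = self.get_clock().now().to_msg()
        msg.twist.linear.x = speed_mps  # forward velocity in m/s
        self.vel_pub.publish(msg)


def main():
    rclpy.init()
    node = InterfaceClientExample()
    node.send_forward_velocity(1.0)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```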
- Layer 2: Behavior
  - Purpose: Implements the highest level of decision-making, orchestrating the robot’s actions based on GCS commands, situational awareness (e.g., inter-UAV collision risk from `/gps_list`), and internal state (e.g., armed status from MAVROS).
  - Architecture (Event-Driven Command Manager): Based on the `behavior_executive.cpp` implementation, this layer does not use a standard Behavior Tree library. Instead, it employs a custom, event-driven command management system orchestrated by the `BehaviorExecutive` ROS 2 node.
  - Custom Primitives: Utilizes simple, custom-defined classes:
    - `bt::Condition`: Serves as a flag or trigger. Some instances represent external command requests (set via `/behavior_tree_commands`), others reflect internal system states (updated via `/mavros/state`).
    - `bt::Action`: Encapsulates the execution logic for a specific, predefined behavior (e.g., `AutoTakeoff`, `GoToPosition`, `GeofenceMapping`). Each manages a simple internal state (active, running, success, failure).
  - Execution Engine (`BehaviorExecutive` node):
    - ROS Interface:
      - Input/Triggers (Subscribers): Listens to `/behavior_tree_commands` (primary external trigger from GCS/RQT), `/mavros/state` (for internal conditions), target waypoints, `/gps_list` (deconfliction), etc.
      - Command Execution (MAVROS Service Clients): The logic within `bt::Action` objects calls MAVROS services to command the flight controller (e.g., `mavros/set_mode`, `mavros/cmd/arming`, `mavros/mission/push`).
    - Execution Logic (`timer_callback`): A periodic timer (20 Hz) drives the core loop (see the sketch after this list).
      - No Tree Traversal: The loop iterates through a predefined list of `bt::Action` objects.
      - Activation Check: It checks whether an `Action` has become newly active (likely triggered by its corresponding `Condition` being set).
      - Direct Command Dispatch: When an Action activates, its logic executes, primarily involving calls to MAVROS services. The Action’s state (SUCCESS/FAILURE) is often set immediately after dispatch.
  - Handling Complex Sequences: Multi-step behaviors (like the hexagonal search) are managed via custom logic and additional ROS timers within specific `Action` implementations, not via standard BT control-flow nodes.
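The actual executive is implemented in C++ (`behavior_executive.cpp`); the Python sketch below only illustrates the event-driven pattern described above. The minimal `Condition`/`Action` primitives, the `String` command payload, and the `GUIDED` mode string are simplifications and assumptions; the MAVROS service types (`mavros_msgs/srv/CommandBool`, `SetMode`) are standard MAVROS interfaces.

```python
# Simplified sketch of the event-driven command manager: a 20 Hz timer walks a
# flat list of Actions, dispatching any whose Condition was newly set.
import rclpy
from rclpy.node import Node
from std_msgs.msg import String
from mavros_msgs.srv import CommandBool, SetMode


class Condition:
    """Flag/trigger set by external commands or internal state updates."""
    def __init__(self, name):
        self.name = name
        self.value = False


class Action:
    """Predefined behavior with a simple internal state machine."""
    def __init__(self, name, condition, execute_fn):
        self.name = name
        self.condition = condition
        self.execute_fn = execute_fn
        self.active = False
        self.state = 'IDLE'  # IDLE / RUNNING / SUCCESS / FAILURE


class BehaviorExecutiveSketch(Node):
    def __init__(self):
        super().__init__('behavior_executive_sketch')
        # External trigger; a plain command string is assumed for illustration.
        self.create_subscription(String, '/behavior_tree_commands',
                                 self.command_callback, 10)
        # MAVROS service clients used to command the flight controller.
        self.arming_client = self.create_client(CommandBool, 'mavros/cmd/arming')
        self.mode_client = self.create_client(SetMode, 'mavros/set_mode')

        self.takeoff_requested = Condition('AutoTakeoff')
        self.actions = [
            Action('AutoTakeoff', self.takeoff_requested, self.do_takeoff),
        ]
        # 20 Hz loop, matching the rate described above.
        self.create_timer(0.05, self.timer_callback)

    def command_callback(self, msg):
        for action in self.actions:
            if msg.data == action.condition.name:
                action.condition.value = True

    def timer_callback(self):
        # No tree traversal: iterate over the flat list of actions and
        # dispatch any that have just become active.
        for action in self.actions:
            if action.condition.value and not action.active:
                action.active = True
                action.state = 'RUNNING'
                action.execute_fn(action)

    def do_takeoff(self, action):
        # Direct command dispatch: request a guided mode and arm via MAVROS.
        # 'GUIDED' is illustrative; the mode string depends on the autopilot.
        self.mode_client.call_async(SetMode.Request(custom_mode='GUIDED'))
        self.arming_client.call_async(CommandBool.Request(value=True))
        action.state = 'SUCCESS'  # state often set immediately after dispatch


def main():
    rclpy.init()
    rclpy.spin(BehaviorExecutiveSketch())


if __name__ == '__main__':
    main()
```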
- Purpose & Modules: Integrates several capabilities using internal modules:
  - Video Stream Processing (`Streaming` module): Acquires input from the Gremsy gimbal-mounted FLIR 640R camera via RTSP (using ffmpeg), handles frame retrieval, and resizes frames for publishing/inference.
  - Object Detection (`Detection` module / `yolo.py`): Employs YOLO for real-time object detection and potentially pose estimation. Filters detections to identify persons (class ID “0”).
  - GPS Coordinate Estimation (`GPSEstimator` module): Estimates the real-world GPS coordinates of detected persons using the object’s bounding box and the drone’s current MAVROS state (GPS, altitude, heading). Manages a list of unique targets. A hedged estimation sketch follows this list.
- ROS Interface (Outputs):
  - Publishes the processed video (`image_raw`).
  - Publishes raw detection bounding boxes (`/detection_box` – `vision_msgs/Detection2DArray`).
  - Publishes newly estimated target GPS coordinates (`detection/gps_target`).
  - Publishes a periodic list of all uniquely tracked target GPS coordinates (`target_gps_list`).
- Performance: Uses separate threads for streaming and detection/estimation.
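The internals of the `GPSEstimator` module are not reproduced in this document; the sketch below shows one common flat-ground approach to the same problem, projecting the bounding-box center through a pinhole camera model and offsetting the drone’s GPS fix along the resulting bearing. The algorithm and all parameter names (`fx`, `gimbal_pitch_deg`, etc.) are illustrative assumptions, not the project’s implementation.

```python
# Hedged flat-ground target geolocation sketch: pixel offset -> bearing,
# gimbal pitch + altitude -> ground distance, then offset the drone's fix.
import math

EARTH_RADIUS_M = 6_378_137.0


def estimate_target_gps(bbox_cx, img_width, fx,
                        drone_lat, drone_lon, altitude_m,
                        heading_deg, gimbal_pitch_deg):
    # Horizontal angle of the target relative to the camera's optical axis.
    bearing_offset = math.degrees(math.atan2(bbox_cx - img_width / 2.0, fx))
    bearing = math.radians(heading_deg + bearing_offset)

    # Flat-ground assumption: ground distance from the pitch below horizontal.
    pitch = math.radians(abs(gimbal_pitch_deg))
    if pitch < 1e-3:
        return None  # looking at the horizon; no ground intersection
    ground_distance = altitude_m / math.tan(pitch)

    # Offset the drone position by (distance, bearing) on a spherical Earth.
    d_lat = (ground_distance * math.cos(bearing)) / EARTH_RADIUS_M
    d_lon = (ground_distance * math.sin(bearing)) / (
        EARTH_RADIUS_M * math.cos(math.radians(drone_lat)))
    return (drone_lat + math.degrees(d_lat), drone_lon + math.degrees(d_lon))
```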
- Purpose & Architecture: Controls the physical gimbal orientation based on external commands and detection feedback from `inted_gimbal`. Uses a state machine (`MAPPING`, `MANUAL_CONTROL`, `LOCK_ON`, `SPIN`).
- ROS Interface:
  - Inputs: Subscribes to `/gimbal_command` (state changes), `/current_gimbal_angles` (feedback), and crucially `/detection_box` (from `inted_gimbal`).
  - Output: Publishes target gimbal angles (`/gimbal_angles`).
- Workflow & State Logic:
  - State Transitions: Driven by commands received on `/gimbal_command`.
  - Timer-Driven Actions: Executes predefined movements in the `MAPPING` (point down) and `SPIN` (sweep) states.
  - Detection-Driven Action (`detection_callback` in the `LOCK_ON` state):
    - Processes `Detection2DArray` messages from `inted_gimbal`.
    - Uses the first detected person’s bounding box center.
    - Calls `track.py::compute_gimbal_command` (using the detection, current angles, and camera intrinsics) to calculate the new pitch/yaw needed to center the target (an illustrative version is sketched after this list).
    - Publishes the updated target angles to `/gimbal_angles`, implementing visual servoing.
- `inted_gimbal` performs perception and publishes detection results on `/detection_box`.
- `payload_node` consumes these detections.
- In the `LOCK_ON` state, `payload_node` uses the detection data to calculate and send real-time angle commands to the gimbal, creating a closed-loop visual tracking system.
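The signature of `track.py::compute_gimbal_command` is not shown in this document; the sketch below is an illustrative version of the underlying visual-servoing step, converting the pixel error between the bounding-box center and the image center into pitch/yaw corrections via the camera intrinsics. The gain, angle limits, and sign conventions are assumptions.

```python
# Hedged visual-servoing step: pixel error -> angular error (pinhole model),
# then a proportional correction applied to the current gimbal angles.
import math


def compute_gimbal_command(bbox_cx, bbox_cy, img_w, img_h, fx, fy,
                           current_pitch_deg, current_yaw_deg, gain=0.5):
    # Angular error of the target relative to the optical axis.
    yaw_error = math.degrees(math.atan2(bbox_cx - img_w / 2.0, fx))
    pitch_error = math.degrees(math.atan2(bbox_cy - img_h / 2.0, fy))

    # Proportional step toward centering the target in the frame.
    # Sign conventions depend on the gimbal frame; taken as positive-down here.
    new_yaw = current_yaw_deg + gain * yaw_error
    new_pitch = current_pitch_deg + gain * pitch_error

    # Clamp to plausible gimbal limits (illustrative values).
    new_pitch = max(-90.0, min(30.0, new_pitch))
    return new_pitch, new_yaw
```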
1. Software Architecture Overview
1.1. System Overview
The DARPA Triage Drone system is engineered to autonomously perform triage operations in complex environments. It utilizes a distributed, modular architecture designed for flexibility and robust operation. This architecture is realized through three core Docker containers, ensuring consistent runtime environments and network separation while enabling seamless coordination.
The system is logically divided into two main parts: the Ground Control Station (GCS) and the onboard drone systems.
1.2. High-Level Components
1.2.1. Ground Control Station (GCS)
The GCS serves as the central command and monitoring hub, facilitated through the Foxglove GUI, which provides essential visualization and control interfaces.
1.2.2. Onboard Drone Systems
The drone’s intelligence and operational functions are encapsulated within the dedicated Docker containers (Robot and Detection) described above.
Together, these components create a comprehensive solution, enabling the drone to navigate challenging environments and effectively execute autonomous triage missions.
2. Ground Control Station (GCS) Subsystem
2.1. Overview & Purpose
The GCS subsystem provides the essential interface for controlling and monitoring the drone system, built upon a modular ROS 2 architecture deployed via Docker containers.
2.2. Data Flow Management
The system handles multiple asynchronous data streams using ROS 2 communication primitives (topics, services, actions) and specialized middleware (MAVROS for MAVLink telemetry, GStreamer for video); the resulting uplink and downlink paths are detailed under Software Architecture above.
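For example, a minimal GCS-side node could subscribe to one of the MAVROS-translated telemetry topics. `/mavros/state` and its `mavros_msgs/State` message are standard MAVROS; how the GCS actually aggregates these streams for Foxglove is not detailed in this document.

```python
# Minimal GCS-side consumer of a MAVROS telemetry topic.
import rclpy
from rclpy.node import Node
from mavros_msgs.msg import State


class TelemetryMonitor(Node):
    def __init__(self):
        super().__init__('gcs_telemetry_monitor')
        self.create_subscription(State, '/mavros/state', self.state_callback, 10)

    def state_callback(self, msg):
        # Surface armed status and flight mode for operator visibility.
        self.get_logger().info(f'mode={msg.mode} armed={msg.armed}')


def main():
    rclpy.init()
    rclpy.spin(TelemetryMonitor())


if __name__ == '__main__':
    main()
```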
3. Onboard System Architecture
3.1. Docker Containers Overview
The drone’s onboard intelligence runs within two primary Docker containers: the Robot Container (housing the autonomy and control logic) and the Detection Container (housing the perception and gimbal control logic). This containerization provides isolated, consistent, and reproducible runtime environments.
3.2. Robot Container: Autonomy Subsystem
3.2.1. Runtime Environment (Containerization Details)
The Robot Docker container is meticulously configured to provide the necessary operating system, libraries, and tools:
3.2.2. Software Architecture & Pipeline
The autonomy software, primarily located within the `robot/ros_ws/src/autonomy` package, follows the layered architecture (Layer 1: Interface, Layer 2: Behavior) detailed above.
3.3. Detection Container: Perception & Gimbal Control Subsystem
3.3.1. Overview
This container houses the drone’s “eyes” and the control for directing them. It comprises two main nodes working together: `inted_gimbal` for perception and `payload_node` for gimbal control and target tracking.