Software Documentation

UML Diagram

Maintaining synchronicity between the simulation and the RL agent during training

Threading Flow Chart

The simulator and the agent are implemented in the ROS framework. The agent publishes actions and the simulator publishes observations, but the two processes run asynchronously. For training, the simulation must pause after executing an action and wait until the next action is issued. This is implemented as described in the flowchart shown in the figure above.

The Stable Baselines (sb) thread, which runs the training/optimisation process, issues an action, and this action is published. The sb thread is then blocked and the callback loop of the ROS node is released. The ROS node thread executes the action and monitors the current environment state for termination conditions. When it detects a termination condition, the ROS node thread releases the sb thread and blocks itself. In this way the simulation can be paused and resumed, which is sufficient to keep the action and reward feedback loop synchronous, as the RL agent training process requires.
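The hand-off between the two threads can be sketched with two `threading.Event` objects. This is only a minimal illustration of the blocking pattern described above, not the project's actual code: the names `ActionHandshake` and `node_loop` are hypothetical, no ROS calls are made, and the "simulation" is a placeholder integer that the action is added to.

```python
import threading


class ActionHandshake:
    """Hypothetical hand-off between the sb thread and the ROS node thread."""

    def __init__(self):
        self._action_ready = threading.Event()  # set by the sb thread
        self._step_done = threading.Event()     # set by the node thread
        self._action = None
        self._result = None

    def step(self, action, timeout=None):
        """sb thread: issue an action, then block until the step ends."""
        self._action = action
        self._step_done.clear()
        self._action_ready.set()                # release the node thread
        if not self._step_done.wait(timeout):
            raise TimeoutError("no termination condition was reported")
        return self._result

    def wait_for_action(self, timeout=None):
        """Node thread: block until the sb thread issues an action."""
        if not self._action_ready.wait(timeout):
            raise TimeoutError("no action was issued")
        self._action_ready.clear()
        return self._action

    def finish(self, result):
        """Node thread: store the outcome and release the sb thread."""
        self._result = result
        self._step_done.set()


def node_loop(handshake, n_steps):
    """Stand-in for the ROS node thread (assumption: one action per step)."""
    state = 0
    for _ in range(n_steps):
        action = handshake.wait_for_action(timeout=5.0)
        state += action  # placeholder for executing the action in the sim
        handshake.finish({"observation": state, "done": state >= 3})


if __name__ == "__main__":
    hs = ActionHandshake()
    worker = threading.Thread(target=node_loop, args=(hs, 3), daemon=True)
    worker.start()
    for a in (1, 1, 1):
        print(hs.step(a, timeout=5.0))
    worker.join()
```

At any moment only one of the two threads is runnable: `step` returns only after `finish` has been called, and the node thread waits in `wait_for_action` until the next action arrives. The timeouts are an addition for the sketch so that a stalled peer raises an error instead of hanging.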

Integration of CARLA with the system in a synchronous setting

(1)

The process begins with the CARLA handler and the Traffic Manager establishing a connection to the CARLA server.

(2)

The CARLA handler then spawns the ego vehicle in CARLA while the traffic manager spawns the NPCs.

(3)

The CARLA handler then pauses the simulation.

(4)

The state information of the environment is extracted from the paused simulation. It consists of the positions, orientations and velocities of all vehicles and pedestrians in the simulation.

(5)

The state information is sent to the RL Node.

(6)

The path planning node provides the CARLA handler with the next set of waypoints to track.

(7)

The waypoints are used to generate a control signal for the ego vehicle. The traffic manager creates control signals for all NPCs in parallel.

(8)

The control signals generated above are assigned to their corresponding vehicles in simulation.

(9)

The CARLA handler then resumes the simulation for one time step.

(10)

The CARLA handler pauses the simulation after one time step and the loop defined above begins again.
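The repeating part of the steps above (4 to 10) can be sketched as a loop around a simulator that advances only when told to. The sketch below does not use the CARLA API: `FakeSim`, `run_loop` and all callback names are hypothetical stand-ins, the state is reduced to one-dimensional positions, and the ego controller is a simple proportional rule chosen for illustration. In CARLA itself, this kind of stepping is usually obtained through synchronous mode (`synchronous_mode` in the world settings together with `world.tick()`); whether the handler described here does exactly that is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSim:
    """Hypothetical paused simulator: nothing moves until tick() is called."""
    frame: int = 0
    positions: dict = field(default_factory=lambda: {"ego": 0.0, "npc_0": 10.0})
    controls: dict = field(default_factory=dict)

    def snapshot(self):
        # Step 4: read the state while the simulation is paused.
        return {"frame": self.frame, "positions": dict(self.positions)}

    def apply_control(self, actor_id, velocity):
        # Step 8: assign a control signal to one vehicle.
        self.controls[actor_id] = velocity

    def tick(self, dt):
        # Steps 9 and 10: advance exactly one time step, then stay paused.
        for actor_id, velocity in self.controls.items():
            self.positions[actor_id] += velocity * dt
        self.frame += 1


def run_loop(sim, send_state, next_waypoint, npc_control, n_steps,
             dt=0.05, gain=1.0):
    """Run n_steps iterations of the extract / plan / control / tick cycle."""
    for _ in range(n_steps):
        state = sim.snapshot()             # step 4: extract state
        send_state(state)                  # step 5: send state to the RL node
        target = next_waypoint(state)      # step 6: waypoint from the planner
        # Step 7: control signals for the ego vehicle and for every NPC.
        commands = {"ego": gain * (target - state["positions"]["ego"])}
        for actor_id in state["positions"]:
            if actor_id != "ego":
                commands[actor_id] = npc_control(actor_id, state)
        for actor_id, velocity in commands.items():
            sim.apply_control(actor_id, velocity)  # step 8
        sim.tick(dt)                       # steps 9 and 10


if __name__ == "__main__":
    sim = FakeSim()
    run_loop(
        sim,
        send_state=print,
        next_waypoint=lambda s: s["positions"]["ego"] + 1.0,
        npc_control=lambda actor_id, s: 2.0,
        n_steps=3,
    )
```

Because the state is read and the controls are applied only between ticks, every observation sent to the RL node corresponds to exactly one simulation frame.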