System Performance

Spring Validation Demo – Performance Evaluation

1. Bimanual Simulation Results

In the Gazebo simulation environment, a dual-arm robotic system was configured with one arm carrying the gripper and the other carrying a pruner as its end-effector. The bell pepper was initialized in a fully visible pose, and the workspace was kept free of obstacles to closely approximate the conditions defined for the Spring Validation Demo. After extensive trials, ten validation runs were executed; their results are summarized in Table 1.

Table 1: Results of Validation Runs in Simulation

No.	Result	Simulated Pose [x, y, z, roll, pitch, yaw]	Notes
1	✅	[0.28, -0.46, 0.53, 0.26, -0.26, 0.0]	Success
2	✅	[0.45, -0.63, 0.48, 0.38, 0.43, 0.0]	Success
3	⚠️	[0.3, -0.5, 0.57, -0.46, 0.37, 0.0]	Success, but cutter had risky pathing
4	✅	[0.34, -0.56, 0.57, 0.13, 0.18, 0.0]	Success
5	✅	[0.31, -0.46, 0.46, -0.01, 0.46, 0.0]	Success
6	✅	[0.27, -0.46, 0.49, -0.04, -0.3, 0.0]	Success
7	✅	[0.27, -0.63, 0.49, -0.33, 0.27, 0.0]	Success
8	✅	[0.25, -0.54, 0.45, -0.27, -0.27, 0.0]	Success
9	✅	[0.26, -0.59, 0.45, 0.02, -0.49, 0.0]	Success
10	✅	[0.4, -0.56, 0.45, -0.03, 0.48, 0.0]	Success

As shown in Table 1, all ten simulation runs met the success criteria, and nine of them executed cleanly. In the remaining run (run 3), the cutter reached the target and completed the operation, but the generated path was unnecessarily long. Although this run counts as a success in simulation, such a trajectory could pose risks in a real-world deployment and is therefore noted here for completeness.

The target success rate for the Spring Validation Demo (SVD) was 40%. The system exceeded this target, achieving a 100% success rate across the ten runs.
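The success-rate arithmetic for Table 1 can be reproduced with a short sketch. The outcome labels ("clean", "risky") and the function name are illustrative assumptions rather than part of the project code; the outcomes are transcribed from Table 1, with run 3 counted as a success because it met the success criteria.

```python
# Outcomes transcribed from Table 1. "risky" marks run 3, which succeeded
# but followed an unnecessarily long cutter path. Labels are illustrative.
SIM_RUNS = ["clean", "clean", "risky", "clean", "clean",
            "clean", "clean", "clean", "clean", "clean"]

TARGET_RATE = 0.40  # simulation target stated for the SVD


def success_rate(runs, counted=("clean", "risky")):
    """Fraction of runs whose outcome label is in `counted`."""
    return sum(r in counted for r in runs) / len(runs)


overall = success_rate(SIM_RUNS)                 # risky run still counts
clean_only = success_rate(SIM_RUNS, ("clean",))  # stricter reading

print(f"overall: {overall:.0%}, clean only: {clean_only:.0%}, "
      f"target met: {overall >= TARGET_RATE}")
```

Reporting both figures keeps the stricter reading visible: the target is met either way, but the clean-only rate shows how much the headline number depends on counting the risky trajectory as a success.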

2. Perception Subsystem Results

The perception subsystem detects and segments both peppers and peduncles to enable reliable downstream grasping and cutting. Two dedicated models were trained and evaluated for this purpose: one for pepper segmentation and one for peduncle segmentation. Both models were evaluated on the BUP20 dataset using mAP@50, precision, and recall; qualitative results are shown in Figure 1, and the metrics are summarized in Table 2.

Figure 1: Pepper Segmentation Results on BUP20 dataset (left) and Peduncle Segmentation Results (right)

Table 2: Quantitative Evaluation of Pepper and Peduncle Segmentation Models

Metric	Pepper	Peduncle
mAP@50	92.5%	92.6%
Precision	94.1%	96.26%
Recall	86.0%	85.75%

Overall, both models delivered segmentation performance sufficient for SVD conditions. Recall is the weakest metric for both models (about 86%), and future work will aim to improve it in challenging scenarios.
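Because recall trails precision for both models, a single combined figure can be convenient when comparing future model revisions. The sketch below derives an F1 score (the harmonic mean of precision and recall) from the values in Table 2. F1 was not part of the reported evaluation, and the function and dictionary names are illustrative assumptions.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall transcribed from Table 2, expressed as fractions.
TABLE_2 = {
    "pepper": {"precision": 0.941, "recall": 0.860},
    "peduncle": {"precision": 0.9626, "recall": 0.8575},
}

f1 = {name: f1_score(m["precision"], m["recall"])
      for name, m in TABLE_2.items()}

for name, value in f1.items():
    print(f"{name}: F1 = {value:.3f}")
```

The harmonic mean is pulled toward the lower of the two inputs, so any improvement in recall would show up more strongly in F1 than an equal improvement in precision.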

3. Single Real Arm Planning Subsystem Results

For the hardware demonstration, a single xArm7 robotic arm was used because the second arm had not yet been provided by the project sponsor. The available arm served as the gripper unit: it grasped the bell peppers with a soft gripper mounted at its end-effector and deposited them into the storage bin. In the absence of the pruner arm, the peduncle was detached manually to complete the demonstration workflow. The test environment was configured with four artificial plants, each bearing multiple green bell peppers. The peppers were fully visible and positioned with varying orientations, constrained within 30 degrees of rotation to introduce controlled variability. The outcomes of the experiments conducted under these conditions are summarized in Table 3.

Table 3: Results of Validation Runs with Hardware Setup

No.	Outcome	Notes
1	✅
2	❌	Planning – Failed to find valid grasp plan
3	✅
4	✅
5	❌	Perception – Fine Pose estimate incorrect
6	✅
7	✅
8	❌	Planning – Incorrect Grasp Planning
9	❌	Misc – Grasp Failure due to plastic leaves
10	❌	Perception – False Positive Segmentation

As shown in Table 3, two failures were related to the perception subsystem. One involved a false segmentation mask generated in a region containing no peppers; the other resulted from an inaccurate fine pose estimate, which prevented the planner from finding a valid pre-grasp-to-grasp trajectory for the gripper. Two further failures occurred in the motion planning subsystem. These were primarily caused by planned paths exceeding the virtual workspace boundaries and by trajectories that led to invalid joint configurations. The remaining failure was a grasp failure caused by the plastic leaves of the artificial plants.
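The failure breakdown above can be tallied directly from the Notes column of Table 3. This is a minimal sketch: the notes are transcribed from the table, while the dictionary and function names are illustrative assumptions.

```python
from collections import Counter

# Notes column of Table 3 for the failed runs, keyed by run number.
FAILURE_NOTES = {
    2: "Planning – Failed to find valid grasp plan",
    5: "Perception – Fine Pose estimate incorrect",
    8: "Planning – Incorrect Grasp Planning",
    9: "Misc – Grasp Failure due to plastic leaves",
    10: "Perception – False Positive Segmentation",
}


def failure_category(note):
    """Return the subsystem label that precedes the ' – ' separator."""
    return note.split(" – ", 1)[0].strip()


failures_by_subsystem = Counter(
    failure_category(note) for note in FAILURE_NOTES.values()
)
print(dict(failures_by_subsystem))
```

Keeping the subsystem label as a prefix of each note makes this kind of tally trivial to repeat as more validation runs are logged.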

The target success criterion for the SVD was a 25% success rate. In practice, the system achieved successful harvesting in 3.5 out of 6 attempts (the half success reflects a case where the pepper was not securely grasped but was still placed into the storage bin). The hardware demonstration therefore exceeded its target.

4. SVD Quantitative Results

The quantitative results of our system against the performance requirements are summarized in Table 4. All listed requirements were met, and measured performance exceeded each target threshold.

Table 4: Quantitative Metrics of SVD Results

Functional Requirement	ID	Performance Requirement	Desired	Actual
Identify, Localize and Prioritize Green Peppers	PR.01	Detect fully visible Peppers > 70% of the time	70%	86%
Identify, Localize and Prioritize Green Peppers	PR.02	Produce estimated pose of green pepper and peduncle within 3 cm of ground truth depth and within 2 cm of other coordinates and up to 30 degrees in each rotation axis	3 cm	2.4 cm
Harvest Green Pepper	PR.04	Reach target green peppers > 70% of the time	70%	100% (Sim), 80% (Real)
Minimize Green Pepper Damage	PR.08	Avoids deformation and damage to 90% of picked green peppers	90%	95%
Overall Success	PR.09	Avoid visible damage to harvested green pepper > 90% of the time	90%	100%
Overall Success	PR.10	Harvest a fully visible green pepper in testbed autonomously within 100 seconds	100 s	68 s (1 arm), 73 s (2 arms)
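The comparison of Desired and Actual values in Table 4 can be checked mechanically. The sketch below is an illustration under stated assumptions: the `Requirement` class and its field names are hypothetical, the values are transcribed from Table 4, and each requirement is treated as either higher-is-better (rates) or lower-is-better (pose error, cycle time).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    req_id: str
    desired: float
    actual: tuple           # one or more measured values
    higher_is_better: bool  # False for error and time limits

    def is_met(self):
        """True if every measured value satisfies the threshold."""
        if self.higher_is_better:
            return all(a >= self.desired for a in self.actual)
        return all(a <= self.desired for a in self.actual)


# Values transcribed from Table 4 (percent, centimetres, seconds).
REQUIREMENTS = [
    Requirement("PR.01", 70, (86,), True),
    Requirement("PR.02", 3.0, (2.4,), False),
    Requirement("PR.04", 70, (100, 80), True),   # simulation, real
    Requirement("PR.08", 90, (95,), True),
    Requirement("PR.09", 90, (100,), True),
    Requirement("PR.10", 100, (68, 73), False),  # one arm, two arms
]

results = {r.req_id: r.is_met() for r in REQUIREMENTS}
print(results)
```

Requirements with two measured values (PR.04 and PR.10) are considered met only if both values satisfy the threshold, which is the stricter of the possible readings.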

VADER