Spring Validation Demo – Performance Evaluation
1. Bimanual Simulation Results
In the Gazebo simulation environment, a dual-arm robotic system was configured, with one arm carrying the gripper and the other carrying a pruner as its end-effector. The bell pepper was initialized in a fully visible pose, and the workspace was kept free of obstacles to closely approximate the conditions defined for the SVD. After extensive trials, the results of ten validation runs are summarized in Table 1.
Table 1: Results of Validation Runs in Simulation
No. | Result | Simulated Pose (xyz, roll pitch yaw) | Notes |
1 | ✅ | [0.28, -0.46, 0.53, 0.26, -0.26, 0.0] | Success |
2 | ✅ | [0.45, -0.63, 0.48, 0.38, 0.43, 0.0] | Success |
3 | ⚠️ | [0.3, -0.5, 0.57, -0.46, 0.37, 0.0] | Success, but cutter had risky pathing |
4 | ✅ | [0.34, -0.56, 0.57, 0.13, 0.18, 0.0] | Success |
5 | ✅ | [0.31, -0.46, 0.46, -0.01, 0.46, 0.0] | Success |
6 | ✅ | [0.27, -0.46, 0.49, -0.04, -0.3, 0.0] | Success |
7 | ✅ | [0.27, -0.63, 0.49, -0.33, 0.27, 0.0] | Success |
8 | ✅ | [0.25, -0.54, 0.45, -0.27, -0.27, 0.0] | Success |
9 | ✅ | [0.26, -0.59, 0.45, 0.02, -0.49, 0.0] | Success |
10 | ✅ | [0.4, -0.56, 0.45, -0.03, 0.48, 0.0] | Success |
As shown in Table 1, ten simulation runs were conducted, and all ten met the success criteria. Nine completed without issue. In the remaining run (No. 3), the cutter reached the target and completed the cut, but the generated path was unnecessarily long and suboptimal. Although this run counts as a success in simulation, such a trajectory could pose risks in a real-world deployment and is therefore noted here for completeness.
The target success rate for the Spring Validation Demo (SVD) was set to 40%. The system exceeded this target, achieving a 100% success rate across the ten test runs.
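The comparison against the 40% target can be reproduced with a short tally. The Python sketch below is illustrative only: the outcome labels are transcribed from Table 1, while the names `outcomes`, `TARGET_RATE`, and `success_rate` are hypothetical and not part of the project code.

```python
# Outcomes transcribed from Table 1.
# "ok"   = clean success
# "warn" = success, but the cutter followed a risky path (run 3)
outcomes = ["ok", "ok", "warn", "ok", "ok", "ok", "ok", "ok", "ok", "ok"]

TARGET_RATE = 0.40  # SVD target success rate for simulation


def success_rate(results, count_warnings=True):
    """Fraction of runs counted as successful.

    With count_warnings=False, runs flagged "warn" are excluded,
    which gives a stricter figure for comparison.
    """
    accepted = {"ok", "warn"} if count_warnings else {"ok"}
    return sum(r in accepted for r in results) / len(results)


rate_all = success_rate(outcomes)
rate_clean = success_rate(outcomes, count_warnings=False)

print(f"All successes:   {rate_all:.0%} (target {TARGET_RATE:.0%})")
print(f"Clean successes: {rate_clean:.0%}")
```

Reporting the stricter figure alongside the headline rate keeps the risky-path run visible without changing the success criterion.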
2. Perception Subsystem Results
The perception subsystem detects and segments both peppers and peduncles to enable reliable downstream grasping and cutting. Two dedicated models were trained and evaluated for this purpose: one for pepper segmentation and one for peduncle segmentation. Both models were evaluated on the BUP20 dataset using mAP@50, Precision, and Recall. Example segmentation results are shown in Figure 1, and the metrics are summarized in Table 2.

Figure 1: Pepper Segmentation Results on BUP20 dataset (left) and Peduncle Segmentation Results (right)
Table 2: Quantitative Evaluation of Pepper and Peduncle Segmentation Models
Metric | Pepper | Peduncle |
mAP@50 | 92.5% | 92.6% |
Precision | 94.1% | 96.26% |
Recall | 86.0% | 85.75% |
Overall, both models delivered segmentation performance sufficient for SVD conditions. Recall (about 86% for both models) is the weakest of the three metrics, and future work will aim to improve it in challenging scenarios.
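For readers less familiar with the metrics in Table 2, the sketch below shows how precision and recall follow from detection counts. It is a minimal illustration, not the evaluation code used for this report: the function name `precision_recall` is hypothetical, and the counts in the usage example are made up for demonstration and are not taken from the BUP20 evaluation. mAP@50 is not computed here.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from detection counts.

    tp: true positives  (predicted masks matched to a ground-truth object)
    fp: false positives (predicted masks with no matching object)
    fn: false negatives (ground-truth objects that were missed)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only (NOT from the report's evaluation).
p, r = precision_recall(tp=9, fp=1, fn=3)
print(f"precision={p:.2f}, recall={r:.2f}")
```

A model with high precision but lower recall, as in Table 2, rarely produces spurious masks but misses some objects, which is why recall is the natural focus for further work.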
3. Single Real Arm Planning Subsystem Results
For the hardware demonstration, a single xArm7 robotic arm was used, as the second arm had not yet been provided by the project sponsor. The available arm served as the gripper unit: it grasped the bell peppers with a soft gripper mounted at its end-effector and deposited them into the storage bin. In the absence of the pruner arm, the peduncle was detached manually to complete the demonstration workflow. The test environment contained four artificial plants, each bearing multiple green bell peppers. The peppers were fully visible and positioned at varying orientations, constrained to within 30 degrees of rotation to introduce controlled variability. The outcomes of the experiments conducted under these conditions are summarized in Table 3.
Table 3: Results of Validation Runs with Hardware Setup
No. | Outcome | Notes |
1 | ✅ | |
2 | ❌ | Planning – Failed to find valid grasp plan |
3 | ✅ | |
4 | ✅ | |
5 | ❌ | Perception – Fine Pose estimate incorrect |
6 | ✅ | |
7 | ✅ | |
8 | ❌ | Planning – Incorrect Grasp Planning |
9 | ❌ | Misc – Grasp Failure due to plastic leaves |
10 | ❌ | Perception – False Positive Segmentation |
As shown in Table 3, two failures were related to the perception subsystem. In one, a false segmentation mask was generated in a region containing no peppers. In the other, an inaccurate fine pose estimate prevented the system from planning a valid trajectory from the pre-grasp pose to the grasp pose. Two further failures occurred in the motion planning subsystem. These were caused primarily by planned paths exceeding the virtual workspace boundaries and by trajectories leading to invalid joint configurations. The remaining failure was a grasp failure caused by the plastic leaves of the artificial plants.
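The failure breakdown described above can be tallied directly from the Notes column of Table 3. The Python sketch below is illustrative: the category labels are transcribed from Table 3, while the names `failure_notes` and `failures_by_subsystem` are hypothetical.

```python
from collections import Counter

# Failed runs from Table 3, as (run number, subsystem, note).
failure_notes = [
    (2, "Planning", "Failed to find valid grasp plan"),
    (5, "Perception", "Fine pose estimate incorrect"),
    (8, "Planning", "Incorrect grasp planning"),
    (9, "Misc", "Grasp failure due to plastic leaves"),
    (10, "Perception", "False positive segmentation"),
]

# Count failures per subsystem.
failures_by_subsystem = Counter(subsystem for _, subsystem, _ in failure_notes)

for subsystem, count in failures_by_subsystem.most_common():
    print(f"{subsystem}: {count}")
```

Grouping failures this way makes it easier to see which subsystem to prioritize in later test cycles.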
The target success criterion for the SVD was a 25% success rate. In practice, the system achieved successful harvesting in 3.5 out of 6 attempts (about 58%). The half success reflects a case in which the pepper was not securely grasped but was still placed into the storage bin. The hardware demonstration therefore exceeded the SVD target.
4. SVD Quantitative Results
The quantitative results of our system against the performance requirements are summarized in Table 4. All listed requirements were met, with each measured value at or better than its target.
Table 4: Quantitative Metrics of SVD Results
Functional Requirement | ID | Performance Requirement | Desired | Actual |
Identify, Localize and Prioritize Green Peppers | PR.01 | Detect fully visible peppers > 70% of the time | 70% | 86% |
Identify, Localize and Prioritize Green Peppers | PR.02 | Produce estimated pose of green pepper and peduncle within 3 cm of ground truth depth and within 2 cm of other coordinates and up to 30 degrees in each rotation axis | 3 cm | 2.4 cm |
Harvest Green Pepper | PR.04 | Reach target green peppers > 70% of the time | 70% | 100% (Sim), 80% (Real) |
Minimize Green Pepper Damage | PR.08 | Avoid deformation and damage to 90% of picked green peppers | 90% | 95% |
Overall Success | PR.09 | Avoid visible damage to harvested green pepper > 90% of the time | 90% | 100% |
Overall Success | PR.10 | Harvest a fully visible green pepper in testbed autonomously within 100 seconds | 100 s | 68 s (1 arm), 73 s (2 arms) |
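As an illustration of how the positional part of PR.02 might be checked for a single estimate, the Python sketch below compares an estimated position against ground truth using the 3 cm depth and 2 cm lateral tolerances from Table 4. Several points are assumptions rather than details from the report: positions are taken to be in metres, the depth axis is assumed to be index 0 (x), and the names `position_within_tolerance`, `DEPTH_TOL_M`, and `LATERAL_TOL_M` are hypothetical. The 30 degree rotation clause of PR.02 is not modelled here.

```python
DEPTH_TOL_M = 0.03    # PR.02: within 3 cm of ground-truth depth
LATERAL_TOL_M = 0.02  # PR.02: within 2 cm in the other coordinates


def position_within_tolerance(estimate, truth, depth_axis=0):
    """Check an estimated [x, y, z] position against ground truth.

    depth_axis selects which coordinate is treated as depth
    (assumed to be x here; adjust for the actual camera frame).
    """
    for axis in range(3):
        tolerance = DEPTH_TOL_M if axis == depth_axis else LATERAL_TOL_M
        if abs(estimate[axis] - truth[axis]) > tolerance:
            return False
    return True


# Hypothetical example values, in metres.
truth = [0.28, -0.46, 0.53]
good_estimate = [0.305, -0.45, 0.54]  # 2.5 cm depth error, 1 cm lateral errors
bad_estimate = [0.28, -0.435, 0.53]   # 2.5 cm lateral error exceeds 2 cm

print(position_within_tolerance(good_estimate, truth))
print(position_within_tolerance(bad_estimate, truth))
```

Keeping the depth and lateral tolerances as separate constants mirrors the way PR.02 states them and makes it straightforward to tighten either one independently.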