Fall Validation Demonstration – Performance Evaluation
FVD and FVD Encore, Fall 2025
The system performance was evaluated against the objectives and verification criteria outlined in the FVD. The primary goals included demonstrating fully functional autonomous harvesting with specific success rates, dual-arm teleoperation, and a segregated user interface.
During the initial FVD, the system demonstrated robust performance. In the first run, the system autonomously harvested 4 out of 5 peppers and successfully harvested the 5th via teleoperation. In the second run, despite some challenges with string cutting and positioning, the system achieved a net success rate of over 60%, satisfying the primary validation criteria.
During the FVD Encore, performance improved further. The system harvested 3 out of 6 peppers on the first run and 5 out of 6 on the second run. The system met the timing specification, harvesting peppers within the allocated 100-second window for autonomous operation. The teleoperation functionality was also verified, with the operator successfully harvesting peppers within the 3-minute limit. Table 1 summarizes the FVD performance against the requirements. Figure 1 is a bar chart of the average time taken by each task within the harvest cycle for one pepper, averaged over 48 runs.
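The per-run rates quoted above can be reproduced with a short calculation. The sketch below is illustrative only: the helper names are hypothetical, the second FVD run is left out because the report gives only "over 60%" rather than raw counts, and pooling the two Encore runs is an assumption about how an "overall" rate might be computed.

```python
from fractions import Fraction

THRESHOLD = Fraction(60, 100)  # 60% autonomous harvest criterion from the FVD

# (harvested, attempted) counts stated in the text. The second FVD run is
# omitted because only "over 60%" is reported, not the raw counts.
runs = {
    "FVD run 1": (4, 5),
    "Encore run 1": (3, 6),
    "Encore run 2": (5, 6),
}


def success_rate(harvested: int, attempted: int) -> Fraction:
    """Exact fraction of attempted peppers that were harvested."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return Fraction(harvested, attempted)


def pooled_rate(counts) -> Fraction:
    """Rate over several runs combined (assumed meaning of 'overall')."""
    harvested = sum(h for h, _ in counts)
    attempted = sum(a for _, a in counts)
    return success_rate(harvested, attempted)


for name, (h, a) in runs.items():
    print(f"{name}: {float(success_rate(h, a)):.1%}")

encore = pooled_rate([runs["Encore run 1"], runs["Encore run 2"]])
print(f"Encore pooled: {float(encore):.1%}, meets 60%: {encore >= THRESHOLD}")
```

Using `Fraction` keeps the threshold comparison exact, which matters when a pooled rate sits close to the 60% boundary.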
Table 1: Summary of FVD Performance against Requirements
| FVD Verification Criterion | Success Metric | Requirements |
| Autonomous Harvest Success Rate | Overall, at least 60% of the green peppers attempted by the autonomous mode should be harvested successfully. | PR.04: Reach target green peppers >70%; PR.05: Cut identified pepper peduncle >60%; PR.06: Store harvested green pepper >80% |
| Autonomous Harvest Speed | Out of successfully harvested peppers, at least one pepper is harvested within 100 seconds from initial arm homing to storage. | PR.10: Harvest a fully visible green pepper in testbed autonomously within 100 seconds |
| Max Harvest Duration | Each pepper should take no longer than 3 minutes to harvest or be skipped. | FR.03: Plan to reach for peppers; PR.03: Plan paths within deviation limits |
| Teleoperation Success | In teleoperation mode, the user must be able to successfully harvest at least one pepper and place it within the storage bin within 3 minutes. | FR.06: Adapt user inputs to Teleoperation Output Movements; PR.07: Ensure cumulative joint angle error is minimized |
| Workspace Reachability | Peppers within the valid workspace (0.3 m to 0.7 m off the ground, 0.8 m to 1.0 m from the base) were attempted. | NFR.01: Reach green peppers as far as 70 cm away; FR.02: Localize and Prioritize Peppers |
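The workspace bounds in the last row of Table 1 amount to a simple range check on each candidate pepper. The following is a minimal sketch, assuming the "from base" figure is a horizontal distance. The `PepperLocation` type, the function name, and the sample locations are hypothetical and not taken from the project code.

```python
from dataclasses import dataclass

# Bounds from the Workspace Reachability row of Table 1, in metres.
HEIGHT_RANGE_M = (0.3, 0.7)    # height off the ground
DISTANCE_RANGE_M = (0.8, 1.0)  # distance from the robot base (assumed horizontal)


@dataclass(frozen=True)
class PepperLocation:
    """Hypothetical container for a localized pepper."""
    height_m: float
    distance_m: float


def in_valid_workspace(p: PepperLocation) -> bool:
    """True when the pepper lies inside both ranges (bounds inclusive)."""
    lo_h, hi_h = HEIGHT_RANGE_M
    lo_d, hi_d = DISTANCE_RANGE_M
    return lo_h <= p.height_m <= hi_h and lo_d <= p.distance_m <= hi_d


# Made-up locations for illustration: only the first is inside the workspace.
candidates = [
    PepperLocation(height_m=0.5, distance_m=0.9),
    PepperLocation(height_m=0.2, distance_m=0.9),  # too low
    PepperLocation(height_m=0.5, distance_m=1.2),  # too far
]
attempted = [p for p in candidates if in_valid_workspace(p)]
print(f"{len(attempted)} of {len(candidates)} candidates would be attempted")
```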

Figure 1: Task-wise Time Breakdown
Spring Validation Demo – Performance Evaluation
SVD and SVD Encore, Spring 2025
1. Bimanual Simulation Results
In the Gazebo simulation environment, a dual-arm robotic system was configured, with one arm carrying the gripper and the other carrying a pruner as its end-effector. The bell pepper was initialized in a fully visible pose, and the workspace was kept free of obstacles to closely approximate the conditions defined for the SVD. Following extensive trials and validation runs, the results from ten executions are summarized in Table 2.
Table 2: Results of Validation Runs in Simulation
| No. | Result | Simulated Pose (x, y, z, roll, pitch, yaw) | Notes |
| 1 | ✅ | [0.28, -0.46, 0.53, 0.26, -0.26, 0.0] | Success |
| 2 | ✅ | [0.45, -0.63, 0.48, 0.38, 0.43, 0.0] | Success |
| 3 | ⚠️ | [0.3, -0.5, 0.57, -0.46, 0.37, 0.0] | Success, but cutter had risky pathing |
| 4 | ✅ | [0.34, -0.56, 0.57, 0.13, 0.18, 0.0] | Success |
| 5 | ✅ | [0.31, -0.46, 0.46, -0.01, 0.46, 0.0] | Success |
| 6 | ✅ | [0.27, -0.46, 0.49, -0.04, -0.3, 0.0] | Success |
| 7 | ✅ | [0.27, -0.63, 0.49, -0.33, 0.27, 0.0] | Success |
| 8 | ✅ | [0.25, -0.54, 0.45, -0.27, -0.27, 0.0] | Success |
| 9 | ✅ | [0.26, -0.59, 0.45, 0.02, -0.49, 0.0] | Success |
| 10 | ✅ | [0.4, -0.56, 0.45, -0.03, 0.48, 0.0] | Success |
As shown in Table 2, all ten simulation runs met the success criteria, nine of them without issue. In the remaining run (run 3), the cutter reached the target and completed the operation, but the generated path was unnecessarily long and suboptimal. Although this run counts as a success in simulation, such a trajectory could pose risks in a real-world deployment and is therefore noted here for completeness.
The target success rate for the Spring Validation Demo (SVD) was 40%. The system exceeded this target, achieving a 100% success rate across the ten simulated runs.
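The simulated poses in Table 2 can also be checked against the 30-degree rotation envelope used elsewhere in this report. The sketch below assumes that roll, pitch, and yaw are given in radians, since the source does not state the unit. The labels `"ok"` and `"warn"` are illustrative stand-ins for the ✅ and ⚠️ symbols.

```python
import math

# Rows of Table 2: (run number, outcome label, [x, y, z, roll, pitch, yaw]).
RUNS = [
    (1, "ok", [0.28, -0.46, 0.53, 0.26, -0.26, 0.0]),
    (2, "ok", [0.45, -0.63, 0.48, 0.38, 0.43, 0.0]),
    (3, "warn", [0.3, -0.5, 0.57, -0.46, 0.37, 0.0]),
    (4, "ok", [0.34, -0.56, 0.57, 0.13, 0.18, 0.0]),
    (5, "ok", [0.31, -0.46, 0.46, -0.01, 0.46, 0.0]),
    (6, "ok", [0.27, -0.46, 0.49, -0.04, -0.3, 0.0]),
    (7, "ok", [0.27, -0.63, 0.49, -0.33, 0.27, 0.0]),
    (8, "ok", [0.25, -0.54, 0.45, -0.27, -0.27, 0.0]),
    (9, "ok", [0.26, -0.59, 0.45, 0.02, -0.49, 0.0]),
    (10, "ok", [0.4, -0.56, 0.45, -0.03, 0.48, 0.0]),
]

MAX_ROTATION_DEG = 30.0  # rotation limit quoted in this report

# Both "ok" and "warn" runs completed the harvest; "warn" marks risky pathing.
successes = sum(1 for _, outcome, _ in RUNS if outcome in ("ok", "warn"))
flagged = [run for run, outcome, _ in RUNS if outcome == "warn"]

# Largest absolute rotation about any axis, converted from assumed radians.
max_rot_rad = max(abs(angle) for _, _, pose in RUNS for angle in pose[3:])
max_rot_deg = math.degrees(max_rot_rad)

print(f"success rate: {successes / len(RUNS):.0%}, flagged runs: {flagged}")
print(f"largest rotation: {max_rot_deg:.1f} deg (limit {MAX_ROTATION_DEG:.0f} deg)")
```

Under the radian assumption, the largest rotation in the table is 0.49 rad, which is about 28 degrees and therefore inside the 30-degree limit.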
2. Perception Subsystem Results
The perception subsystem is responsible for accurately detecting and segmenting both peppers and peduncles to enable reliable downstream grasping and cutting actions. Two dedicated models were trained and evaluated for this purpose: one for pepper segmentation and the other for peduncle segmentation. Performance evaluation was conducted using the BUP20 dataset, as shown in Figure 2, with key metrics summarized in Table 3. The segmentation models for both pepper and peduncle were quantitatively evaluated using mAP@50, Precision, and Recall.

Figure 2: Pepper Segmentation Results on BUP20 dataset (left) and Peduncle Segmentation Results (right)
Table 3: Quantitative Evaluation of Pepper and Peduncle Segmentation Models
| Metric | Pepper | Peduncle |
| mAP@50 | 92.5% | 92.6% |
| Precision | 94.1% | 96.26% |
| Recall | 86.0% | 85.75% |
Overall, both models delivered reliable segmentation performance sufficient for SVD conditions, with future work aimed at improving recall for challenging scenarios.
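Because precision is noticeably higher than recall for both models, a combined figure such as the F1 score can help summarize the trade-off. The F1 score is not reported in the original evaluation. The sketch below derives it from the Table 3 values using the standard harmonic-mean definition.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) from Table 3, converted from percent to fractions.
METRICS = {
    "Pepper": (0.941, 0.860),
    "Peduncle": (0.9626, 0.8575),
}

f1 = {name: f1_score(p, r) for name, (p, r) in METRICS.items()}
for name, value in f1.items():
    print(f"{name}: F1 = {value:.3f}")
```

Both derived scores come out close to 0.90, with recall as the limiting term in each case.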
3. Single Real Arm Planning Subsystem Results
For the hardware demonstration, a single xArm7 robotic arm was utilized, as the second arm had not yet been provided by the project sponsor. The available arm was designated as the gripper unit, responsible for grasping the bell peppers using a soft gripper mounted at its end-effector and depositing them into the storage bin. In the absence of the pruner arm, the peduncle was detached manually to complete the demonstration workflow. The test environment was configured with four artificial plants, each bearing multiple green bell peppers. The peppers were fully visible and positioned with varying orientations, constrained within 30 degrees of rotation to introduce controlled variability. The outcomes of the experiments conducted under these conditions are summarized in Table 4.
Table 4: Results of Validation Runs with Hardware Setup
| No. | Outcome | Notes |
| 1 | ✅ | |
| 2 | ❌ | Planning – Failed to find valid grasp plan |
| 3 | ✅ | |
| 4 | ✅ | |
| 5 | ❌ | Perception – Fine Pose estimate incorrect |
| 6 | ✅ | |
| 7 | ✅ | |
| 8 | ❌ | Planning – Incorrect Grasp Planning |
| 9 | ❌ | Misc – Grasp Failure due to plastic leaves |
| 10 | ❌ | Perception – False Positive Segmentation |
As shown in Table 4, the validation runs had two failures related to the perception subsystem. One involved a false segmentation mask generated in a region devoid of peppers, while the other resulted from an inaccurate fine pose estimate, which prevented the gripper from planning a valid pre-grasp-to-grasp trajectory. Two further failures were observed in the motion planning subsystem. These were primarily caused by planned paths exceeding virtual workspace boundaries and by trajectories leading to invalid joint configurations. The remaining failure was a grasp failure caused by the plastic leaves of the artificial plants.
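The failure breakdown can be tallied directly from the Notes column of Table 4. The sketch below is a minimal illustration: it assumes each failure note starts with the subsystem name followed by an en dash, as in the table, and the helper name `subsystem` is hypothetical.

```python
from collections import Counter

# Notes column of Table 4 in run order; None marks a successful run.
NOTES = [
    None,
    "Planning – Failed to find valid grasp plan",
    None,
    None,
    "Perception – Fine Pose estimate incorrect",
    None,
    None,
    "Planning – Incorrect Grasp Planning",
    "Misc – Grasp Failure due to plastic leaves",
    "Perception – False Positive Segmentation",
]


def subsystem(note: str) -> str:
    """Return the text before the first en dash, i.e. the subsystem label."""
    return note.split("–", 1)[0].strip()


failures = Counter(subsystem(n) for n in NOTES if n is not None)
successes = sum(n is None for n in NOTES)

print(f"successes: {successes} of {len(NOTES)}")
for name, count in failures.most_common():
    print(f"{name}: {count}")
```

This counts only the rows listed in Table 4 and gives no partial credit.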
The target success criterion for the SVD was a 25% success rate. In practice, the system achieved successful harvesting in 3.5 out of 6 attempts, or about 58% (the half credit reflects a case where the pepper was not securely grasped but was still placed into the storage bin). The system therefore exceeded the SVD target.
4. SVD Quantitative Results
The quantitative results of the system against the performance requirements are summarized in Table 5. All listed requirements were met, with every measured value better than its target.
Table 5: Quantitative Metrics of SVD Results
| Functional Requirement | ID | Performance Requirement | Desired | Actual |
| Identify, Localize and Prioritize Green Peppers | PR.01 | Detect fully visible peppers > 70% of the time | 70% | 86% |
| | PR.02 | Produce an estimated pose of the green pepper and peduncle within 3 cm of ground-truth depth, within 2 cm in the other coordinates, and for up to 30 degrees in each rotation axis | 3 cm | 2.4 cm |
| Harvest Green Pepper | PR.04 | Reach target green peppers > 70% of the time | 70% | 100% (Sim), 80% (Real) |
| Minimize Green Pepper Damage | PR.08 | Avoids deformation and damage to 90% of picked green peppers | 90% | 95% |
| Overall Success | PR.09 | Avoid visible damage to harvested green pepper > 90% of the time | 90% | 100% |
| | PR.10 | Harvest a fully visible green pepper in testbed autonomously within 100 seconds | 100 s | 68 s (1 arm), 73 s (2 arms) |
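The comparison in Table 5 mixes two kinds of targets: rates where higher is better, and errors or times where lower is better. The sketch below makes that distinction explicit. The `Requirement` type and its field names are hypothetical, and the relative margin is a derived illustration rather than a figure from the original evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """Hypothetical record for one row (or sub-result) of Table 5."""
    label: str
    desired: float
    actual: float
    higher_is_better: bool

    def met(self) -> bool:
        if self.higher_is_better:
            return self.actual >= self.desired
        return self.actual <= self.desired

    def margin(self) -> float:
        """Relative margin over the target; positive means better than target."""
        diff = self.actual - self.desired
        if not self.higher_is_better:
            diff = -diff
        return diff / self.desired


TABLE_5 = [
    Requirement("PR.01 detection rate (%)", 70, 86, True),
    Requirement("PR.02 pose error (cm)", 3, 2.4, False),
    Requirement("PR.04 reach rate, sim (%)", 70, 100, True),
    Requirement("PR.04 reach rate, real (%)", 70, 80, True),
    Requirement("PR.08 undamaged rate (%)", 90, 95, True),
    Requirement("PR.09 no visible damage (%)", 90, 100, True),
    Requirement("PR.10 cycle time, 1 arm (s)", 100, 68, False),
    Requirement("PR.10 cycle time, 2 arms (s)", 100, 73, False),
]

for req in TABLE_5:
    status = "met" if req.met() else "NOT met"
    print(f"{req.label}: {status}, margin {req.margin():+.1%}")

all_met = all(req.met() for req in TABLE_5)
print(f"all requirements met: {all_met}")
```

Listing the margins per row shows how far each result sits from its threshold, which varies considerably between requirements.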