VLA · Hardware: Max · 10 Hz inference · TensorRT FP16
GR00T VLA pick-and-place
Language-conditioned manipulation with NVIDIA GR00T N1.5.
About this demo
Speak the task (for example, 'pick up the red block, put it in the bowl'); a vision-language-action policy plans the grasp, executes the trajectory, and reports whether the task succeeded. Runs on Jetson Thor-class (e.g., T4000/T5000) compute with TensorRT-optimized inference. Reference implementation: SIGRobotics' Matcha Bot, embodied-AI hackathon winner.
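The flow described above (observe, query the policy with the language instruction, act, check for success) can be sketched as a fixed-rate control loop. This is a minimal illustration only, not the demo's actual code and not the GR00T API: `Observation`, `run_episode`, and the callables passed to it are hypothetical names, and the policy is assumed to map an observation plus an instruction to a joint-position command. The only figure taken from this page is the 10 Hz rate.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Sequence

CONTROL_HZ = 10.0  # the inference rate quoted for this demo


@dataclass
class Observation:
    """Hypothetical observation container (not a GR00T type)."""
    image: object                 # camera frame placeholder
    joint_positions: List[float]  # current joint state


# Assumed policy signature: (observation, instruction) -> joint-position command.
Policy = Callable[[Observation, str], Sequence[float]]


def run_episode(
    policy: Policy,
    get_observation: Callable[[], Observation],
    send_action: Callable[[Sequence[float]], None],
    is_success: Callable[[Observation], bool],
    instruction: str,
    max_steps: int = 100,
    hz: float = CONTROL_HZ,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Run one closed-loop episode; return True if the success check passes."""
    period = 1.0 / hz
    for _ in range(max_steps):
        start = clock()
        obs = get_observation()          # fresh observation every cycle
        if is_success(obs):              # validate before acting again
            return True
        action = policy(obs, instruction)
        send_action(action)
        # Sleep off the rest of the cycle so the loop holds roughly `hz`.
        remaining = period - (clock() - start)
        if remaining > 0:
            sleep(remaining)
    # Out of steps: report the final state rather than assuming failure.
    return is_success(get_observation())
```

Because the robot, camera, and policy are injected as plain callables, the loop can be exercised with stubs (and with `sleep` replaced by a no-op) before it is pointed at real hardware.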
Highlights
- GR00T N1.5 backbone
- Natural-language task prompting
- Closed-loop visual servoing
- Real-time grasp validation
Supported robots
Franka Panda · UR5e · SO-101 dual-arm