
VLA · Hardware: Max · 10 Hz inference · TensorRT FP16

GR00T VLA pick-and-place

Language-conditioned manipulation with NVIDIA GR00T N1.5.

About this demo

Speak the task ('pick up the red block, put it in the bowl'); a vision-language-action policy plans the grasp, executes the trajectory, and reports success. Runs on Jetson Thor-class (e.g., T4000/T5000) compute with TensorRT-optimized inference. Reference implementation: SIGRobotics' Matcha Bot, embodied-AI hackathon winner.

Highlights

→ GR00T N1.5 backbone
→ Natural-language task prompting
→ Closed-loop visual servoing
→ Real-time grasp validation
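
To make the closed-loop idea concrete, here is a minimal sketch of a fixed-rate control loop running at the 10 Hz rate quoted above. It is not the GR00T N1.5 API or the demo's actual code: `Observation`, `stub_policy`, and `run_episode` are hypothetical names, the stub policy is a simple proportional step toward a known target rather than a vision-language-action model, and the "success" check is a plain distance threshold standing in for grasp validation.

```python
import time
from dataclasses import dataclass
from typing import List, Sequence, Tuple

CONTROL_HZ = 10.0  # the 10 Hz inference rate quoted on this page


@dataclass
class Observation:
    # Hypothetical observation: a real VLA policy would consume camera images.
    gripper_xyz: Sequence[float]  # end-effector position in metres
    target_xyz: Sequence[float]   # object position in metres


def stub_policy(prompt: str, obs: Observation) -> List[float]:
    """Placeholder for a VLA policy: a clamped proportional step toward the target.

    The prompt is accepted but ignored; a real policy would condition on it.
    """
    gain, max_step = 0.5, 0.02
    return [max(-max_step, min(max_step, gain * (t - g)))
            for g, t in zip(obs.gripper_xyz, obs.target_xyz)]


def run_episode(prompt: str, obs: Observation, policy=stub_policy,
                max_steps: int = 200, tol: float = 0.005,
                sleep=time.sleep, clock=time.monotonic) -> Tuple[bool, int]:
    """Observe, act, and re-observe at a fixed rate until within `tol` of the target."""
    period = 1.0 / CONTROL_HZ
    for step in range(max_steps):
        start = clock()
        error = max(abs(t - g) for g, t in zip(obs.gripper_xyz, obs.target_xyz))
        if error < tol:
            return True, step  # report success and the number of steps taken
        action = policy(prompt, obs)
        # Simulated execution: apply the step directly to the gripper position.
        moved = [g + a for g, a in zip(obs.gripper_xyz, action)]
        obs = Observation(moved, obs.target_xyz)
        # Sleep for whatever remains of this control period.
        sleep(max(0.0, period - (clock() - start)))
    return False, max_steps


if __name__ == "__main__":
    start_obs = Observation([0.0, 0.0, 0.0], [0.1, 0.0, 0.05])
    print(run_episode("pick up the red block, put it in the bowl", start_obs))
```

The `sleep` and `clock` parameters are injectable so the loop can be exercised without waiting in real time; on hardware, the simulated position update would be replaced by a robot command and a fresh camera observation.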

Supported robots

Franka Panda, UR5e, SO-101 dual-arm

Related demos

OpenVLA grasping

Open-source VLA model for manipulation — no proprietary checkpoints.

VLA

VLM agent in Isaac Sim

Vision-language reasoning over a simulated robotics scene.

Teleop

CockPit

Predefined teleop dashboard with dual cameras, 3D model, map, and controls.
