LeRobot datasets and training
LeRobot is Hugging Face’s framework for robot learning data and policy training. On the IO-AI data platform, you can export annotated robot data as reusable LeRobot datasets, then fine-tune policies on them with LeRobot, OpenPI, ACT, Spirit-v1.5, or other stacks.
This page covers export and validation, then a training overview. Before using the official LeRobot training stack, read Install LeRobot. Model-specific commands are in the dedicated guides.
Exporting data
The platform can export annotated data in LeRobot format. Before export, make sure the following are stable:
- `observation.images.*`: camera images or video; keep camera names consistent across the project.
- `observation.state`: robot state, usually joint or end-effector state.
- `action`: control commands, strictly time-aligned with `observation.state`.
- `task`: natural-language task description. VLA models usually depend on this field; if it is missing, some training images allow a `--prompt` fallback.
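For orientation, the `features` block of an exported dataset's `meta/info.json` describes these fields. The sketch below is illustrative only; the camera name, shapes, and dtypes are assumptions and will differ per robot and export settings.

```python
# Illustrative sketch of the "features" block in meta/info.json.
# Camera name, shapes, and dtypes are assumptions; your export will differ.
features = {
    "observation.images.cam_front": {
        "dtype": "video",
        "shape": [480, 640, 3],
        "names": ["height", "width", "channel"],
    },
    "observation.state": {"dtype": "float32", "shape": [7], "names": None},
    "action": {"dtype": "float32", "shape": [7], "names": None},
}
```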
Export settings
Suggested export parameters by training goal:
| Parameter | Recommendation |
|---|---|
| Sampling rate | Preserve the robot control rate where possible; 10–30 Hz is common. Avoid aggressive downsampling just to shrink file size. |
| Image format | Prefer MP4 for large-scale training; use JPG temporarily when debugging image quality. |
| Strict I/O match | Enable to reduce misalignment between observations, language, and actions. |
| Face blur | Enable when people appear on camera and data will be distributed externally. |
After export, download the `.tar.gz` and extract it to a dedicated folder. Training paths such as `/path/to/lerobot_dataset` must point at the directory that contains `meta/info.json`, not its parent.
```
your_dataset/
├── meta/
│   └── info.json
├── data/
└── videos/
```
Topic mapping
The platform infers state and action fields from ROS/ROS2 topic names:
- Topics ending with `/joint_state` or `/joint_states`: their `position` is written to `observation.state`.
- Topics ending with `/joint_cmd` or `/joint_command`: their `position` is written to `action`.
If your project uses different names, normalize them at recording time. If existing data cannot be renamed, plan a mapping layer early, as sketched below; do not wait until training fails.
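Such a mapping layer can be as small as a lookup that normalizes topic names before export. The sketch below is hypothetical; `TOPIC_MAP` and the topic names are placeholders for your own recording pipeline:

```python
# Hypothetical remapping layer: normalize custom topic names to the
# conventions the exporter expects. Adapt TOPIC_MAP to your recordings.
TOPIC_MAP = {
    "/arm/custom_state": "/arm/joint_states",  # -> observation.state
    "/arm/custom_cmd": "/arm/joint_command",   # -> action
}

def normalize_topic(topic: str) -> str:
    """Return the exporter-friendly name for a recorded topic."""
    return TOPIC_MAP.get(topic, topic)
```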
Data validation
Validate at least once before training. Prefer LeRobot Studio to open the archive or extracted folder; it checks `meta/info.json`, features, episodes, tabular files, and videos for basic consistency.
For scripted local checks, load the dataset with LeRobot:
```python
from lerobot.datasets import LeRobotDataset

# repo_id is a local identifier; root points at the extracted dataset folder.
dataset = LeRobotDataset("local/my_dataset", root="/path/to/lerobot_dataset")
print(dataset.num_frames, dataset.num_episodes)
```
Successful loading does not guarantee the data is ideal for training, but it rules out wrong directory layout, missing metadata, and missing files.
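Beyond loading, it helps to pull one frame and confirm the fields from the export checklist are present. A minimal sketch, assuming the key names follow the conventions above:

```python
from lerobot.datasets import LeRobotDataset

dataset = LeRobotDataset("local/my_dataset", root="/path/to/lerobot_dataset")

# One frame should carry every field from the export checklist.
sample = dataset[0]
for key, value in sample.items():
    print(key, getattr(value, "shape", None))

assert "observation.state" in sample and "action" in sample
```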
Training paths
These guides follow one principle: official source code and model cards define behavior and limits, while the Docker images exist to reproduce the pipeline quickly. In practice, start with IO-AI’s published `ioaitech/*` images to close the training loop, then switch to upstream source when you need deeper changes.
| Model | Recommended entry | Typical use | Guide |
|---|---|---|---|
| Pi0 / Pi0.5 | ioaitech/train_openpi:pi0, ioaitech/train_openpi:pi05 | Large VLA fine-tuning; OpenPI base weights and JAX/FSDP/LoRA stack | Pi0 and Pi0.5 |
| SmolVLA | ioaitech/lerobot-gpu:v0.5.0 | Single-GPU baseline; official LeRobot v0.5.0 commands | SmolVLA |
| ACT | ioaitech/train_act:cuda | Single-task imitation; LeRobot data bridged to ACT/HDF5 | ACT |
| Spirit-v1.5 | ioaitech/train_spirit:1.5 | Frontier VLA; RoboChallenge layout or convertible LeRobot data | Spirit-v1.5 |
| Diffusion Policy | ioaitech/lerobot-gpu:v0.5.0 or upstream source | Smooth continuous action trajectories | Diffusion Policy |
Pre-flight checks
All GPU images need the NVIDIA Container Toolkit. Confirm the GPU is visible inside a container:
```bash
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```
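Inside the training image itself, a PyTorch-level check gives the same signal (a sketch, assuming the image ships PyTorch, as the LeRobot images do):

```python
import torch

# Expect True and a device count of at least 1 on a correctly configured host.
print(torch.cuda.is_available(), torch.cuda.device_count())
```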
Suggested host layout:
```
workspace/
├── dataset/   # extracted LeRobot dataset
└── output/    # checkpoints, logs, manifests
```
LeRobot v0.5.0 training entry point
After installing LeRobot v0.5.0, use the `lerobot-train` CLI. All LeRobot-framework training in these docs assumes that command.
Minimal example:
```bash
lerobot-train \
  --dataset.repo_id=local/my_dataset \
  --dataset.root=/path/to/lerobot_dataset \
  --policy.type=act \
  --output_dir=outputs/train/act_baseline \
  --job_name=act_baseline \
  --policy.device=cuda
```
Fine-tuning from a pretrained policy uses `--policy.path`, for example SmolVLA:

```bash
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=local/my_dataset \
  --dataset.root=/path/to/lerobot_dataset \
  --output_dir=outputs/train/smolvla_finetune \
  --job_name=smolvla_finetune \
  --policy.device=cuda
```
Official references
Technical claims in these pages are grounded in the following (URLs were checked when the docs were rewritten):
- LeRobot v0.5.0 repo, install guide, Docker: github.com/huggingface/lerobot, installation, docker.
- SmolVLA: LeRobot v0.5.0 SmolVLA, lerobot/smolvla_base.
- OpenPI: Physical-Intelligence/openpi.
- ACT: tonyzhaozh/act, arXiv:2304.13705.
- Spirit-v1.5: Spirit-AI-Team/spirit-v1.5, Spirit-AI-robotics/Spirit-v1.5.
- Diffusion Policy: diffusion-policy.cs.columbia.edu, LeRobot diffusion policy.
FAQ
Dataset format vs LeRobot package version
`codebase_version` in `meta/info.json` is the dataset format version, not the Python package version.
| Dataset format | Suggested training stack |
|---|---|
| v2.1 | Often LeRobot v0.3.x; OpenPI/ACT images include compatibility shims. |
| v3.0 | Prefer LeRobot v0.4+; current default is v0.5.x. |
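To see which format a given export uses, read the field directly (a minimal sketch; adjust the path):

```python
import json
from pathlib import Path

info = json.loads(Path("/path/to/lerobot_dataset/meta/info.json").read_text())
print(info["codebase_version"])  # e.g. "v2.1" or "v3.0"
```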
“Dataset not found” in the container
Check bind mounts. Inside the container you must have `/data/input/meta/info.json`. Mounting the parent of the dataset root is a common mistake.
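A quick check from inside the container (a sketch, using the `/data/input` mount point from above):

```python
from pathlib import Path

# False means the bind mount points one level too high or too low.
print(Path("/data/input/meta/info.json").is_file())
```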
Which model to train first
For the fastest closed loop, start with SmolVLA or ACT. Move to Pi0/Pi0.5 or Spirit-v1.5 when you need stronger language-conditioned generalization. Avoid running many large experiments in parallel; fix data, evaluation, and one baseline model first.
How to tell training is useful
Offline loss only shows fit on the training distribution. For robot policies, also measure:
- Success rate under fixed test initial conditions.
- Success across object poses, lighting, and backgrounds.
- Whether motion is continuous at inference, without stalls or limit violations.
- Whether failures cluster on specific scenes or camera views.
When to switch from the image to upstream source
Prefer upstream source when you need to:
- Change data mapping, action space, or robot I/O.
- Change model structure, freezing, or optimizers.
- Reproduce a paper or official benchmark exactly.
- Debug issues below the image wrapper (framework bugs, driver issues).