LeRobot datasets and training
LeRobot is Hugging Face’s framework for robot learning data and policy training. On the IO-AI data platform, you can export annotated robot data as reusable LeRobot datasets, then fine-tune policies on them with LeRobot, OpenPI, ACT, Spirit-v1.5, or other stacks.
This page covers export and validation, then a training overview. Before using the official LeRobot training stack, read Install LeRobot. Model-specific commands are in the dedicated guides.
Exporting data
The platform can export annotated data in LeRobot format. Before export, make sure the following are stable:
- `observation.images.*`: camera images or video; keep camera names consistent across the project.
- `observation.state`: robot state, usually joint or end-effector state.
- `action`: control commands, strictly time-aligned with `observation.state`.
- `task`: natural-language task description. VLA models usually depend on this field; if it is missing, some training images allow a `--prompt` fallback.
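For orientation, the `features` block of an exported dataset's `meta/info.json` describes these fields. The sketch below is illustrative only; the camera name, shapes, and dtypes are assumptions and will differ per robot and export settings.

```python
# Illustrative sketch of the "features" block in meta/info.json.
# Camera name, shapes, and dtypes are assumptions; your export will differ.
features = {
    "observation.images.cam_front": {
        "dtype": "video",
        "shape": [480, 640, 3],
        "names": ["height", "width", "channel"],
    },
    "observation.state": {"dtype": "float32", "shape": [7], "names": None},
    "action": {"dtype": "float32", "shape": [7], "names": None},
}
```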
Export settings
Suggested export parameters by training goal:
| Parameter | Recommendation |
|---|---|
| Sampling rate | Preserve the robot control rate where possible; 10–30 Hz is common. Avoid aggressive downsampling just to shrink file size. |
| Image format | Prefer MP4 for large-scale training; use JPG temporarily when debugging image quality. |
| Strict I/O match | Enable to reduce misalignment between observations, language, and actions. |
| Face blur | Enable when people appear on camera and data will be distributed externally. |
After export, download the `.tar.gz` and extract it to a dedicated folder. Training paths such as `/path/to/lerobot_dataset` must point at the directory that contains `meta/info.json`, not its parent.
```
your_dataset/
├── meta/
│   └── info.json
├── data/
└── videos/
```
Topic mapping
The platform infers state and action fields from ROS/ROS2 topic names:
- Topics ending with `/joint_state` or `/joint_states`: their `position` is written to `observation.state`.
- Topics ending with `/joint_cmd` or `/joint_command`: their `position` is written to `action`.
If your project uses different names, normalize them at recording time. If existing data cannot be renamed, plan a mapping layer early, as sketched below; do not wait until training fails.
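Such a mapping layer can be as small as a lookup that normalizes topic names before export. The sketch below is hypothetical; `TOPIC_MAP` and the topic names are placeholders for your own recording pipeline:

```python
# Hypothetical remapping layer: normalize custom topic names to the
# conventions the exporter expects. Adapt TOPIC_MAP to your recordings.
TOPIC_MAP = {
    "/arm/custom_state": "/arm/joint_states",  # -> observation.state
    "/arm/custom_cmd": "/arm/joint_command",   # -> action
}

def normalize_topic(topic: str) -> str:
    """Return the exporter-friendly name for a recorded topic."""
    return TOPIC_MAP.get(topic, topic)
```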
Data validation
Validate at least once before training. Prefer LeRobot Studio to open the archive or extracted folder; it checks `meta/info.json`, features, episodes, tabular files, and videos for basic consistency.
For scripted local checks, load the dataset with LeRobot:
```python
from lerobot.datasets import LeRobotDataset

# repo_id is a local identifier; root points at the extracted dataset folder.
dataset = LeRobotDataset("local/my_dataset", root="/path/to/lerobot_dataset")
print(dataset.num_frames, dataset.num_episodes)
```
Successful loading does not guarantee the data is ideal for training, but it rules out wrong directory layout, missing metadata, and missing files.
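Beyond loading, it helps to pull one frame and confirm the fields from the export checklist are present. A minimal sketch, assuming the key names follow the conventions above:

```python
from lerobot.datasets import LeRobotDataset

dataset = LeRobotDataset("local/my_dataset", root="/path/to/lerobot_dataset")

# One frame should carry every field from the export checklist.
sample = dataset[0]
for key, value in sample.items():
    print(key, getattr(value, "shape", None))

assert "observation.state" in sample and "action" in sample
```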
Training paths
These guides follow one principle: official source code and model cards define behavior and limits, while the Docker images exist to reproduce the pipeline quickly. In practice, start with IO-AI’s published `ioaitech/*` images to close the training loop, then switch to upstream source when you need deeper changes.
| Model | Recommended entry | Typical use | Guide |
|---|---|---|---|
| Pi0 / Pi0.5 | ioaitech/train_openpi:pi0, ioaitech/train_openpi:pi05 | Large VLA fine-tuning; OpenPI base weights and JAX/FSDP/LoRA stack | Pi0 and Pi0.5 |
| SmolVLA | ioaitech/lerobot-gpu:v0.5.0 | Single-GPU baseline; official LeRobot v0.5.0 commands | SmolVLA |
| ACT | ioaitech/train_act:cuda | Single-task imitation; LeRobot data bridged to ACT/HDF5 | ACT |
| Spirit-v1.5 | ioaitech/train_spirit:1.5 | Frontier VLA; RoboChallenge layout or convertible LeRobot data | Spirit-v1.5 |
| Diffusion Policy | ioaitech/lerobot-gpu:v0.5.0 or upstream source | Smooth continuous action trajectories | Diffusion Policy |
Pre-flight checks
All GPU images need the NVIDIA Container Toolkit. Confirm the GPU is visible inside a container:
```bash
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```
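Inside the training image itself, a PyTorch-level check gives the same signal (a sketch, assuming the image ships PyTorch, as the LeRobot images do):

```python
import torch

# Expect True and a device count of at least 1 on a correctly configured host.
print(torch.cuda.is_available(), torch.cuda.device_count())
```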
Suggested host layout:
```
workspace/
├── dataset/   # extracted LeRobot dataset
└── output/    # checkpoints, logs, manifests
```
LeRobot v0.5.0 training entry point
After installing LeRobot v0.5.0, use the `lerobot-train` CLI. All LeRobot-framework training in these docs assumes that command.
Minimal example:
```bash
lerobot-train \
  --dataset.repo_id=local/my_dataset \
  --dataset.root=/path/to/lerobot_dataset \
  --policy.type=act \
  --output_dir=outputs/train/act_baseline \
  --job_name=act_baseline \
  --policy.device=cuda
```
Fine-tuning from a pretrained policy uses `--policy.path`, for example SmolVLA:

```bash
lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=local/my_dataset \
  --dataset.root=/path/to/lerobot_dataset \
  --output_dir=outputs/train/smolvla_finetune \
  --job_name=smolvla_finetune \
  --policy.device=cuda
```
Official references
Technical claims in these pages are grounded in the following (URLs were checked when the docs were rewritten):
- LeRobot v0.5.0 repo, install guide, Docker: github.com/huggingface/lerobot, installation, docker.
- SmolVLA: LeRobot v0.5.0 SmolVLA, lerobot/smolvla_base.
- OpenPI: Physical-Intelligence/openpi.
- ACT: tonyzhaozh/act, arXiv:2304.13705.
- Spirit-v1.5: Spirit-AI-Team/spirit-v1.5, Spirit-AI-robotics/Spirit-v1.5.
- Diffusion Policy: diffusion-policy.cs.columbia.edu, LeRobot diffusion policy.
FAQ
Dataset format vs LeRobot package version
`codebase_version` in `meta/info.json` is the dataset format version, not the Python package version.
| Dataset format | Suggested training stack |
|---|---|
| v2.1 | Often LeRobot v0.3.x; OpenPI/ACT images include compatibility shims. |
| v3.0 | Prefer LeRobot v0.4+; current default is v0.5.x. |
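To see which format a given export uses, read the field directly (a minimal sketch; adjust the path):

```python
import json
from pathlib import Path

info = json.loads(Path("/path/to/lerobot_dataset/meta/info.json").read_text())
print(info["codebase_version"])  # e.g. "v2.1" or "v3.0"
```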
“Dataset not found” in the container
Check bind mounts. Inside the container you must have `/data/input/meta/info.json`. Mounting the parent of the dataset root is a common mistake.
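A quick check from inside the container (a sketch, using the `/data/input` mount point from above):

```python
from pathlib import Path

# False means the bind mount points one level too high or too low.
print(Path("/data/input/meta/info.json").is_file())
```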
Which model to train first
For the fastest closed loop, start with SmolVLA or ACT. Move to Pi0/Pi0.5 or Spirit-v1.5 when you need stronger language-conditioned generalization. Avoid running many large experiments in parallel; fix data, evaluation, and one baseline model first.
How to tell training is useful
Offline loss only shows fit on the training distribution. For robot policies, also measure:
- Success rate under fixed test initial conditions.
- Success across object poses, lighting, and backgrounds.
- Whether motion is continuous at inference, without stalls or limit violations.
- Whether failures cluster on specific scenes or camera views.
When to switch from the image to upstream source
Prefer upstream source when you need to:
- Change data mapping, action space, or robot I/O.
- Change model structure, freezing, or optimizers.
- Reproduce a paper or official benchmark exactly.
- Debug issues below the image wrapper (framework bugs, driver issues).