LeRobot datasets and training

LeRobot is Hugging Face’s framework for robot learning data and policy training. On the IO-AI data platform, you can export annotated robot data as reusable LeRobot datasets, then fine-tune policies on them with LeRobot, OpenPI, ACT, Spirit-v1.5, or other stacks.

This page covers export and validation, then a training overview. Before using the official LeRobot training stack, read Install LeRobot. Model-specific commands are in the dedicated guides.

Exporting data

The platform can export annotated data in LeRobot format. Before export, make sure the following are stable:

  • observation.images.*: camera images or video; keep camera names consistent across the project.
  • observation.state: robot state, usually joint or end-effector state.
  • action: control commands, strictly time-aligned with observation.state.
  • task: natural-language task description. VLA models usually depend on this field; if it is missing, some training images allow a --prompt fallback.
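
A quick way to confirm a single dataset frame exposes these fields is a key check. This is a minimal sketch under the field names listed above; the `missing_fields` helper itself is hypothetical, not part of LeRobot:

```python
# Sketch: report which required LeRobot fields are absent from one dataset frame.
# Field names mirror the list above; the helper itself is illustrative.
REQUIRED_KEYS = ("observation.state", "action", "task")

def missing_fields(frame: dict) -> list:
    """Return the required fields that a single frame does not expose."""
    missing = [key for key in REQUIRED_KEYS if key not in frame]
    # At least one camera stream is expected under the observation.images.* prefix.
    if not any(key.startswith("observation.images.") for key in frame):
        missing.append("observation.images.*")
    return missing
```

Running this over the first frame of each episode catches missing `task` strings before a VLA fine-tune silently degrades.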

Export settings

Suggested export parameters by training goal:

| Parameter | Recommendation |
| --- | --- |
| Sampling rate | Keep the robot control rate first; 10–30 Hz is common. Avoid aggressive downsampling just to shrink size. |
| Image format | Prefer MP4 for large-scale training; use JPG temporarily when debugging image quality. |
| Strict I/O match | Enable to reduce misalignment between observations, language, and actions. |
| Face blur | Enable when people appear on camera and data will be distributed externally. |

After export, download the .tar.gz archive and extract it into a dedicated folder. Training paths such as /path/to/lerobot_dataset must point at the directory that contains meta/info.json, not at its parent.

your_dataset/
├── meta/
│   └── info.json
├── data/
└── videos/
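
The wrong-root mistake is cheap to catch before training starts. A sketch, where `check_root` is a hypothetical helper and the path is a placeholder:

```shell
# Sketch: confirm a candidate training root directly contains meta/info.json.
check_root() {
  if [ -f "$1/meta/info.json" ]; then
    echo "layout OK"
  else
    echo "wrong root: $1 must directly contain meta/info.json"
  fi
}

check_root /path/to/lerobot_dataset
```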

Topic mapping

The platform infers state and action fields from ROS/ROS2 topic names:

  • Topics ending with /joint_state or /joint_states: the position field is written to observation.state.
  • Topics ending with /joint_cmd or /joint_command: the position field is written to action.

If your project uses different names, normalize them at recording time. If existing data cannot be renamed, plan a mapping layer early—do not wait until training fails.
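
One way to plan such a mapping layer is a small resolver: suffix rules mirroring the platform's inference above, plus explicit overrides for topics that cannot be renamed. A sketch; the `CUSTOM_MAP` topic names are hypothetical examples:

```python
# Sketch of a topic-to-feature mapping layer, mirroring the suffix rules above.
def feature_for_topic(topic: str):
    """Infer the LeRobot feature a ROS/ROS2 topic feeds, or None if unmapped."""
    if topic.endswith(("/joint_state", "/joint_states")):
        return "observation.state"
    if topic.endswith(("/joint_cmd", "/joint_command")):
        return "action"
    return None

# Explicit overrides for recordings that cannot be renamed (topic names are hypothetical).
CUSTOM_MAP = {
    "/arm/measured_js": "observation.state",
    "/arm/commanded_js": "action",
}

def resolve(topic: str):
    """Check explicit overrides first, then fall back to the suffix rules."""
    return CUSTOM_MAP.get(topic) or feature_for_topic(topic)
```

Keeping the overrides in one table makes it obvious which recordings deviate from the convention, instead of discovering them when training fails.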

Data validation

Validate at least once before training. Prefer LeRobot Studio to open the archive or extracted folder; it checks meta/info.json, features, episodes, tabular files, and videos for basic consistency.

For scripted local checks, load the dataset with LeRobot:

from lerobot.datasets import LeRobotDataset

# repo_id is a local identifier here; root must be the directory containing meta/info.json
dataset = LeRobotDataset("local/my_dataset", root="/path/to/lerobot_dataset")
print(dataset.num_frames, dataset.num_episodes)

Successful loading does not guarantee the data is ideal for training, but it rules out wrong directory layout, missing metadata, and missing files.

Training paths

These guides follow one principle: official source code and model cards define behavior and limits, while container images exist to reproduce the pipeline quickly. In practice, start with IO-AI’s published ioaitech/* images to close the training loop, then switch to upstream source when you need deeper changes.

| Model | Recommended entry | Typical use | Guide |
| --- | --- | --- | --- |
| Pi0 / Pi0.5 | ioaitech/train_openpi:pi0, ioaitech/train_openpi:pi05 | Large VLA fine-tuning; OpenPI base weights and JAX/FSDP/LoRA stack | Pi0 and Pi0.5 |
| SmolVLA | ioaitech/lerobot-gpu:v0.5.0 | Single-GPU baseline; official LeRobot v0.5.0 commands | SmolVLA |
| ACT | ioaitech/train_act:cuda | Single-task imitation; LeRobot data bridged to ACT/HDF5 | ACT |
| Spirit-v1.5 | ioaitech/train_spirit:1.5 | Frontier VLA; RoboChallenge layout or convertible LeRobot data | Spirit-v1.5 |
| Diffusion Policy | ioaitech/lerobot-gpu:v0.5.0 or upstream source | Smooth continuous action trajectories | Diffusion Policy |

Pre-flight checks

All GPU images need the NVIDIA Container Toolkit. Confirm the GPU is visible inside a container:

docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

Suggested host layout:

workspace/
├── dataset/   # extracted LeRobot dataset
└── output/    # checkpoints, logs, manifests

LeRobot v0.5.0 training entry point

After installing LeRobot v0.5.0, use the lerobot-train CLI. All LeRobot-framework training in these docs assumes that command.

Minimal example:

lerobot-train \
  --dataset.repo_id=local/my_dataset \
  --dataset.root=/path/to/lerobot_dataset \
  --policy.type=act \
  --output_dir=outputs/train/act_baseline \
  --job_name=act_baseline \
  --policy.device=cuda

Fine-tuning from a pretrained policy uses --policy.path, for example SmolVLA:

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=local/my_dataset \
  --dataset.root=/path/to/lerobot_dataset \
  --output_dir=outputs/train/smolvla_finetune \
  --job_name=smolvla_finetune \
  --policy.device=cuda

Official references

Technical claims in these pages are grounded in the following (URLs were checked when the docs were rewritten):

FAQ

Dataset format vs LeRobot package version

codebase_version in meta/info.json is the dataset format version, not the Python package version.

| Dataset format | Suggested training stack |
| --- | --- |
| v2.1 | Often LeRobot v0.3.x; OpenPI/ACT images include compatibility shims. |
| v3.0 | Prefer LeRobot v0.4+; current default is v0.5.x. |

“Dataset not found” in the container

Check bind mounts. Inside the container you must have /data/input/meta/info.json. Mounting the parent of the dataset root is a common mistake.
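
A mount that satisfies that requirement looks like the following. This is a sketch: host paths and the final `ls` probe are placeholders, and /data/input is the in-container path these docs assume:

```shell
# Sketch: bind-mount the dataset root itself (not its parent) at /data/input.
docker run --rm --gpus all \
  -v /host/workspace/dataset/your_dataset:/data/input \
  -v /host/workspace/output:/data/output \
  ioaitech/lerobot-gpu:v0.5.0 \
  ls /data/input/meta/info.json
```

If the `ls` probe reports "No such file or directory", the left-hand side of the `-v` flag points one level too high.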

Which model to train first

For the fastest closed loop, start with SmolVLA or ACT. Move to Pi0/Pi0.5 or Spirit-v1.5 when you need stronger language-conditioned generalization. Avoid running many large experiments in parallel; fix data, evaluation, and one baseline model first.

How to tell training is useful

Offline loss only shows fit on the training distribution. For robot policies, also measure:

  • Success rate under fixed test initial conditions.
  • Success across object poses, lighting, and backgrounds.
  • Whether motion is continuous at inference, without stalls or limit violations.
  • Whether failures cluster on specific scenes or camera views.

When to switch from the image to upstream source

Prefer upstream source when you need to:

  • Change data mapping, action space, or robot I/O.
  • Change model structure, freezing, or optimizers.
  • Reproduce a paper or official benchmark exactly.
  • Debug issues below the image wrapper (framework bugs, driver issues).