
Pi0 and Pi0.5 fine-tuning

Pi0 and Pi0.5 come from Physical Intelligence’s OpenPI project. The official repo, Physical-Intelligence/openpi, ships pi0_base, pi05_base, and related checkpoints, and treats LeRobot datasets as the main entry point for custom fine-tuning. Pi0.5’s current training and inference path in that repo focuses on the flow-matching head; if your task needs fuller hierarchical planning, confirm upstream support before committing.

IO-AI publishes ioaitech/train_openpi:pi0 and ioaitech/train_openpi:pi05 Docker images built on the OpenPI training stack. They bundle the OpenPI runtime, base-weight download logic, normalization-stat computation, and a LeRobot data adapter. Start with an image to close the fine-tuning loop, then return to upstream OpenPI when you need custom data transforms or configs.

When to use

Pi0/Pi0.5 fit high-quality multimodal data and large-model fine-tuning. Compared with ACT or Diffusion Policy, the stack is heavier: JAX, FSDP, LoRA, base weights, normalization stats, and language task fields must all be configured correctly.

| Model | Image | Base checkpoint | Typical use |
| --- | --- | --- | --- |
| Pi0 | ioaitech/train_openpi:pi0 | gs://openpi-assets/checkpoints/pi0_base | First OpenPI fine-tuning runs; moderate language generalization needs. |
| Pi0.5 | ioaitech/train_openpi:pi05 | gs://openpi-assets/checkpoints/pi05_base | Stronger open-world generalization when you have data and compute. |

Data requirements

The image entrypoint checks /data/input/meta/info.json. The dataset root should look like:

your_dataset/
├── meta/
│   └── info.json
├── data/
└── videos/
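A quick host-side pre-flight can confirm the one file the entrypoint actually checks for. This is a sketch under the assumption that meta/info.json at the dataset root is the hard requirement; the directory name and the empty JSON are placeholders, not real data:

```shell
# Build a minimal skeleton and verify meta/info.json exists at the root.
# ./your_dataset is illustrative; a real dataset has a full LeRobot info.json.
DATASET=./your_dataset
mkdir -p "$DATASET/meta" "$DATASET/data" "$DATASET/videos"
echo '{}' > "$DATASET/meta/info.json"
[ -f "$DATASET/meta/info.json" ] && echo "layout OK"
```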

The wrapper typically:

  • Reads the LeRobot dataset schema.
  • Symlinks the dataset into the in-container LeRobot cache.
  • For v2.1 data, ensures episodes_stats.jsonl exists when missing.
  • Works around some Hugging Face parquet metadata edge cases.
  • Computes OpenPI normalization statistics.
  • Picks single-GPU LoRA vs multi-GPU FSDP based on GPU count.
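The last step above can be restated as a small shell sketch. This mirrors the documented defaults (one GPU → LoRA, several GPUs → FSDP) and is hypothetical logic, not the wrapper’s actual code:

```shell
# GPU_COUNT is hard-coded here for illustration; the wrapper derives it
# from the visible devices.
GPU_COUNT=1
if [ "$GPU_COUNT" -gt 1 ]; then
  MODE=fsdp; FSDP_DEVICES=$GPU_COUNT
else
  MODE=lora; FSDP_DEVICES=1
fi
echo "mode=$MODE fsdp_devices=$FSDP_DEVICES"   # mode=lora fsdp_devices=1
```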

Task text comes from the dataset when present; --prompt is only a fallback when the dataset has no task strings.
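That precedence reads as dataset-task-first, with --prompt as the fallback. A minimal sketch of the rule, assuming an empty dataset task field triggers the fallback (assumed behavior, not the wrapper’s source):

```shell
# Empty DATASET_TASK stands in for a dataset with no task strings.
DATASET_TASK=""
PROMPT="pick up the block"        # would come from --prompt
TASK=${DATASET_TASK:-$PROMPT}     # a non-empty dataset task would win
echo "task: $TASK"                # task: pick up the block
```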

Training with the image

Confirm the GPU is visible in a container:

docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

Smoke test

This only checks mounts, data loading, and the training entrypoint:

docker run --rm --gpus all \
-v /path/to/lerobot_dataset:/data/input \
-v /path/to/output:/data/output \
ioaitech/train_openpi:pi0 \
--batch_size 1 \
--steps 1000 \
--save_interval 200

Pi0 fine-tuning

docker run --rm --gpus all \
-v /path/to/lerobot_dataset:/data/input \
-v /path/to/output:/data/output \
ioaitech/train_openpi:pi0 \
--batch_size 4 \
--steps 20000 \
--learning_rate 2.5e-5 \
--save_interval 1000 \
--action_horizon 50

Pi0.5 fine-tuning

docker run --rm --gpus all \
-v /path/to/lerobot_dataset:/data/input \
-v /path/to/output:/data/output \
ioaitech/train_openpi:pi05 \
--batch_size 8 \
--steps 30000 \
--learning_rate 2.5e-5 \
--save_interval 1000 \
--action_horizon 50

Multi-GPU example:

docker run --rm --gpus all \
-v /path/to/lerobot_dataset:/data/input \
-v /path/to/output:/data/output \
ioaitech/train_openpi:pi05 \
--gpus 0,1,2,3 \
--batch_size 16 \
--steps 30000 \
--fsdp_devices 4 \
--save_interval 1000

Default output layout:

/path/to/output/docker_train/train/

This matches the wrapper’s TrainConfig(name="docker_train", exp_name="train").

Common flags

| Flag | Default | Meaning |
| --- | --- | --- |
| --batch_size | 1 | Global batch size; adjusted to divide evenly by the JAX device count. |
| --steps | 1000 | Training steps. |
| --gpus | all | Use all GPUs, or e.g. 0,1. |
| --prompt | empty | Default instruction when the dataset lacks task text. |
| --save_interval | 500 | Checkpoint save interval. |
| --learning_rate | 2.5e-5 | Peak LR when not overridden. |
| --fsdp_devices | auto | Multi-GPU uses the visible GPU count; single GPU uses 1. |
| --lora | auto | LoRA on by default for single GPU; off by default for multi-GPU FSDP. |
| --ema_decay | empty | EMA is disabled under LoRA to save memory. |
| --action_horizon | 50 | Action sequence length. |
| --num_workers | 8 | DataLoader workers. |
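The --batch_size adjustment is plain arithmetic; rounding the requested global batch down to a multiple of the device count is an assumption about the exact rule, shown here for intuition:

```shell
# Round the requested global batch down to a multiple of the JAX device count.
BATCH=14; DEVICES=4
ADJUSTED=$(( BATCH / DEVICES * DEVICES ))
echo "$ADJUSTED"   # 12
```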

Reproduce from upstream OpenPI

When you need custom data mapping, model config, or training logic, use upstream OpenPI. The upstream README flow is: install uv, sync deps, define or edit a train config, compute norm stats, then train.

git clone --recurse-submodules https://github.com/Physical-Intelligence/openpi.git
cd openpi

GIT_LFS_SKIP_SMUDGE=1 uv sync
GIT_LFS_SKIP_SMUDGE=1 uv pip install -e .

Official LIBERO-style example:

uv run scripts/compute_norm_stats.py --config-name pi05_libero

XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 \
uv run scripts/train.py pi05_libero \
--exp-name=my_experiment \
--overwrite

For a custom robot, edit src/openpi/training/config.py (data configs and TrainConfig) and implement the required input/output transforms. The IO-AI images ship a generic LeRobot adapter to bootstrap training; deeper mapping changes belong in your own upstream fork or config.

Reproducibility notes

  • Base weights are baked under /models/openpi-assets/checkpoints/*_base/params at image build time.
  • W&B and Hugging Face Hub access are disabled by default for offline-friendly runs.
  • Single-GPU defaults to LoRA; multi-GPU defaults to FSDP. Record --lora, --fsdp_devices, --batch_size, and seeds when comparing runs.
  • Pi0.5’s upstream README currently emphasizes the flow-matching head; do not assume every internal training stage is exposed.

FAQ

“No LeRobot dataset” at startup

Check the host bind mount to /data/input and that /data/input/meta/info.json exists. Extra directory nesting is the usual issue.
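The nesting failure is easy to reproduce host-side; everything below is illustrative scratch data, with the dataset one level too deep under the directory that would be bound to /data/input:

```shell
# Wrong layout: info.json is not at <mount>/meta/info.json, so the
# entrypoint's /data/input/meta/info.json check would fail.
MOUNT=$(mktemp -d)
mkdir -p "$MOUNT/your_dataset/meta"
echo '{}' > "$MOUNT/your_dataset/meta/info.json"
if [ -f "$MOUNT/meta/info.json" ]; then
  echo "mount root OK"
else
  echo "info.json is nested too deep:"
  find "$MOUNT" -name info.json    # reveals the extra your_dataset/ level
fi
```

Binding "$MOUNT/your_dataset" instead of "$MOUNT" makes the check pass.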

Single-GPU and multi-GPU results differ

Defaults differ (LoRA vs FSDP). Hold --lora, --fsdp_devices, --batch_size, and seeds fixed when isolating model quality.

Manual norm stats

Not required for ioaitech/train_openpi:*; the wrapper computes them. For upstream OpenPI, run scripts/compute_norm_stats.py first.

Output path looks fixed

The wrapper pins name=docker_train and exp_name=train. Disambiguate experiments on the host path, e.g. /outputs/pi05_pick_block_v1.

References