Pi0 and Pi0.5 Model Fine-tuning: Official Workflow Based on OpenPI

Pi0 and Pi0.5 are Vision-Language-Action (VLA) models developed by Physical Intelligence. If you are planning to use data exported from the EmbodyFlow platform to fine-tune these models, this guide will walk you through the complete process based on the official OpenPI framework.

Why choose OpenPI instead of the LeRobot framework?

While LeRobot supports several mainstream models, including Pi0, we strongly recommend the official OpenPI training framework for the Pi0 series. OpenPI is built on JAX, natively supports high-performance multi-GPU training, and is better able to unleash the full potential of Pi0.


1. Preparation: Exporting and Placing Data

First, we need to convert the annotated data on the EmbodyFlow platform into a format that OpenPI can recognize.

Export Process

  1. Format Selection: On the export page, select the LeRobot v2.1 standard format.
  2. Local Extraction: Download the generated .tar.gz file and extract it.
  3. Directory Convention: So that OpenPI can locate the data, move it into the Hugging Face local cache directory (a sanity-check sketch follows this list). For example:
    # Prepare directory
    mkdir -p ~/.cache/huggingface/lerobot/local/mylerobot

    # Move the extracted files (containing meta/, data/, etc.) into it
    mv /path/to/extracted/data/* ~/.cache/huggingface/lerobot/local/mylerobot/
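
If you want to sanity-check the layout before training, a few lines of Python are enough. This is a minimal sketch that assumes the standard LeRobot v2.1 layout (a meta/ directory plus data/, and videos/ for video datasets); adjust the expected entries if your export differs:

import sys
from pathlib import Path

# Assumed dataset location, matching the mkdir/mv commands above.
root = Path.home() / ".cache/huggingface/lerobot/local/mylerobot"

# Expected top-level entries for a standard LeRobot v2.1 dataset.
missing = [name for name in ("meta", "data") if not (root / name).exists()]
if missing:
    sys.exit(f"Dataset layout looks wrong, missing: {missing}")
print(f"Found a plausible LeRobot dataset at {root}")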

Field Mapping Reference (using the three-camera ALOHA setup as an example)

In the subsequent configuration, you need to make sure that the keys in the code align with the fields in your data. By default, we recommend the following mapping (a sketch of the corresponding LeRobot column names follows this list):

  • cam_high: Top view
  • cam_left_wrist: Left wrist view
  • cam_right_wrist: Right wrist view
  • state: Current robot state
  • action: Target action (note: ALOHA is 14-dimensional in the default OpenPI configuration; if your data has a different dimensionality, be sure to read the "Technical Pitfall" section below).
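
For reference, these keys typically correspond to LeRobot column names like the ones below. This mapping is illustrative and assumes the EmbodyFlow export uses the standard observation.*/action column names (the same names used by the RepackTransform in Section 3); verify them against the meta files of your exported dataset:

# Assumed LeRobot v2.1 column name -> key expected by the ALOHA policy.
# Verify these against meta/info.json in your exported dataset.
FIELD_MAPPING = {
    "observation.images.cam_high": "cam_high",                # top view
    "observation.images.cam_left_wrist": "cam_left_wrist",    # left wrist view
    "observation.images.cam_right_wrist": "cam_right_wrist",  # right wrist view
    "observation.state": "state",                             # current robot state
    "action": "actions",                                      # target action
}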

2. Core: How to Choose the Right Training Configuration?

OpenPI's training logic is highly templated. When choosing a configuration, you are essentially selecting a "policy template closest to your robot" and then fine-tuning it.

| Your Requirement Scenario | Recommended Path | Key Focus |
| --- | --- | --- |
| Quick validation / pipeline troubleshooting | Simulation path (LIBERO / ALOHA Sim) | Focus on quickly aligning inputs/outputs, at the lowest cost. |
| Real-robot deployment (dual-arm ALOHA) | ALOHA Real | Must align camera keys, action dimensionality, and gripper control logic. |
| Single-arm / industrial robot | Refer to the UR5 example | First solve control-interface compatibility, then tune for training effectiveness. |
| Pursuing maximum generalization | Align with the DROID data setup | Learn DROID's normalization (norm stats) strategy. |

Simply put: if this is your first time, use a simulation configuration to debug the pipeline; if you want to deploy on a real robot, choose ALOHA Real and strictly align the state/action dimensions.


Technical Pitfall: 14-dim vs 16-dim Action Vectors

This is an easy pitfall to overlook: OpenPI's default ALOHA policy (aloha_policy.py) hard-codes a 14-dimensional action structure:

  • Default Structure: [left_arm_6_joints, left_gripper_1, right_arm_6_joints, right_gripper_1] = 14 dimensions.
  • Common Problem: If you are using 7-axis arms, the layout becomes [7, 1, 7, 1] and the total dimension is 16. If the code is not modified, the extra dimensions are silently truncated, and the trained model completely loses control of the dropped joints.

Modification Suggestions (a code sketch follows this list):

  1. Check your action vector definition.
  2. In aloha_policy.py, change all :14 slices to your actual dimension (e.g., :16).
  3. Simultaneously modify the length of _joint_flip_mask to ensure the sign reversal logic matches your hardware.
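
To make steps 2 and 3 concrete, here is a hypothetical sketch of the shape of the change. The function names, slice sites, and mask values below are illustrative; the real ones live in aloha_policy.py in your checkout, and which joints need sign flips depends on your hardware:

import numpy as np

ACTION_DIM = 16  # was 14: now [7 left joints, 1 gripper, 7 right joints, 1 gripper]

def _joint_flip_mask() -> np.ndarray:
    # Hypothetical 16-entry mask: the stock mask has 14 entries, so it must
    # be extended by one joint per arm. Verify the signs against your robot.
    return np.array([1, -1, -1, 1, 1, 1, 1, 1,
                     1, -1, -1, 1, 1, 1, 1, 1])

def decode_actions(raw: np.ndarray) -> np.ndarray:
    # Before: raw[..., :14] silently dropped dimensions 14 and 15.
    # After: slice to the true action dimensionality.
    actions = raw[..., :ACTION_DIM]
    return _joint_flip_mask() * actions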

3. Writing Training Configuration

Now, we need to define your fine-tuning task in openpi/src/openpi/training/config.py.

# Example: adding a custom configuration for your robot
TrainConfig(
    name="pi0_aloha_mylerobot",
    model=pi0_config.Pi0Config(),
    data=LeRobotAlohaDataConfig(
        repo_id="local/mylerobot",  # points to the directory where the data was placed earlier
        assets=AssetsConfig(
            assets_dir="/home/user/code/openpi/assets/pi0_aloha_mylerobot",
        ),
        default_prompt="fold the clothes",  # task description, very important
        repack_transforms=_transforms.Group(
            inputs=[
                _transforms.RepackTransform(
                    {
                        "images": {
                            "cam_high": "observation.images.cam_high",
                            "cam_left_wrist": "observation.images.cam_left_wrist",
                            "cam_right_wrist": "observation.images.cam_right_wrist",
                        },
                        "state": "observation.state",
                        "actions": "action",
                    }
                )
            ],
        ),
    ),
    weight_loader=weight_loaders.CheckpointWeightLoader("gs://openpi-assets/checkpoints/pi0_base/params"),
    num_train_steps=20_000,
)
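
Note that the name field ("pi0_aloha_mylerobot") is the identifier you will pass to compute_norm_stats.py, train.py, and serve_policy.py below. In current OpenPI versions, TrainConfig entries like this one are registered in a config list inside the same config.py file; add yours there so the scripts can find it by name.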

4. Starting Training: Computing Statistics and Running

Before starting training, be sure to compute the normalization statistics; otherwise the value ranges the model receives will be uncalibrated.

Step 1: Compute Norm Stats

uv run scripts/compute_norm_stats.py --config-name pi0_aloha_mylerobot
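
Conceptually, these statistics are used to standardize states and actions before they reach the model. The snippet below is only an illustration of the idea, not OpenPI's actual implementation:

import numpy as np

# compute_norm_stats.py produces per-dimension mean/std over the whole
# dataset; training then standardizes each state/action vector so that
# all dimensions are on a comparable scale.
def normalize(x: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    return (x - mean) / (std + 1e-6)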

Step 2: Start Fine-tuning

We recommend using JAX mode for maximum performance.

Single GPU Mode:

export XLA_PYTHON_CLIENT_MEM_FRACTION=0.9
CUDA_VISIBLE_DEVICES=0 uv run scripts/train.py pi0_aloha_mylerobot \
--exp-name=my_first_experiment \
--overwrite

Multi-GPU Parallel (FSDP):

uv run scripts/train.py pi0_aloha_mylerobot --exp-name=multi_gpu_run --fsdp-devices 4

5. Inference and Deployment

After fine-tuning is complete, you can start the policy server to let the robot "run."

# Start inference server, default port 8000
uv run scripts/serve_policy.py policy:checkpoint \
--policy.config=pi0_aloha_mylerobot \
--policy.dir=experiments/my_first_experiment/checkpoints/last
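
On the robot side, the openpi_client package can talk to this server over WebSocket. Below is a minimal client sketch; the host/port and the observation keys and shapes are assumptions for a local three-camera ALOHA setup and must match your policy's input transforms:

import numpy as np
from openpi_client import websocket_client_policy

# Connect to the policy server started above (assumed to run locally on port 8000).
policy = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)

# Hypothetical observation; keys and shapes must match your policy's inputs.
observation = {
    "images": {
        "cam_high": np.zeros((224, 224, 3), dtype=np.uint8),
        "cam_left_wrist": np.zeros((224, 224, 3), dtype=np.uint8),
        "cam_right_wrist": np.zeros((224, 224, 3), dtype=np.uint8),
    },
    "state": np.zeros(14, dtype=np.float32),
    "prompt": "fold the clothes",
}

# The server returns a chunk of future actions to execute on the robot.
action_chunk = policy.infer(observation)["actions"]
print(np.asarray(action_chunk).shape)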

6. FAQ

  • Q: Out of memory (OOM)?
    • Decrease the batch size, or check that XLA_PYTHON_CLIENT_MEM_FRACTION is set correctly. Multi-GPU FSDP is also an effective way to relieve VRAM pressure.
  • Q: The model's movements are strange, or some joints don't move at all?
    • Check whether the mapping in RepackTransform is correct.
    • Revisit the "Technical Pitfall: 14-dim vs 16-dim" section above and check whether action dimensions are being silently truncated.
  • Q: The training loss doesn't decrease at all?
    • Check whether default_prompt accurately describes the task.
    • Confirm that the statistics file generated by compute_norm_stats is actually being used.

Further Resources