Model Training
The Embodiflow Data Platform provides comprehensive model training capabilities for robot learning, supporting the end-to-end workflow from data preprocessing to model deployment. The platform integrates mainstream robot learning algorithms, giving researchers and developers an efficient model training environment.
Product Features
Flexible Architecture
The product adopts a layered architecture to ensure system scalability. Training compute can be drawn from multiple sources:
- Private Cloud: Use local data center GPU servers (supports multi-GPU parallel training)
- Public Cloud: On-demand rental of cloud service provider computing resources (billed by actual training duration)
From Data to Model
The platform covers the complete pipeline from data collection, annotation, and export through training and fine-tuning to model deployment.
Supported Model Types
The platform supports mainstream learning models in the robotics field, covering vision-language-action fusion, imitation learning, reinforcement learning, and other technical approaches:
Vision-Language-Action Models
- SmolVLA - Lightweight multimodal model that performs end-to-end learning of natural language instructions, visual perception, and robot actions
- OpenVLA - Large-scale pre-trained vision-language-action model supporting complex scene understanding and operation planning
Imitation Learning Models
- ACT (Action Chunking Transformer) - Transformer-based action chunking model that decomposes continuous action sequences into discrete chunks for learning
- PI0 - Flow-matching-based vision-language-action policy trained from expert demonstration data
- PI0Fast - Variant of PI0 that uses the FAST action tokenization scheme for faster training and inference
Policy Learning Models
- Diffusion Policy - Policy learning based on diffusion processes, generating continuous robot action trajectories through denoising
- VQBET (VQ-BeT) - Vector-Quantized Behavior Transformer that discretizes continuous action spaces and models them with a Transformer
Reinforcement Learning Models
- SAC (Soft Actor-Critic) - Maximum entropy reinforcement learning algorithm that balances exploration and exploitation in continuous action spaces
- TDMPC - Temporal difference model predictive control, combining advantages of model-based planning and model-free learning
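For orientation, the sketch below groups these policies by model family as they might be referenced when configuring a training job. The identifiers are assumptions loosely based on LeRobot-style naming, not the platform's actual option values.

```python
# Hypothetical policy registry grouped by model family.
# Identifiers are illustrative (LeRobot-style names are assumed);
# the platform's actual option names may differ.
SUPPORTED_POLICIES = {
    "smolvla":   "vision-language-action",
    "openvla":   "vision-language-action",
    "act":       "imitation learning",
    "pi0":       "imitation learning",
    "pi0fast":   "imitation learning",
    "diffusion": "policy learning",
    "vqbet":     "policy learning",
    "sac":       "reinforcement learning",
    "tdmpc":     "reinforcement learning",
}

def policies_for_family(family: str) -> list[str]:
    """Return the policy identifiers belonging to the given model family."""
    return [name for name, fam in SUPPORTED_POLICIES.items() if fam == family]
```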
The above models cover mainstream technical approaches and can be applied to various robot tasks, for example:
| Application Scenario | Model Used | Description |
| --- | --- | --- |
| Desktop Organization | SmolVLA, PI0 | Robots can understand natural language instructions like "please organize the items on the desk" and execute grasping, moving, and placing actions |
| Item Sorting | ACT | Through learning expert sorting demonstrations, robots can identify different items and sort them by category |
| Complex Operation Tasks | Diffusion Policy | Robots can learn to execute complex operation sequences requiring precise control, such as assembly and cooking |
| Adaptive Control | SAC and other RL algorithms | Robots can learn optimal control strategies in dynamic environments and adapt to environmental changes |
Training Workflow
The platform provides a productized training workflow, so the complete process from data preparation to model deployment can be carried out through the web interface without any coding:
1. Data Preparation
The platform supports multiple data sources, including:
- Platform Export Data - Use robot demonstration data annotated and exported by the platform
- External Datasets - Import public datasets through URL links
- Local Data Upload - Supports standard formats such as HDF5 and LeRobot
- HuggingFace Datasets - Obtain public datasets directly from the HuggingFace Hub
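As a rough illustration of how these four sources might be described in a training job, here is a minimal sketch; the field names (`source`, `export_id`, `url`, `path`, `repo_id`) are assumptions, not the platform's actual schema.

```python
# Illustrative dataset specifications covering the four supported sources.
# Field names and values are hypothetical placeholders.
dataset_specs = [
    {"source": "platform_export", "export_id": "exp_001"},                      # data annotated and exported on the platform
    {"source": "url", "url": "https://example.com/demos.zip"},                  # external dataset imported via URL
    {"source": "local", "path": "/data/pick_place.hdf5"},                       # local upload in HDF5 / LeRobot format
    {"source": "huggingface", "repo_id": "lerobot/aloha_sim_insertion_human"},  # public dataset on the HuggingFace Hub
]
```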
2. Training Configuration
Computing Resource Selection
- Private Cloud Computing - Use dedicated GPU servers; suitable for long-running training tasks
- Public Cloud Resources - Supports cloud services such as RunPod, AWS, Tencent Cloud, and Alibaba Cloud
- GPU Selection - Real-time display of GPU status, including memory usage, temperature, and utilization
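The memory, temperature, and utilization figures shown in the GPU picker can also be reproduced locally with `nvidia-smi`; the sketch below queries them directly and is independent of the platform itself.

```python
import subprocess

def gpu_status():
    """Query memory usage, temperature, and utilization of local NVIDIA GPUs via nvidia-smi."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used,memory.total,temperature.gpu,utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    stats = []
    for line in out.strip().splitlines():
        idx, used, total, temp, util = (v.strip() for v in line.split(","))
        stats.append({
            "gpu": int(idx),
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
            "temperature_c": int(temp),
            "utilization_pct": int(util),
        })
    return stats
```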
Model Architecture Selection
Choose appropriate models based on specific task requirements:
- For tasks requiring natural language instruction understanding, choose SmolVLA or OpenVLA
- For imitation learning tasks with expert demonstration data, choose ACT, PI0, or PI0Fast
- For tasks requiring online learning, choose SAC or TDMPC
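This guidance can be condensed into a simple rule of thumb. The helper below is purely illustrative and reuses the hypothetical policy identifiers from the earlier sketch.

```python
def suggest_policies(needs_language: bool, has_demonstrations: bool, online_learning: bool) -> list[str]:
    """Shortlist candidate policies following the guidance above (illustrative only)."""
    if needs_language:
        return ["smolvla", "openvla"]        # natural language instruction understanding
    if online_learning:
        return ["sac", "tdmpc"]              # tasks that require online learning
    if has_demonstrations:
        return ["act", "pi0", "pi0fast"]     # imitation learning from expert demonstrations
    return ["diffusion", "vqbet"]            # general policy learning fallback
```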
Training Parameter Settings
- Basic Parameters - batch_size, training steps, random seed, etc.
- Optimization Parameters - learning rate, optimizer type, learning rate scheduling strategy
- Model Parameters - Model-specific parameters such as ACT's chunk_size and the number of observation steps
- Monitoring Parameters - evaluation frequency, log frequency, checkpoint saving strategy
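A configuration covering these four parameter groups might look like the sketch below; the keys mirror the examples listed above (batch size, steps, seed, learning rate, chunk_size, and so on) but are not the platform's exact schema.

```python
# Illustrative training configuration; keys follow common conventions
# and the parameter examples above, not the platform's exact schema.
train_config = {
    # Basic parameters
    "batch_size": 64,
    "steps": 100_000,
    "seed": 42,
    # Optimization parameters
    "lr": 1e-4,
    "optimizer": "adamw",
    "lr_scheduler": "cosine",
    # Model parameters (ACT-specific example)
    "policy": "act",
    "chunk_size": 100,
    "n_obs_steps": 1,
    # Monitoring parameters
    "eval_freq": 5_000,
    "log_freq": 200,
    "save_freq": 10_000,
}
```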
3. Training Monitoring and Management
After training starts, the platform provides complete monitoring and management functions:
Real-time Monitoring
- Training Metrics - Real-time visualization of key metrics such as training loss, validation accuracy, and learning rate
- Model Output - Prediction samples generated during training, for observing the model's learning progress
- System Logs - Detailed training logs and error information for troubleshooting
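If you prefer to inspect training progress outside the web UI, a small script can summarize recent metrics. The sketch below assumes a JSON-lines log with `step`, `loss`, and `lr` fields per record, which is an assumption about the log format rather than a documented one.

```python
import json

def tail_metrics(log_path: str, n: int = 5) -> None:
    """Print the last n records of a JSON-lines training log (assumed fields: step, loss, lr)."""
    with open(log_path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    for rec in records[-n:]:
        print(f"step {rec['step']:>7}  loss {rec['loss']:.4f}  lr {rec['lr']:.2e}")
```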
Training Management
- Process Control - Pause, resume, and stop training tasks
- Checkpoint Management - Automatic saving of model checkpoints, with support for resumed training and version rollback
- Parameter Adjustment - Online adjustment of key parameters like learning rate
- Task Replication - Quickly create new tasks based on successful training configurations
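Checkpoint-based resumption works the same way it does in any PyTorch training loop; the sketch below shows the underlying idea, assuming the trained policy is a standard `torch.nn.Module`.

```python
import torch

def save_checkpoint(path: str, model: torch.nn.Module, optimizer: torch.optim.Optimizer, step: int) -> None:
    """Persist model/optimizer state plus the current step so training can resume later."""
    torch.save(
        {"step": step, "model_state": model.state_dict(), "optimizer_state": optimizer.state_dict()},
        path,
    )

def resume_checkpoint(path: str, model: torch.nn.Module, optimizer: torch.optim.Optimizer) -> int:
    """Restore a saved checkpoint and return the step to resume from."""
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["step"]
```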
4. Model Evaluation and Export
After training completes, the platform provides model export and one-click inference deployment.
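As a preview of what on-robot inference looks like, the sketch below loads an exported model and runs a single prediction step. It assumes the export is a TorchScript file; the actual export format and deployment API are covered in the next chapter.

```python
import torch

# Assumes the exported policy is a TorchScript file; the real export format
# may differ and is configured during the export step.
policy = torch.jit.load("exported_policy.pt")
policy.eval()

def control_step(observation: torch.Tensor) -> torch.Tensor:
    """Run one inference step: observation in, predicted action out."""
    with torch.no_grad():
        return policy(observation)
```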
With this, you can conveniently train your specialized models on the Embodiflow Data Platform; model deployment and real-robot inference are covered in the next chapter.