
Model Inference

The Embodiflow Data Platform provides comprehensive model inference services, supporting one-click deployment of trained robot learning models as production-grade inference endpoints. With support for multiple model formats and flexible deployment methods, the platform delivers AI inference capabilities for robot applications across every scenario, from cloud to edge.

Product Features

The platform provides a complete pipeline from model training to inference deployment, supporting multiple inference validation and deployment methods:

| Inference Method | Application Scenario | Description |
| --- | --- | --- |
| Simulation Inference Test | Quick Validation | Use random data or custom inputs to quickly verify model inference functionality and performance |
| MCAP File Test | Real Data Validation | Use recorded robot demonstration data to verify model inference behavior in real scenarios |
| Offline Edge Deployment | Production Environment Application | Deploy the inference service to the robot's local GPU for low-latency real-time control |

Inference Workflow

The platform provides a productized inference deployment workflow, covering every step from model selection to production deployment through the visual interface, with no programming experience required:

1. Model Source Selection

The platform supports multiple model sources:

  • Use Fine-tuned Model - Select a model trained on the platform; its training configuration is inherited automatically
  • Upload Custom Model - Upload models in mainstream formats such as SafeTensors, PyTorch, and ONNX
  • Use Pre-trained Model - Start quickly from a validated base model provided by the platform

The new inference service page provides multiple model deployment options

2. Service Configuration and Deployment

Once deployment completes, each inference service provides comprehensive status monitoring:

Service Information

  • Host Address and Port - Inference API access address
  • WebSocket Connection - Real-time inference connection information
  • Resource Usage - Real-time monitoring of CPU and memory usage
  • Container Status - Docker container running status and ID
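
For illustration, the sketch below uses these connection details to open the WebSocket and send a probe message. The host, port, path, and message schema here are placeholder assumptions, not documented values; read the real ones from the service detail page.

```python
# Open the WebSocket listed on the service detail page and exchange one
# message. The URI and message format are illustrative placeholders.
import asyncio
import json

import websockets  # pip install websockets


async def probe_service() -> None:
    uri = "ws://192.168.1.10:8080/ws"  # hypothetical host/port/path
    async with websockets.connect(uri) as ws:
        await ws.send(json.dumps({"type": "status"}))  # hypothetical probe message
        reply = json.loads(await ws.recv())
        print("service replied:", reply)


if __name__ == "__main__":
    asyncio.run(probe_service())
```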

The inference service detail page shows service status and configuration information

Model Input/Output Specifications

Inference services automatically recognize and adapt to each model's input/output requirements:

  • Image Input - Adapts to the camera count (one or more views) and resolution (automatic scaling)
  • State Input - A 12-dimensional joint state (observation.state [12]), a 2-dimensional gripper state (observation.gripper [2]), and a 1-dimensional task score (observation.score [1])
  • Action Output - A 12-dimensional vector of robot joint control commands (action [12])

Info: The specifications above define the complete input/output contract of an inference service; requests must match these shapes for inference calls to succeed. A payload sketch matching them follows.
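
To make the shape contract concrete, here is a sketch of an observation payload matching the dimensions above. The field names mirror the spec, but the JSON wrapper and image encoding are assumptions rather than a documented wire format:

```python
# Build an observation matching the documented shapes:
# observation.state [12], observation.gripper [2], observation.score [1];
# the service should return action [12] in response.
import base64
from pathlib import Path

observation = {
    "observation.state": [0.0] * 12,   # 12 joint positions
    "observation.gripper": [0.0] * 2,  # 2 gripper values
    "observation.score": [0.0],        # 1 task score value
    # One or more camera views; the service scales resolution automatically.
    # Base64 encoding of the image bytes is an assumption.
    "observation.images": [base64.b64encode(Path("cam0.jpg").read_bytes()).decode()],
}

# A well-formed response should carry 12 joint control commands:
# action = response["action"]; assert len(action) == 12
```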

Inference Testing Features

Simulation Inference Test

The simulation inference page supports random data generation and inference testing

Simulation inference provides a convenient way to validate an inference service (a scripted equivalent is sketched after this list):

  • Natural Language Tasks - Enter a robot execution instruction such as "Pick up the apple and place it in the basket"
  • Intelligent Data Generation - Randomly populate the test inputs, including image files and joint state values, with one click
  • Instant Inference Execution - Click the send button to get model inference results immediately
  • Performance Indicator Display - Real-time display of key metrics such as request time (e.g., 2479 ms) and inference time (e.g., 2356 ms)
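
The scripted equivalent of the random-fill test might look like the sketch below. It assumes a hypothetical REST endpoint (POST /infer) and measures the client-side round trip the way the page reports request time:

```python
# Send one randomly generated observation and time the round trip,
# mirroring the "request time" metric on the simulation page.
# The /infer endpoint and payload layout are assumptions.
import time

import numpy as np
import requests

payload = {
    "task": "Pick up the apple and place it in the basket",
    "observation.state": np.random.uniform(-1, 1, 12).tolist(),
    "observation.gripper": np.random.uniform(0, 1, 2).tolist(),
    "observation.score": [0.0],
}

start = time.perf_counter()
resp = requests.post("http://192.168.1.10:8080/infer", json=payload, timeout=30)
elapsed_ms = (time.perf_counter() - start) * 1000

resp.raise_for_status()
print(f"request time: {elapsed_ms:.0f} ms")
print("action:", resp.json().get("action"))  # expected: 12 joint commands
```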

MCAP File Test

The MCAP file test page supports inference validation with real data

MCAP file testing validates inference against real robot demonstration data (a replay sketch follows this list):

  • Data File Upload - Select an MCAP file containing a complete robot operation sequence
  • Intelligent Data Parsing - The system automatically extracts multimodal data (image sequences, joint states, sensor data)
  • Sequential Batch Inference - Run continuous inference over the complete action sequence to verify the model's temporal consistency
  • Effect Comparison Analysis - Quantitatively compare inference results against the original expert demonstration
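
A replay sketch of this workflow is shown below. It uses the open-source mcap Python package to read the recording; the topic name, the JSON step encoding, and the inference endpoint are assumptions about the recording setup rather than platform specifics:

```python
# Replay recorded demonstration steps through the inference service and
# compare predicted actions against the expert's recorded actions.
import json

import numpy as np
import requests
from mcap.reader import make_reader  # pip install mcap

INFER_URL = "http://192.168.1.10:8080/infer"  # hypothetical endpoint


def run_inference(state):
    """Hypothetical client call; payload layout mirrors the spec above."""
    resp = requests.post(INFER_URL, json={"observation.state": state}, timeout=30)
    resp.raise_for_status()
    return resp.json()["action"]


expert, predicted = [], []
with open("demo.mcap", "rb") as f:
    reader = make_reader(f)
    # Topic name and JSON step encoding are assumptions about the recording.
    for schema, channel, message in reader.iter_messages(topics=["/demo_steps"]):
        step = json.loads(message.data)
        expert.append(step["action"])
        predicted.append(run_inference(step["observation.state"]))

# Mean per-step L2 error quantifies how closely the model tracks the expert.
err = np.mean(np.linalg.norm(np.array(expert) - np.array(predicted), axis=1))
print(f"mean action error: {err:.4f}")
```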

Offline Edge Deployment

The offline deployment page provides a complete edge device deployment solution

Offline edge deployment migrates the inference service entirely onto the robot's local GPU for production-grade applications:

Standardized Deployment Process

  1. Environment Preparation - Install the required Python dependencies on the robot controller
  2. Image Download - Pull the complete Docker image containing the inference environment, model weights, and configuration
  3. Service Startup - Start the inference service on the local GPU via Docker commands
  4. Client Connection - Run the ROS client script to establish real-time communication with the inference service (a minimal sketch follows)
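
A minimal sketch of such a client is shown below, assuming ROS 1 (rospy), placeholder topic names, and an HTTP inference endpoint on localhost; the platform-provided client script is the authoritative reference:

```python
# Closed-loop ROS client for a locally deployed inference service:
# subscribe to joint states, query the model, publish joint commands.
import requests
import rospy
from sensor_msgs.msg import JointState
from std_msgs.msg import Float64MultiArray

INFER_URL = "http://localhost:8080/infer"  # service started locally via Docker

latest_state = None


def on_joint_state(msg: JointState) -> None:
    global latest_state
    latest_state = list(msg.position)[:12]  # 12-dim state per the spec


rospy.init_node("inference_client")
rospy.Subscriber("/joint_states", JointState, on_joint_state)
cmd_pub = rospy.Publisher("/joint_command", Float64MultiArray, queue_size=1)

rate = rospy.Rate(5)  # within the 2-10 Hz perception-decision-execution band
while not rospy.is_shutdown():
    if latest_state is not None:
        resp = requests.post(INFER_URL, json={"observation.state": latest_state}, timeout=1.0)
        action = resp.json()["action"]  # 12 joint control commands
        cmd_pub.publish(Float64MultiArray(data=action))
    rate.sleep()
```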

Production Application Advantages

  • Edge Computing Architecture - Inference runs locally on the robot, eliminating network latency and dependencies entirely
  • Deep ROS Integration - Seamlessly subscribes to sensor topics and directly publishes joint control commands
  • Real-time Closed-loop Control - Supports high-frequency (2-10 Hz) perception-decision-execution loops
  • Industrial-grade Reliability - Suited to industrial environments with network restrictions or strict security requirements

With the Embodiflow Data Platform's inference services, you can take a trained robot learning model from cloud validation all the way to edge deployment, closing the loop from training to production application.