Data as a Service
If you need only large-scale training data for machine learning, we offer professional data collection services for embodied intelligence.
Purchase Existing Data
We have already collected and annotated over 300 TB of human and robot teleoperation data covering a wide range of industrial and everyday scenarios. These multimodal datasets are ready for rapid delivery and can be used for training right away.
Covered scenarios include:
- Home environments such as kitchens, living rooms, bedrooms, laundry rooms
- Indoor spaces like apartments, offices, shopping malls, warehouses
- Industrial settings including production lines, factories, laboratories
- Outdoor environments such as streets, campuses, parking lots, green spaces
- Industry-specific applications in healthcare, education, transportation, retail, and more
Each scenario includes comprehensive multimodal data such as images, depth point clouds, IMU readings, finger tactile data, and gripper data, meeting diverse model training requirements.
Collect New Data
We have a professional data collection and annotation operations team with experience serving leading companies across industries. We can quickly set up scenes based on your specific needs, providing customized collection tasks and annotation solutions to deliver large-scale training data.
We support same-day collection, same-day annotation, and next-day upload and delivery, enabling rapid data iteration.
Delivery Methods
1. Data Delivery via Cloud Services
We recommend using cloud services for data transfer and storage, such as Tencent Cloud Object Storage or another high-capacity, high-bandwidth provider.
2. Data Delivery via Hard Drives and Other Methods
We can also deliver data using customer-specified methods, such as shipping hard drives.
Delivery Content
Delivery formats can be flexibly adjusted based on raw data and customer requirements. We provide conversion services or tools for model-specific training formats.
Delivered data includes raw collected data (multi-channel images, IMU readings, depth point clouds, finger tactile data, etc.), time-aligned annotated data, and derived formats such as natural-language descriptions segmented by annotation time. For basic data formats, see Data Structure.
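To illustrate what time-aligned data and time-segmented language annotations make possible, here is a minimal Python sketch. It is not the delivered tooling: the function names, the use of timestamps in seconds, the `max_gap` tolerance, and the `(start, end, text)` segment layout are all assumptions made for illustration. The actual layout is described in Data Structure.

```python
from bisect import bisect_left, bisect_right


def nearest_index(timestamps, t):
    """Return the index of the value in sorted `timestamps` closest to `t`."""
    if not timestamps:
        raise ValueError("timestamps must not be empty")
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbor is closer; ties go to the earlier sample.
    return i - 1 if t - timestamps[i - 1] <= timestamps[i] - t else i


def align_to_frames(frame_ts, sensor_ts, sensor_values, max_gap=0.02):
    """For each frame timestamp, pick the nearest sensor sample.

    Returns None for a frame when the nearest sample is more than
    `max_gap` seconds away (hypothetical tolerance).
    """
    aligned = []
    for t in frame_ts:
        j = nearest_index(sensor_ts, t)
        close_enough = abs(sensor_ts[j] - t) <= max_gap
        aligned.append(sensor_values[j] if close_enough else None)
    return aligned


def segment_label(segments, t):
    """Return the annotation text whose [start, end) interval contains `t`.

    `segments` is a list of (start, end, text) tuples sorted by start time
    and assumed not to overlap. Returns None if no segment covers `t`.
    """
    starts = [start for start, _, _ in segments]
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < segments[i][1]:
        return segments[i][2]
    return None


# Hypothetical example data: camera frames, IMU samples, language segments.
frame_ts = [0.0, 0.1, 0.2, 0.3]
imu_ts = [0.0, 0.04, 0.09, 0.21]
imu_values = ["a", "b", "c", "d"]
segments = [(0.0, 0.15, "reach for the cup"), (0.15, 0.4, "lift the cup")]

print(align_to_frames(frame_ts, imu_ts, imu_values))
print([segment_label(segments, t) for t in frame_ts])
```

Nearest-neighbor matching with a tolerance is only one possible alignment strategy; interpolation or resampling may suit high-rate sensors such as IMUs better.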
We can also provide data in formats directly usable for model training, such as the LeRobot and RLDS dataset formats.