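Data Formats
The IO Data Platform (艾欧数据平台) is designed for general-purpose robot data management. It uses the Robot Operating System (ROS) as the baseline for managing all robot data in a uniform way.
- Data import: non-ROS data from collection systems such as AgiBot (智元) and AgileX (松灵) is converted automatically to ROS-standard formats and managed in one place.
- Data visualization: visualization models for more than 30 mainstream robots are built in, and content in all formats, such as 3D animation and 2D images, plays back smoothly.
- Data export: one-click export to the standard HDF5 and LeRobot formats. Joints and images are adapted automatically from the raw data, so the output can go straight into model training.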
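Human Data Format
Human data collection records the operator's motions and interactions and includes multimodal sensor data.
File Structure
Each collection task produces one folder, named with the following pattern (the last field is the recording timestamp):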
f"{date}_{project}_{scene}_{task}_{staff_id}_{timestamp}"
├── align_result.csv # timestamp alignment table
├── annotation.json # annotation data
├── config/ # camera and sensor configuration
│ ├── calib_data.yml
│ ├── depth_to_rgb.yml
│ ├── mocap_main.yml
│ ├── orbbec_depth.yml
│ ├── orbbec_rgb.yml
│ └── pose_calib.yml
└── data.mcap # multimodal data package
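The naming pattern above can be split back into its fields. The following is a minimal sketch, assuming that no individual field contains an underscore; `RecordingName` and `parse_recording_name` are hypothetical names used only for illustration, and the sample folder name is the `belong_to` value from the annotation example in this document.

```python
from typing import NamedTuple


class RecordingName(NamedTuple):
    """Fields of '{date}_{project}_{scene}_{task}_{staff_id}_{timestamp}'."""
    date: str
    project: str
    scene: str
    task: str
    staff_id: str
    timestamp: str


def parse_recording_name(folder_name: str) -> RecordingName:
    """Split a folder name into its six underscore-separated fields.

    Assumption: no field contains an underscore itself.
    """
    parts = folder_name.split("_")
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {folder_name!r}")
    return RecordingName(*parts)


name = parse_recording_name("20250115_InnerTest_PublicArea_TableClearing_szk_180926")
print(name.scene, name.task, name.staff_id)
```

If project or task names may contain underscores in practice, a stricter pattern (for example a regular expression anchored on the date and timestamp fields) would be needed instead.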
Multimodal Data
The data.mcap file contains synchronized data from all sensors, stored in the MCAP format.
Main topics:
Topic | Message type | Description |
---|---|---|
/mocap/sensor_data | io_msgs/squashed_mocap_data | Motion-capture joint velocity, acceleration, angular velocity, rotation angle, and sensor data |
/mocap/ros_tf | tf2_msgs/TFMessage | TF transforms of all joints, derived from motion capture |
/joint_states | sensor_msgs/JointState | JointState of all joints, derived from motion capture |
/rgbd/color/image_raw/compressed | sensor_msgs/CompressedImage | RGB image from the main head camera |
/rgbd/depth/image_raw | sensor_msgs/Image | Depth image from the main head camera |
/colorized_depth | sensor_msgs/CompressedImage | Colorized depth image from the main head camera |
/left_ee_pose | geometry_msgs/PoseStamped | Left gripper pose in the main head camera frame |
/right_ee_pose | geometry_msgs/PoseStamped | Right gripper pose in the main head camera frame |
/claws_l_hand | io_msgs/claws_angle | Degree of closure of the left gripper |
/claws_r_hand | io_msgs/claws_angle | Degree of closure of the right gripper |
/claws_touch_data | io_msgs/squashed_touch | Gripper tactile data |
/realsense_left_hand/color/image_raw/compressed | sensor_msgs/CompressedImage | RGB image from the left gripper camera |
/realsense_left_hand/depth/image_rect_raw | sensor_msgs/Image | Depth image from the left gripper camera |
/realsense_right_hand/color/image_raw/compressed | sensor_msgs/CompressedImage | RGB image from the right gripper camera |
/realsense_right_hand/depth/image_rect_raw | sensor_msgs/Image | Depth image from the right gripper camera |
/usb_cam_fisheye/mjpeg_raw/compressed | sensor_msgs/CompressedImage | RGB image from the main head fisheye camera |
/usb_cam_left/mjpeg_raw/compressed | sensor_msgs/CompressedImage | RGB image from the main head left monocular camera |
/usb_cam_right/mjpeg_raw/compressed | sensor_msgs/CompressedImage | RGB image from the main head right monocular camera |
/ee_visualization | sensor_msgs/CompressedImage | End-effector pose visualization drawn on the main head camera RGB image |
/touch_visualization | sensor_msgs/CompressedImage | Visualization of the gripper tactile data |
/robot_description | std_msgs/String | Motion-capture URDF |
/global_localization | geometry_msgs/PoseStamped | Pose of the main head camera in the world frame |
/world_left_ee_pose | geometry_msgs/PoseStamped | Left gripper pose in the world frame |
/world_right_ee_pose | geometry_msgs/PoseStamped | Right gripper pose in the world frame |
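Camera data:
- Main head RGBD camera: color + depth images
- Left/right gripper cameras: RealSense RGBD
- Fisheye camera: panoramic view
- Left/right monocular cameras: stereo vision
Note: when tactile gloves are used, an additional /mocap/touch_data topic is recorded.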
Raw MCAP data format (file summary):
library: mcap go v1.7.0
profile: ros1
messages: 45200
duration: 1m5.625866496s
start: 2025-01-15T18:09:29.628202496+08:00 (1736935769.628202496)
end: 2025-01-15T18:10:35.254068992+08:00 (1736935835.254068992)
compression:
zstd: [764/764 chunks] [6.13 GiB/3.84 GiB (37.39%)] [59.87 MiB/sec]
channels:
(1) /rgbd/color/image_raw/compressed 1970 msgs (30.02 Hz) : sensor_msgs/CompressedImage [ros1msg]
(2) /joint_states 1970 msgs (30.02 Hz) : sensor_msgs/JointState [ros1msg]
(3) /claws_r_hand 1970 msgs (30.02 Hz) : io_msgs/claws_angle [ros1msg]
(4) /global_localization 1970 msgs (30.02 Hz) : geometry_msgs/PoseStamped [ros1msg]
(5) /robot_description 1 msgs : std_msgs/String [ros1msg]
(6) /ee_visualization 1970 msgs (30.02 Hz) : sensor_msgs/CompressedImage [ros1msg]
(7) /rgbd/depth/image_raw 1970 msgs (30.02 Hz) : sensor_msgs/Image [ros1msg]
(8) /colorized_depth 1970 msgs (30.02 Hz) : sensor_msgs/CompressedImage [ros1msg]
(9) /claws_l_hand 1970 msgs (30.02 Hz) : io_msgs/claws_angle [ros1msg]
(10) /claws_touch_data 1970 msgs (30.02 Hz) : io_msgs/squashed_touch [ros1msg]
(11) /touch_visualization 1970 msgs (30.02 Hz) : sensor_msgs/CompressedImage [ros1msg]
(12) /mocap/sensor_data 1970 msgs (30.02 Hz) : io_msgs/squashed_mocap_data [ros1msg]
(13) /mocap/ros_tf 1970 msgs (30.02 Hz) : tf2_msgs/TFMessage [ros1msg]
(14) /left_ee_pose 1970 msgs (30.02 Hz) : geometry_msgs/PoseStamped [ros1msg]
(15) /right_ee_pose 1970 msgs (30.02 Hz) : geometry_msgs/PoseStamped [ros1msg]
(16) /usb_cam_left/mjpeg_raw/compressed 1960 msgs (29.87 Hz) : sensor_msgs/CompressedImage [ros1msg]
(17) /usb_cam_right/mjpeg_raw/compressed 1946 msgs (29.65 Hz) : sensor_msgs/CompressedImage [ros1msg]
(18) /usb_cam_fisheye/mjpeg_raw/compressed 1957 msgs (29.82 Hz) : sensor_msgs/CompressedImage [ros1msg]
(19) /realsense_left_hand/depth/image_rect_raw 1961 msgs (29.88 Hz) : sensor_msgs/Image [ros1msg]
(20) /realsense_left_hand/color/image_raw/compressed 1961 msgs (29.88 Hz) : sensor_msgs/CompressedImage [ros1msg]
(21) /realsense_right_hand/depth/image_rect_raw 1947 msgs (29.67 Hz) : sensor_msgs/Image [ros1msg]
(22) /realsense_right_hand/color/image_raw/compressed 1947 msgs (29.67 Hz) : sensor_msgs/CompressedImage [ros1msg]
(23) /world_left_ee_pose 1970 msgs (30.02 Hz) : geometry_msgs/PoseStamped [ros1msg]
(24) /world_right_ee_pose 1970 msgs (30.02 Hz) : geometry_msgs/PoseStamped [ros1msg]
channels: 24
attachments: 0
metadata: 0
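Topic | Meaning |
---|---|
/mocap/sensor_data | Motion-capture joint velocity, acceleration, angular velocity, rotation angle, and sensor data |
/mocap/ros_tf | TF of all joints, derived from motion capture |
/joint_states | JointState of all joints, derived from motion capture |
/right_ee_pose | Right gripper pose in the main head camera frame |
/left_ee_pose | Left gripper pose in the main head camera frame |
/claws_l_hand | Degree of closure of the left gripper |
/claws_r_hand | Degree of closure of the right gripper |
/claws_touch_data | Gripper tactile data (contains two messages; the frame_id of each message identifies the left or right gripper, and only the first four values of data are valid) |
/realsense_left_hand/color/image_raw/compressed | RGB image from the left gripper camera |
/realsense_left_hand/depth/image_rect_raw | Depth image from the left gripper camera |
/realsense_right_hand/color/image_raw/compressed | RGB image from the right gripper camera |
/realsense_right_hand/depth/image_rect_raw | Depth image from the right gripper camera |
/rgbd/color/image_raw/compressed | RGB image from the main head camera |
/rgbd/depth/image_raw | Depth image from the main head camera |
/colorized_depth | Colorized depth image from the main head camera |
/usb_cam_fisheye/mjpeg_raw/compressed | RGB image from the main head fisheye camera |
/usb_cam_left/mjpeg_raw/compressed | RGB image from the main head left monocular camera |
/usb_cam_right/mjpeg_raw/compressed | RGB image from the main head right monocular camera |
/ee_visualization | End-effector pose visualization drawn on the main head camera RGB image |
/touch_visualization | Visualization of the gripper tactile data |
/robot_description | Motion-capture URDF |
/global_localization | Pose of the main head camera in the world frame |
/world_left_ee_pose | Left gripper pose in the world frame |
/world_right_ee_pose | Right gripper pose in the world frame |
Data collected while the operator wears tactile gloves includes an additional topic that carries the tactile digital signal array: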
/mocap/touch_data 57 msgs (30.25 Hz): io_msgs/squashed_touc [ros1msg]
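As a quick sanity check on the 24-channel summary above, the recording duration and the per-channel rates can be recomputed from the start and end times and the message counts. This is a sketch only: it assumes the listed rate is simply the message count divided by the recording duration, which is consistent with the printed numbers but is not stated in the summary itself. `average_rate_hz` is a hypothetical helper name.

```python
# Start and end times copied from the summary, as integer nanoseconds
# so that no precision is lost to floating-point rounding.
START_NS = 1736935769_628202496
END_NS = 1736935835_254068992

# Message counts copied from three of the listed channels.
MESSAGE_COUNTS = {
    "/joint_states": 1970,
    "/usb_cam_left/mjpeg_raw/compressed": 1960,
    "/usb_cam_right/mjpeg_raw/compressed": 1946,
}


def average_rate_hz(count: int, start_ns: int, end_ns: int) -> float:
    """Average message rate over the whole recording span (assumed definition)."""
    return count / ((end_ns - start_ns) / 1e9)


duration_s = (END_NS - START_NS) / 1e9
rates = {
    topic: round(average_rate_hz(count, START_NS, END_NS), 2)
    for topic, count in MESSAGE_COUNTS.items()
}

print(f"duration: {duration_s:.9f} s")
for topic, hz in rates.items():
    print(f"{topic}: {hz} Hz")
```

The slightly lower rates of the USB and RealSense cameras reflect their lower message counts over the same span, which is worth keeping in mind when pairing frames across cameras.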
Natural-Language Annotations
{
"belong_to": "20250115_InnerTest_PublicArea_TableClearing_szk_180926",
"mocap_offset": [],
"object_set": [
"paper cup",
"placemat",
"trash can",
"napkin",
"plate",
"dinner knife",
"tableware storage box",
"wine glass",
"dinner fork"
],
"scene": "PublicArea",
"skill_set": [
"pick {A} from {B}",
"toss {A} into {B}",
"place {A} on {B}"
],
"subtasks": [
{
"skill": "pick {A} from {B}",
"description": "pick the paper cup from the placemat with the left gripper",
"description_zh": "左夹爪 从 餐垫 捡起 纸杯",
"end_frame_id": 227,
"end_timestamp": "1736935777206000000",
"sequence_id": 1,
"start_frame_id": 159,
"start_timestamp": "1736935774906000000",
"comment": "",
"attempts": "success"
},
{
"skill": "toss {A} into {B}",
"description": "toss the paper cup into the trash can with the left gripper",
"description_zh": "左夹爪 扔纸杯进垃圾桶",
"end_frame_id": 318,
"end_timestamp": "1736935780244000000",
"sequence_id": 2,
"start_frame_id": 231,
"start_timestamp": "1736935777306000000",
"comment": "",
"attempts": "success"
},
...
],
"tag_set": [],
"task_description": "20250115_InnerTest_PublicArea_TableClearing_szk_180926"
}
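Each subtask carries both frame IDs and timestamps, so its duration can be derived directly from the annotation. The sketch below assumes the timestamp strings are Unix times in nanoseconds (their magnitude matches the recording date) and uses a trimmed copy of the two subtasks shown above; `subtask_spans` is a hypothetical helper name.

```python
import json

# Trimmed copy of the annotation above: two subtasks, only the fields used here.
annotation = json.loads("""
{
  "subtasks": [
    {"sequence_id": 1, "start_frame_id": 159, "end_frame_id": 227,
     "start_timestamp": "1736935774906000000",
     "end_timestamp": "1736935777206000000"},
    {"sequence_id": 2, "start_frame_id": 231, "end_frame_id": 318,
     "start_timestamp": "1736935777306000000",
     "end_timestamp": "1736935780244000000"}
  ]
}
""")


def subtask_spans(doc: dict) -> list[tuple[int, float, int]]:
    """Return (sequence_id, duration in seconds, frame span) for each subtask."""
    spans = []
    for sub in doc["subtasks"]:
        # Timestamps are stored as strings; parse them as integers
        # (assumed nanoseconds) to keep full precision.
        start_ns = int(sub["start_timestamp"])
        end_ns = int(sub["end_timestamp"])
        frame_span = sub["end_frame_id"] - sub["start_frame_id"]
        spans.append((sub["sequence_id"], (end_ns - start_ns) / 1e9, frame_span))
    return spans


spans = subtask_spans(annotation)
for sequence_id, seconds, frame_span in spans:
    print(f"subtask {sequence_id}: {seconds:.3f} s over {frame_span} frames")
```

Dividing the frame span by the duration gives roughly 30 frames per second for both subtasks, in line with the camera rates in the MCAP summary.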
Teleoperated Robot Data Format
Teleoperation data records an operator controlling the robot through a VR device.
Teleoperation File Structure
f"{robot_name}_{date}_{timestamp}_{sequence_id}"
├── RM_AIDAL_250124_172033_0.mcap # multimodal data
├── RM_AIDAL_250124_172033_0.json # annotation data
└── RM_AIDAL_250126_093648_0.metadata.yaml # metadata
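Because the robot name in the example (`RM_AIDAL`) itself contains an underscore, the teleoperation name pattern is easier to split from the right. A minimal sketch, assuming the last three fields never contain underscores; `parse_teleop_filename` is a hypothetical helper name.

```python
def parse_teleop_filename(filename: str) -> dict[str, str]:
    """Split '{robot_name}_{date}_{timestamp}_{sequence_id}' plus any extension.

    Assumption: date, timestamp and sequence_id contain no underscores,
    while robot_name may (for example "RM_AIDAL").
    """
    # Drop everything from the first dot, e.g. ".mcap" or ".metadata.yaml".
    stem = filename.split(".", 1)[0]
    # rsplit from the right keeps underscores inside robot_name intact.
    robot_name, date, timestamp, sequence_id = stem.rsplit("_", 3)
    return {
        "robot_name": robot_name,
        "date": date,
        "timestamp": timestamp,
        "sequence_id": sequence_id,
    }


fields = parse_teleop_filename("RM_AIDAL_250124_172033_0.mcap")
print(fields)
```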
Teleoperation Multimodal Data
Main topics:
Topic | Message type | Description |
---|---|---|
/camera_01/color/image_raw/compressed | sensor_msgs/msg/CompressedImage | RGB image from the main camera |
/camera_02/color/image_raw/compressed | sensor_msgs/msg/CompressedImage | RGB image from the left camera |
/camera_03/color/image_raw/compressed | sensor_msgs/msg/CompressedImage | RGB image from the right camera |
io_teleop/joint_states | sensor_msgs/msg/JointState | Joint states |
io_teleop/joint_cmd | sensor_msgs/msg/JointState | Joint commands |
io_teleop/target_ee_poses | geometry_msgs/msg/PoseArray | Target end-effector poses |
io_teleop/target_base_move | std_msgs/msg/Float64MultiArray | Target base motion |
io_teleop/target_gripper_status | sensor_msgs/msg/JointState | Target gripper state |
io_teleop/target_joint_from_vr | sensor_msgs/msg/JointState | Joint targets from the VR device |
/robot_description | std_msgs/msg/String | Robot URDF description |
/tf | tf2_msgs/msg/TFMessage | TF spatial pose transforms |
Raw MCAP data format (file summary):
Files: RM_AIDAL_250126_091041_0.mcap
Bag size: 443.3 MiB
Storage id: mcap
Duration: 100.052164792s
Start: Jan 24 2025 21:37:32.526605552 (1737725852.526605552)
End: Jan 24 2025 21:39:12.578770344 (1737725952.578770344)
Messages: 62116
Topic information: Topic: /camera_01/color/image_raw/compressed | Type: sensor_msgs/msg/CompressedImage | Count: 3000 | Serialization Format: cdr
Topic: /camera_02/color/image_raw/compressed | Type: sensor_msgs/msg/CompressedImage | Count: 3000 | Serialization Format: cdr
Topic: /camera_03/color/image_raw/compressed | Type: sensor_msgs/msg/CompressedImage | Count: 3000 | Serialization Format: cdr
Topic: io_teleop/joint_states | Type: sensor_msgs/msg/JointState | Count: 1529 | Serialization Format: cdr
Topic: io_teleop/joint_cmd | Type: sensor_msgs/msg/JointState | Count: 10009 | Serialization Format: cdr
Topic: io_teleop/target_ee_poses | Type: geometry_msgs/msg/PoseArray | Count: 10014 | Serialization Format: cdr
Topic: io_teleop/target_base_move | Type: std_msgs/msg/Float64MultiArray | Count: 10010 | Serialization Format: cdr
Topic: io_teleop/target_gripper_status | Type: sensor_msgs/msg/JointState | Count: 10012 | Serialization Format: cdr
Topic: io_teleop/target_joint_from_vr | Type: sensor_msgs/msg/JointState | Count: 10012 | Serialization Format: cdr
Topic: /robot_description | Type: std_msgs/msg/String | Count: 1 | Serialization Format: cdr
Topic: /tf | Type: tf2_msgs/msg/TFMessage | Count: 1529 | Serialization Format: cdr
Topic | Meaning |
---|---|
/camera_01/color/image_raw/compressed | RGB image from the main camera |
/camera_02/color/image_raw/compressed | RGB image from the left camera |
/camera_03/color/image_raw/compressed | RGB image from the right camera |
io_teleop/joint_states | Joint states |
io_teleop/joint_cmd | Joint commands |
io_teleop/target_ee_poses | Target end-effector poses |
io_teleop/target_base_move | Target base motion |
io_teleop/target_gripper_status | Target gripper state |
io_teleop/target_joint_from_vr | Joint targets from the VR device |
/robot_description | Robot URDF description |
/tf | TF spatial pose transforms |
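The message counts in the summary differ widely between topics over the same 100-second recording (3000 camera frames, about 10000 command messages, 1529 joint states), so pairing a camera frame with a command or state sample needs some matching rule. The sketch below shows one common option, nearest-timestamp matching, on synthetic timestamps. It is an illustration under that assumption, not a description of how the platform aligns data; `nearest_index` and `align` are hypothetical names.

```python
from bisect import bisect_left


def nearest_index(sorted_ts: list[int], t: int) -> int:
    """Index of the timestamp in sorted_ts closest to t (ties go to the earlier one)."""
    if not sorted_ts:
        raise ValueError("sorted_ts must not be empty")
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    before, after = sorted_ts[i - 1], sorted_ts[i]
    return i - 1 if t - before <= after - t else i


def align(reference_ts: list[int], other_ts: list[int]) -> list[int]:
    """For each reference timestamp, the index of the nearest sample in other_ts."""
    return [nearest_index(other_ts, t) for t in reference_ts]


# Synthetic timestamps in milliseconds: a roughly 30 Hz camera stream
# and a 100 Hz command stream (values are made up for illustration).
camera_ts = [0, 33, 67, 100]
command_ts = list(range(0, 101, 10))  # 0, 10, ..., 100

matches = align(camera_ts, command_ts)
print(matches)  # [0, 3, 7, 10]
```

Both input lists must be sorted in ascending order for the binary search to be valid; a maximum allowed time gap could be added to reject poor matches.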
Teleoperation Annotation Data
{
"belong_to": "RM_AIDAL_250126_091041_0",
"mocap_offset": [],
"object_set": [
"lemon candy",
"plate",
"pistachios"
],
"scene": "250126",
"skill_set": [
"place {A} on {B}"
],
"subtasks": [
{
"skill": "place {A} on {B}",
"objecta": "lemon candy",
"objectb": "plate",
"options": [
"leftHand"
],
"description": "place the lemon candy on the plate with the left hand",
"end_timestamp": "1737725886915000000",
"sequence_id": 1,
"start_timestamp": "1737725880757000000",
"comment": "",
"attempts": "success"
},
{
"skill": "place {A} on {B}",
"objecta": "pistachios",
"objectb": "plate",
"options": [
"rightHand"
],
"description": "place the pistachios on the plate with the right hand",
"end_timestamp": "1737725950745000000",
"sequence_id": 2,
"start_timestamp": "1737725941657000000",
"comment": "",
"attempts": "success"
}
],
"tag_set": [],
"task_description": "20250205_RM_ItemPacking_zhouxw"
}
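The subtask timestamps in this annotation fall inside the recording span reported for `RM_AIDAL_250126_091041_0.mcap` in the raw summary, so each subtask can be expressed as an offset from the start of the bag, which is handy when cutting clips. A minimal sketch, assuming all timestamps are Unix nanoseconds; `to_offsets` is a hypothetical helper name.

```python
# "Start" and "End" of the recording, copied from the raw MCAP summary
# (integer nanoseconds).
BAG_START_NS = 1737725852_526605552
BAG_END_NS = 1737725952_578770344

# Timestamps copied from the two subtasks in the annotation above.
subtasks = [
    {"sequence_id": 1,
     "start_timestamp": "1737725880757000000",
     "end_timestamp": "1737725886915000000"},
    {"sequence_id": 2,
     "start_timestamp": "1737725941657000000",
     "end_timestamp": "1737725950745000000"},
]


def to_offsets(sub: dict, bag_start_ns: int, bag_end_ns: int) -> tuple[float, float]:
    """Return (start, end) of a subtask in seconds relative to the bag start."""
    start_ns = int(sub["start_timestamp"])
    end_ns = int(sub["end_timestamp"])
    # Reject annotations that do not lie inside the recording span.
    if not (bag_start_ns <= start_ns <= end_ns <= bag_end_ns):
        raise ValueError(f"subtask {sub['sequence_id']} lies outside the recording")
    return ((start_ns - bag_start_ns) / 1e9, (end_ns - bag_start_ns) / 1e9)


offsets = [to_offsets(sub, BAG_START_NS, BAG_END_NS) for sub in subtasks]
for sub, (start_s, end_s) in zip(subtasks, offsets):
    print(f"subtask {sub['sequence_id']}: {start_s:.3f} s to {end_s:.3f} s")
```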
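Exporting Model Training Data
To make model training easier, the platform provides several export options that convert the raw MCAP and JSON data into formats suited to machine-learning training.
The common HDF5 and LeRobot formats can both be exported with one click, and the export adapts automatically to different robots and sensor counts without manual configuration.
HDF5 Format
The HDF5 format suits large-scale data storage and fast access, and it organizes data hierarchically.
File structure: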
chunk_001.hdf5
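├── /data/ # data group
│ ├── episode_001/ # first task episode
│ │ ├── action # joint commands (multi-dimensional array)
│ │ ├── observation.state # sensor observations
│ │ ├── observation.gripper # gripper state
│ │ └── observation.images.* # images from each camera view
│ └── episode_002/ # second task episode
└── /meta/ # metadata group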
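Data contents:
- action: joint control commands (float32 array)
- observation.state: sensor observations (float32 array)
- observation.images.*: compressed image data (JPEG)
- observation.gripper: gripper state (float32 array)
- task: natural-language description in English
- task_zh: natural-language description in Chinese
- score: action quality score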
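LeRobot Format
The LeRobot format is a widely used dataset format in robot learning and is compatible with mainstream robot-learning frameworks.
Sample dataset for reference: https://huggingface.co/datasets/io-ai-data/uncap_pen
Feature definitions:
The length and shape of an exported LeRobot dataset adapt automatically, so any number of cameras or joints is supported. The shapes below are those exported for the AgileX (松灵) desktop 7-DoF robot arm: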
Feature | Data type | Shape | Description |
---|---|---|---|
action | float32 | [14] | Joint commands (7 joints each for the left and right arms) |
observation.state | float32 | [14] | Joint states (7 joints each for the left and right arms) |
observation.images.cam_high | image | [3,480,640] | Image from the high camera |
observation.images.cam_low | image | [3,480,640] | Image from the low camera |
observation.images.cam_left_wrist | image | [3,480,640] | Image from the left wrist camera |
observation.images.cam_right_wrist | image | [3,480,640] | Image from the right wrist camera |
timestamp | float32 | [1] | Timestamp |
frame_index | int64 | [1] | Frame index |
episode_index | int64 | [1] | Episode index |
Complete example of a LeRobot format definition:
{
"codebase_version": "v2.1",
"robot_type": "custom_arm",
"total_episodes": 20,
"total_frames": 5134,
"total_tasks": 20,
"total_videos": 0,
"total_chunks": 1,
"chunks_size": 1000,
"fps": 30,
"splits": {
"train": "0:20"
},
"data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet",
"video_path": "videos/chunk-{episode_chunk:03d}/{video_key}/episode_{episode_index:06d}.mp4",
"features": {
"observation.images.camera_01": {
"dtype": "image",
"shape": [
480,
640,
3
]
},
"observation.images.camera_02": {
"dtype": "image",
"shape": [
480,
640,
3
]
},
"observation.images.camera_03": {
"dtype": "image",
"shape": [
480,
640,
3
]
},
"observation.images.camera_04": {
"dtype": "image",
"shape": [
480,
640,
3
]
},
"observation.state": {
"dtype": "float64",
"shape": [
37
],
"names": [
"r_joint1",
"r_joint2",
"r_joint3",
"r_joint4",
"r_joint5",
"r_joint6",
"l_joint1",
"l_joint2",
"l_joint3",
"l_joint4",
"l_joint5",
"l_joint6",
"R_thumb_MCP_joint1",
"R_thumb_MCP_joint2",
"R_thumb_PIP_joint",
"R_thumb_DIP_joint",
"R_index_MCP_joint",
"R_index_DIP_joint",
"R_middle_MCP_joint",
"R_middle_DIP_joint",
"R_ring_MCP_joint",
"R_ring_DIP_joint",
"R_pinky_MCP_joint",
"R_pinky_DIP_joint",
"L_thumb_MCP_joint1",
"L_thumb_MCP_joint2",
"L_thumb_PIP_joint",
"L_thumb_DIP_joint",
"L_index_MCP_joint",
"L_index_DIP_joint",
"L_middle_MCP_joint",
"L_middle_DIP_joint",
"L_ring_MCP_joint",
"L_ring_DIP_joint",
"L_pinky_MCP_joint",
"L_pinky_DIP_joint",
"platform_joint"
]
},
"action": {
"dtype": "float64",
"shape": [
12
],
"names": [
"l_joint1",
"l_joint2",
"l_joint3",
"l_joint4",
"l_joint5",
"l_joint6",
"r_joint1",
"r_joint2",
"r_joint3",
"r_joint4",
"r_joint5",
"r_joint6"
]
},
"observation.gripper": {
"dtype": "float64",
"shape": [
2
],
"names": [
"right_gripper",
"left_gripper"
]
},
"timestamp": {
"dtype": "float32",
"shape": [
1
],
"names": null
},
"frame_index": {
"dtype": "int64",
"shape": [
1
],
"names": null
},
"episode_index": {
"dtype": "int64",
"shape": [
1
],
"names": null
},
"index": {
"dtype": "int64",
"shape": [
1
],
"names": null
},
"task_index": {
"dtype": "int64",
"shape": [
1
],
"names": null
}
}
}
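The `data_path` entry in the definition above is a Python format string, so the file for a given episode can be resolved with `str.format`. The sketch below copies the relevant keys from the example. It assumes that an episode's chunk number is its index integer-divided by `chunks_size` and that a split such as `"0:20"` is a half-open `start:stop` range, which agrees with `total_episodes` being 20; `episode_data_path` and `split_episodes` are hypothetical helper names.

```python
# Keys copied from the definition above; everything else is omitted.
info = {
    "chunks_size": 1000,
    "total_episodes": 20,
    "splits": {"train": "0:20"},
    "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet",
}


def episode_data_path(info: dict, episode_index: int) -> str:
    """Resolve the parquet path of one episode from the data_path template."""
    # Assumption: the chunk is the episode index integer-divided by chunks_size.
    episode_chunk = episode_index // info["chunks_size"]
    return info["data_path"].format(
        episode_chunk=episode_chunk, episode_index=episode_index
    )


def split_episodes(info: dict, split: str) -> range:
    """Episode indices of a split, reading 'start:stop' as a half-open range (assumed)."""
    start, stop = (int(part) for part in info["splits"][split].split(":"))
    return range(start, stop)


train_paths = [episode_data_path(info, i) for i in split_episodes(info, "train")]
print(train_paths[0])
print(train_paths[-1])
```

With the values above, all 20 training episodes resolve to files under `data/chunk-000/`, because every index is below `chunks_size`.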