PRODUCT FAMILY · SABER
SABER — Vision-Language-Action Model Datasets
A Scalable Action-Based Embodied Dataset for Real-World VLA Adaptation. The first high-fidelity retail robotics action dataset built from natural human behavior — not teleoperated demos.
Annotation Depth: All 13 Modalities per Frame
SABER datasets include all 13 modalities: the 10 PRISM modalities plus 3 action-specific modalities for VLA training.
| # | Modality | Description | Contents | Availability |
|---|----------|-------------|----------|--------------|
| 1 | Raw Omnidirectional Exocentric Capture | Full-sphere exocentric RGB + depth from Alia 360° sensor | 360° RGB · Dense Depth · IMU · Timestamps | ALL |
| 2 | Raw Egocentric Capture | RGB + depth from ego camera(s) | RGB · Dense Depth · IMU · Timestamps | ALL |
| 3 | 3D Reconstruction | Dense neural 3D scene from omnidirectional input | Point Clouds · Mesh · Gaussian Splats · 3D Layout · 3D Scene Graphs | ALL |
| 4 | Spatial Semantics | Navigable paths, obstacles, zones, and surfaces | Path Maps · Obstacle Class · Floor Plans · Zone Labels | ALL |
| 5 | Object Semantics | Per-object identity, class, pose, and attributes | 3D Bounding Boxes · Instance Masks · SKU Labels · 6-DoF Pose | ALL |
| 6 | Physics Metadata | Object mass, friction, deformability (sim-transfer-ready) | Mass Estimates · Friction · Deformability · Collision Mesh | ALL |
| 7 | Agent Tracking | People, carts, and robots: trajectories over time | Body Keypoints · Trajectories · Re-ID | ALL |
| 8 | Skills & Activities | What each agent is doing: pick, place, scan, stack, mop … | 500+ Skills · Verb–Object Pairs · Temporal Spans · Role Tags | ALL |
| 9 | Temporal Context | Time-of-day, traffic density, seasonal and layout variants | Rush / Off-Peak · Restocking · Layout Changes · Lighting | ALL |
| 10 | Ego-Exo Synchronization | RGB Video Synchronization | RGB Time Sync · Ego-Exo Action Mapping | ALL |
| 11 | Hand Pose & Trajectory | Frame-level manipulation actions from egocentric view | Gripper State · Contact Events · Hand Pose · Force Proxies | VLA ONLY |
| 12 | Body Pose & Trajectory | Frame-level manipulation actions from egocentric view | Gripper State · Contact Events · Hand Pose · Force Proxies | VLA ONLY |
| 13 | Human to Robot Retargeting | Conversion of human trajectories to standard robot joint data | Robot Motion · Robot Control · Supports Unitree G1, Fourier and many more | VLA ONLY |
Modalities 1–10 — included in all datasets (PRISM & SABER)
Modalities 11–13 — SABER datasets only (action data)
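For readers who want to filter data by product family, the availability rule above can be written down as a small lookup. This is only an illustrative sketch: the `Modality` class and `modalities_for` helper are hypothetical names, not part of any published SABER or PRISM tooling, and only the modality names and the 1–10 / 11–13 split come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    index: int
    name: str
    vla_only: bool  # True for the action modalities 11-13


# Modality names as listed on this page, in order 1..13.
_NAMES = [
    "Raw Omnidirectional Exocentric Capture",
    "Raw Egocentric Capture",
    "3D Reconstruction",
    "Spatial Semantics",
    "Object Semantics",
    "Physics Metadata",
    "Agent Tracking",
    "Skills & Activities",
    "Temporal Context",
    "Ego-Exo Synchronization",
    "Hand Pose & Trajectory",
    "Body Pose & Trajectory",
    "Human to Robot Retargeting",
]

# Hypothetical catalogue: modalities 11-13 are flagged as VLA only.
MODALITIES = [
    Modality(index=i, name=name, vla_only=i > 10)
    for i, name in enumerate(_NAMES, start=1)
]


def modalities_for(family: str) -> list[Modality]:
    """Return the modalities included in a product family ("PRISM" or "SABER")."""
    family = family.upper()
    if family == "SABER":
        return list(MODALITIES)
    if family == "PRISM":
        return [m for m in MODALITIES if not m.vla_only]
    raise ValueError(f"unknown product family: {family!r}")


if __name__ == "__main__":
    for m in modalities_for("SABER"):
        tag = "VLA ONLY" if m.vla_only else "ALL"
        print(f"{m.index:>2}  {m.name}  [{tag}]")
```

Calling `modalities_for("PRISM")` yields the 10 shared modalities, while `modalities_for("SABER")` yields all 13.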
SABER Grocery
Published May 2026
44.8K samples · 100+ hours · 2.19× improvement over baselines.
91% success rate on the fridge-restocking benchmark.
Fine-tunes NVIDIA GR00T N1.6 on dual-stream egocentric data.
Built from natural human behavior in real retail environments — not teleoperated demos.
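As a rough sizing aid, the headline figures for SABER Grocery imply an approximate mean sample length. The sketch below assumes that the 44.8K samples jointly cover the recorded footage without overlap, which this page does not state, and it treats "100+ hours" as exactly 100 hours, so the result is only a lower bound under that assumption.

```python
SAMPLES = 44_800   # "44.8K samples"
MIN_HOURS = 100    # "100+ hours", used here as a lower bound


def mean_sample_seconds(total_hours: float, n_samples: int) -> float:
    """Average seconds per sample if the samples tile the footage exactly."""
    return total_hours * 3600 / n_samples


# Assumption: samples are non-overlapping and cover all recorded hours.
lower_bound = mean_sample_seconds(MIN_HOURS, SAMPLES)
print(f"mean sample length >= {lower_bound:.1f} s")  # prints "mean sample length >= 8.0 s"
```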