Three product families. Plus Custom Capture.
PRISM Vision-Language datasets. SABER Vision-Language-Action datasets. Simulation-ready USD environments. Custom Capture engagements. All are built on the same dual-stream capture platform and extend across twelve industries.
PRISM: VLM Datasets
2 datasets · 10 modalities
PRODUCT FAMILY · PRISM

PRISM — Vision-Language Model Datasets

Richly annotated dual-stream datasets for fine-tuning frontier Vision-Language Models on real-world physical AI tasks.

Annotation Depth: 10 Modalities per Frame
PRISM datasets include modalities 1–10, the rows labeled "ALL" in the stack below. Every PRISM dataset ships with this full 10-modality annotation stack.
1
Raw Omnidirectional Exocentric Capture
Full-sphere Exocentric RGB + depth from Alia 360° sensor
360° RGB · Dense Depth · IMU · Timestamps
ALL
2
Raw Egocentric Capture
RGB + depth from Ego camera(s)
RGB · Dense Depth · IMU · Timestamps
ALL
3
3D Reconstruction
Dense neural 3D scene from omnidirectional input
Point Clouds · Mesh · Gaussian Splats · 3D Layout · 3D Scene Graphs
ALL
4
Spatial Semantics
Navigable paths, obstacles, zones, and surfaces
Path Maps · Obstacle Class · Floor Plans · Zone Labels
ALL
5
Object Semantics
Per-object identity, class, pose, and attributes
3D Bounding Boxes · Instance Masks · SKU Labels · 6-DoF Pose
ALL
6
Physics Metadata
Object mass, friction, deformability — sim-transfer-ready
Mass Estimates · Friction · Deformability · Collision Mesh
ALL
7
Agent Tracking
People, carts, and robots — trajectories over time
Body Keypoints · Trajectories · Re-ID
ALL
8
Skills & Activities
What each agent is doing — pick, place, scan, stack, mop …
500+ Skills · Verb–Object Pairs · Temporal Spans · Role Tags
ALL
9
Temporal Context
Time-of-day, traffic density, seasonal and layout variants
Rush / Off-Peak · Restocking · Layout Changes · Lighting
ALL
10
Ego-Exo Synchronization
Time-aligned egocentric and exocentric RGB video
RGB Time Sync · Ego-Exo Action Mapping
ALL
Modalities 1–10 — included in all PRISM datasets
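Modality 10 pairs egocentric and exocentric frames in time. As a rough illustration of what such pairing can look like, the sketch below matches each egocentric timestamp to the nearest exocentric timestamp within a tolerance. This is a generic nearest-timestamp technique, not a description of DreamVu's pipeline; the function name `pair_nearest`, the 20 ms tolerance, and the sample timestamps are all assumptions made for the example.

```python
from bisect import bisect_left


def pair_nearest(ego_ts, exo_ts, tolerance_s=0.02):
    """Pair each ego timestamp with the nearest exo timestamp.

    Both lists hold seconds and must be sorted ascending.
    Returns (ego_index, exo_index) pairs; ego frames with no exo
    frame within `tolerance_s` are left unpaired.
    """
    pairs = []
    for i, t in enumerate(ego_ts):
        j = bisect_left(exo_ts, t)
        # The nearest exo frame is either just before or just after t.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(exo_ts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(exo_ts[k] - t))
        if abs(exo_ts[best] - t) <= tolerance_s:
            pairs.append((i, best))
    return pairs


# Hypothetical timestamps: three ego frames have a close exo match,
# the fourth (0.5 s) has none within tolerance.
ego = [0.000, 0.033, 0.066, 0.500]
exo = [0.001, 0.034, 0.070]
print(pair_nearest(ego, exo))
```

A tolerance keeps dropped or late frames from being paired with an unrelated moment in the other stream; the right value depends on the capture frame rates.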
PRISM Grocery Published March 2026
270,000 annotated samples · 11.8M frames · 500 hours dual-stream capture.
66.6% error reduction on physical AI benchmarks; fine-tunes NVIDIA Cosmos-Reason2-2B.
PRISM releases are planned for all twelve industries listed on /industries.
SABER: VLA Datasets
2 datasets · 13 modalities · dual-stream
PRODUCT FAMILY · SABER

SABER — Vision-Language-Action Model Datasets

A Scalable Action-Based Embodied Dataset for Real-World VLA Adaptation. The first high-fidelity retail robotics action dataset built from natural human behavior — not teleoperated demos.

Annotation Depth: All 13 Modalities per Frame
SABER datasets include all 13 modalities — the 10 PRISM modalities plus 3 action-specific rows for VLA training.
1
Raw Omnidirectional Exocentric Capture
Full-sphere Exocentric RGB + depth from Alia 360° sensor
360° RGB · Dense Depth · IMU · Timestamps
ALL
2
Raw Egocentric Capture
RGB + depth from Ego camera(s)
RGB · Dense Depth · IMU · Timestamps
ALL
3
3D Reconstruction
Dense neural 3D scene from omnidirectional input
Point Clouds · Mesh · Gaussian Splats · 3D Layout · 3D Scene Graphs
ALL
4
Spatial Semantics
Navigable paths, obstacles, zones, and surfaces
Path Maps · Obstacle Class · Floor Plans · Zone Labels
ALL
5
Object Semantics
Per-object identity, class, pose, and attributes
3D Bounding Boxes · Instance Masks · SKU Labels · 6-DoF Pose
ALL
6
Physics Metadata
Object mass, friction, deformability — sim-transfer-ready
Mass Estimates · Friction · Deformability · Collision Mesh
ALL
7
Agent Tracking
People, carts, and robots — trajectories over time
Body Keypoints · Trajectories · Re-ID
ALL
8
Skills & Activities
What each agent is doing — pick, place, scan, stack, mop …
500+ Skills · Verb–Object Pairs · Temporal Spans · Role Tags
ALL
9
Temporal Context
Time-of-day, traffic density, seasonal and layout variants
Rush / Off-Peak · Restocking · Layout Changes · Lighting
ALL
10
Ego-Exo Synchronization
Time-aligned egocentric and exocentric RGB video
RGB Time Sync · Ego-Exo Action Mapping
ALL
11
Hand Pose & Trajectory
Frame-level manipulation actions from egocentric view
Gripper State · Contact Events · Hand Pose · Force Proxies
VLA ONLY
12
Body Pose & Trajectory
Frame-level manipulation actions from egocentric view
Gripper State · Contact Events · Hand Pose · Force Proxies
VLA ONLY
13
Human to Robot Retargeting
Conversion of human trajectories to standard robot joint data
Robot Motion · Robot Control · Supports Unitree G1, Fourier, and many more
VLA ONLY
Modalities 1–10 — included in all datasets (PRISM & SABER)
Modalities 11–13 — SABER datasets only (action data)
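The split between the two families can be written down as a small lookup. The sketch below is only an illustration of the scoping rule stated above (modalities 1–10 for PRISM, 1–13 for SABER); the dictionary and the `modalities_for` helper are hypothetical names for this example, not part of any DreamVu tooling.

```python
# Modality names and numbers as listed in the annotation stacks above.
MODALITIES = {
    1: "Raw Omnidirectional Exocentric Capture",
    2: "Raw Egocentric Capture",
    3: "3D Reconstruction",
    4: "Spatial Semantics",
    5: "Object Semantics",
    6: "Physics Metadata",
    7: "Agent Tracking",
    8: "Skills & Activities",
    9: "Temporal Context",
    10: "Ego-Exo Synchronization",
    11: "Hand Pose & Trajectory",
    12: "Body Pose & Trajectory",
    13: "Human to Robot Retargeting",
}

# Highest modality number included per product family.
LAST_MODALITY = {"PRISM": 10, "SABER": 13}


def modalities_for(family):
    """Return {number: name} for the modalities a product family includes."""
    key = family.upper()
    if key not in LAST_MODALITY:
        raise ValueError(f"unknown product family: {family!r}")
    last = LAST_MODALITY[key]
    return {n: name for n, name in MODALITIES.items() if n <= last}


prism = modalities_for("PRISM")
saber = modalities_for("SABER")
# Rows marked "VLA ONLY" are exactly the ones SABER adds on top of PRISM.
vla_only = sorted(set(saber) - set(prism))
print(len(prism), len(saber), vla_only)
```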
SABER Grocery Published May 2026
44.8K samples · 100+ hours · 2.19× improvement over baselines.
91% success rate on the fridge-restocking benchmark.
Fine-tunes NVIDIA GR00T N1.6 on dual-stream egocentric data.
Built from natural human behavior in real retail environments — not teleoperated demos.
SABER releases follow PRISM releases across the twelve industries listed on /industries.
Simulation Assets
2 asset types — USD format

Simulation-ready grocery environments and objects in Universal Scene Description (USD) format, built from DreamVu's 3D reconstructions. Drop directly into NVIDIA Isaac Sim or Omniverse for robot training, validation, and sim-to-real transfer.

1
Complete USD Store Environments
5 photorealistic digital twin grocery stores — full aisle layouts, checkout areas, stockrooms, and deli counters
2
2,000+ Individual Product Assets
Grocery items across produce, dairy, packaged goods, and frozen — each with accurate geometry and PBR textures
3
Physics Properties per Object
Mass, friction coefficients, deformability, and collision meshes — ready for contact-rich manipulation simulation
4
Shelf Planograms & Aisle Topology
Real product placement layouts with navigable aisle graphs — matches actual store configurations
5
Lighting & Layout Variants
Multiple lighting presets (daylight, fluorescent, night) and seasonal layout configurations per store
6
SKU & Barcode Metadata
Product identity, category labels, and barcode data attached to every asset for scan-and-pick training
7
Grasp Point Annotations
Pre-computed grasping points and approach vectors for each product asset — accelerates manipulation policy training
8
Isaac Sim & Omniverse Compatible
Native USD/USDA format — load directly into NVIDIA simulation tools with zero conversion overhead
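Because USDA is a plain-text format, a scene that pulls in one of these assets can be as small as a single reference. The sketch below uses only the Python standard library to generate such a layer. The file name `grocery_store_01.usd` and the prim names are hypothetical, and the example assumes the referenced asset declares a default prim; check the delivered files for the actual names and structure.

```python
def usda_scene_referencing(asset_path, prim_name="Store"):
    """Return USDA text for a stage whose World prim references one asset.

    Assumes the referenced file defines a defaultPrim, so no explicit
    prim path is needed after the asset path.
    """
    return (
        "#usda 1.0\n"
        "(\n"
        '    defaultPrim = "World"\n'
        ")\n"
        "\n"
        'def Xform "World"\n'
        "{\n"
        f'    def Xform "{prim_name}" (\n'
        f"        prepend references = @{asset_path}@\n"
        "    )\n"
        "    {\n"
        "    }\n"
        "}\n"
    )


# Hypothetical asset file name, used for illustration only.
scene = usda_scene_referencing("./grocery_store_01.usd")
print(scene)
```

Writing `scene` to a `.usda` file next to the asset gives a layer that USD-aware tools can open; referencing rather than copying leaves the delivered asset file untouched.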
Custom Capture
On-site data collection & processing

DreamVu deploys to your facility with our Alia 360° capture rigs and full annotation pipeline. You get the same multi-layer annotation stack applied to your specific environment, operations, and use cases.

📷
On-Site Capture
Alia 360° rigs deployed to your location — warehouses, factories, retail, hospitals, or any operational environment.
🎯
Custom Annotation
Full 10- or 13-modality annotation stack tailored to your domain-specific skills, objects, and workflows.
🛠
3D Reconstruction
NuRec pipeline produces dense 3D scenes — point clouds, meshes, and Gaussian splats from your facility.
USD Digital Twin
Your environment converted to simulation-ready USD assets for Isaac Sim and Omniverse workflows.
📊
Model Fine-Tuning
Optional foundation model fine-tuning on your custom dataset for domain-specific world models or robot policies.
🔒
Exclusive License
Custom capture data is exclusively yours — never shared, sublicensed, or added to our catalog without your consent.

Ready to build with real-world data?

Talk to our team about datasets, simulation assets, or custom capture for your environment.
