THE DATA FOUNDATION FOR PHYSICAL AI

Data Infrastructure for Physical AI and Humanoids

Omnidirectional 3D capture, enriched annotation, and simulation-ready data to train the next generation of humanoid robots and embodied AI.

PUBLISHED · MAY 2026

SABER

A Scalable Action-Based Embodied Dataset for Real-World VLA Adaptation. The first high-fidelity retail robotics action dataset built from natural human behavior.

44.8K samples · 100+ hours · 2.19× improvement

PUBLISHED · MARCH 2026

PRISM

Unifying physical AI knowledge across space, physics, and embodied action. Fine-tunes NVIDIA Cosmos-Reason2-2B to state-of-the-art results on physical AI benchmarks.

66.6% error reduction · 270K samples · 11.8M frames

Real-World 3D Capture (the scarce apex): the training signal robots actually need.
Simulation: scalable, but bounded by real-world fidelity.
Web Video: abundant, flat, no 3D structure.

Simulation quality is bounded by the real data feeding it. We widen the apex.

WHY REAL-WORLD 3D DATA

Real-world 3D data is the scarce apex of Physical AI.

Today’s robots are trained on internet 2D video, simulation, or narrow-field-of-view lab capture. Each leaves a gap. Our omnidirectional 3D platform captures the layer the others can’t reach.

OUR PLATFORM

Purpose-built for Physical AI training data.

A proprietary capture system, eight years of production hardening, and an end-to-end pipeline that turns real environments into simulation-ready training data.

Dual-Stream Capture

Synchronized 360° exocentric (Alia) and egocentric (GoPro) streams captured in a single pass. Complete spatial context: what the robot sees and how it appears in the scene.
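
The page does not say how the two streams are synchronized. As a rough illustration only, one common software-side approach is to pair frames by nearest timestamp within a tolerance. The function name `pair_streams` and the 20 ms `max_skew` default below are hypothetical and not taken from the platform itself.

```python
from bisect import bisect_left


def pair_streams(exo_ts, ego_ts, max_skew=0.02):
    """Pair each egocentric timestamp with the nearest exocentric one.

    exo_ts must be sorted ascending (seconds). Returns a list of
    (ego_index, exo_index) pairs. Egocentric frames with no exocentric
    frame within max_skew seconds are dropped.
    NOTE: illustrative sketch; names and tolerance are assumptions.
    """
    pairs = []
    for i, t in enumerate(ego_ts):
        pos = bisect_left(exo_ts, t)
        # The nearest frame is one of the two neighbours of the insertion point.
        candidates = [j for j in (pos - 1, pos) if 0 <= j < len(exo_ts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(exo_ts[j] - t))
        if abs(exo_ts[best] - t) <= max_skew:
            pairs.append((i, best))
    return pairs


exo = [0.0, 0.1, 0.2, 0.3]        # hypothetical 360° frame times
ego = [0.005, 0.098, 0.25, 0.31]  # hypothetical egocentric frame times
matched = pair_streams(exo, ego)
```

Here the third egocentric frame (0.25 s) is about 50 ms from either neighbour, so it falls outside the tolerance and is left unpaired. Hardware-triggered capture would make this step unnecessary.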

CVPR-Pedigree Optics

Single-shot 360° stereoscopic design from IIIT Hyderabad, peer-reviewed at CVPR 2016. The optical breakthrough behind every dataset we ship.

8 Years in Production

Deployed across AMRs, surveillance, and smart-city systems. Real-world failure modes were engineered out before the first humanoid dataset.

32+ Patents

A decade of protected optical IP. The capture platform is the moat — every downstream product inherits its quality.

REAL-TO-SIM-TO-REAL

From real-world capture to GR00T-ready training data, in one pipeline.

We don’t just sell a camera. We deliver real-world capture, annotation, 3D reconstruction, USD simulation export, and synthetic generation — as one infrastructure layer.

1. Capture: Synchronized Alia 360° + GoPro dual-stream in real environments
2. Annotate: AI-assisted (SAM2, Grounding DINO) + human QA, 10× faster
3. Reconstruct: 3D Gaussian Splatting creates photorealistic scenes
4. Convert (USD): Automated USD export with physics for Isaac Sim
5. Simulate: 1,000+ frames/hour with domain randomization
6. Train: Validated on GR00T and Cosmos fine-tuning

🟢 NVIDIA Isaac Sim (Native USD)
🤗 Hugging Face LeRobot (RLDS)
📦 Open X-Embodiment
OFFERINGS

Three ways to deploy Physical AI faster.

Datasets

Richly annotated grocery datasets — 500 hours each, up to 13 modalities per frame, dual-stream synchronized. Ready for LeRobot and Open X-Embodiment.

Simulation Assets

Photorealistic digital-twin environments in USD format. 5 store environments, 2,000+ product assets with physics properties. Drop into Isaac Sim.

Custom Capture

Our team deploys to your facility with Alia rigs and the full annotation pipeline. Your data stays exclusively yours.

RESEARCH

Validated on NVIDIA’s Physical AI stack.

SABER · Published May 2026

A Scalable Action-Based Embodied Dataset for Real-World VLA Adaptation. Fine-tunes NVIDIA GR00T N1.6 on dual-stream egocentric data to achieve a 2.19× improvement over baselines in retail manipulation.

2.19× improvement · 44.8K samples · 91% fridge success

PRISM · Published March 2026

The first dataset to unify space, physics, and embodied action in a single real-world deployment domain. Fine-tunes NVIDIA Cosmos-Reason2-2B to state-of-the-art results on physical AI benchmarks.

66.6% error reduction · 270K samples · Reasoning

GET STARTED

Building Physical AI? Let’s talk.

Whether you’re training humanoids, building world models, or deploying enterprise robotics — our data is built for your stack.