Capture Everything.

The Data Infrastructure for Physical AI

The world's leading AI researchers are building world models and spatial intelligence. They need high-fidelity 3D training data from real environments. DreamVu's omnidirectional capture platform delivers it at scale.

1,000
Hours of 3D Data
500+
Distinct Skills
Research Breakthrough

The PRISM Dataset

Our research paper demonstrates state-of-the-art results on physical AI benchmarks.

66.6%
Overall error rate reduction
+23.8%
Average accuracy gain (62.8% → 86.6%)
80%
Embodied Reasoning error reduction (45.5% → 9.1%)
270K
Training samples
11.8M
Video frames
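Reading these as relative error reductions, the embodied reasoning pair works out as:

\[
\text{relative error reduction} \;=\; \frac{e_{\text{before}} - e_{\text{after}}}{e_{\text{before}}} \;=\; \frac{45.5\% - 9.1\%}{45.5\%} \;=\; 80\%.
\]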
The Race to Build World Models
Starts with Real-World Data
The most influential minds in AI have converged on one conclusion: the next frontier isn't larger language models — it's machines that understand and interact with the physical world.

Spatial Intelligence Is the Next Frontier

Fei-Fei Li — the Stanford professor who created ImageNet and catalyzed the deep learning revolution — has made spatial intelligence the focus of her latest company, World Labs. Her thesis: AI must learn to perceive, reason about, and act in three-dimensional space. Not from text. Not from flat images. From spatially rich, real-world data.

"Spatial intelligence is the next major capability AI needs to develop. It's how humans and animals make sense of the world — and it's what's missing from today's AI systems."

— Fei-Fei Li, Stanford HAI & World Labs

Yann LeCun — Meta's Chief AI Scientist and Turing Award winner — has been equally direct. He argues that the path to truly intelligent machines runs through world models: internal representations of how the physical world works, learned from observation, not text.

"A system trained on text will never understand the physical world. You need world models — learned from video and sensory data — that can predict what happens next."

— Yann LeCun, Meta AI & NYU

Both visions share a common prerequisite: massive amounts of high-fidelity, spatially aware, real-world 3D data. And that's exactly what doesn't exist today — at least, not at the scale or quality these models demand.

World Models Need Real Worlds

VLA (Vision-Language-Action) models can't learn physics, spatial relationships, or manipulation skills from 2D images and text. They need dense 3D captures of real environments with real people performing real tasks.

The Data Bottleneck Is Critical

Billions have been poured into model architectures (GR00T, RT-2, Octo, π₀), but the training data barely exists. Open-source robotics datasets are small and narrow-FOV, and they lack the 3D spatial richness these models require.

DreamVu Fills the Gap

Our dual-stream capture system — Alia 360° exocentric + GoPro egocentric — produces exactly the data that world models and spatial AI systems need. Synchronized omnidirectional capture with depth + RGB at scale.
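To make "synchronized" concrete, here is a minimal sketch of pairing the two streams by timestamp. The frame records, field names, and tolerance are illustrative assumptions, not DreamVu's actual pipeline or schema.

```python
from dataclasses import dataclass

# Hypothetical frame records; fields are illustrative, not DreamVu's schema.
@dataclass
class ExoFrame:             # Alia 360° exocentric capture
    t: float                # capture timestamp, seconds
    rgb_360: bytes          # equirectangular RGB panorama
    depth_360: bytes        # co-registered omnidirectional depth

@dataclass
class EgoFrame:             # GoPro egocentric capture
    t: float
    rgb: bytes

def sync_streams(exo: list[ExoFrame], ego: list[EgoFrame],
                 tol: float = 1 / 60) -> list[tuple[ExoFrame, EgoFrame]]:
    """Pair each exocentric frame with the nearest egocentric frame in time.

    Assumes both lists are sorted by timestamp; pairs whose clocks differ
    by more than `tol` seconds (one frame at 60 fps) are dropped.
    """
    if not ego:
        return []
    pairs, j = [], 0
    for ex in exo:
        # Advance j to the last egocentric frame at or before ex.t.
        while j + 1 < len(ego) and ego[j + 1].t <= ex.t:
            j += 1
        # The nearest neighbor is either ego[j] or ego[j + 1].
        best = min(ego[j:j + 2], key=lambda e: abs(e.t - ex.t))
        if abs(best.t - ex.t) <= tol:
            pairs.append((ex, best))
    return pairs
```

In a production rig, alignment would more likely come from a shared hardware clock at capture time; nearest-neighbor matching like this is the post-hoc fallback.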

Why Training Physical AI Is So Hard
Humanoid robots don't just navigate — they manipulate objects, coordinate limbs, understand context, and learn from watching others. Current data falls short.

The Perspective Problem

Humanoids need egocentric views (what they see) and exocentric views (how they appear to others). Traditional capture misses half the picture. DreamVu's synchronized dual-stream capture — Alia 360° exocentric + GoPro egocentric — gives you both simultaneously.

The Multimodal Gap

Physical AI models need vision + language + action data together. Most datasets provide vision only — leaving teams to stitch together incomplete signals. DreamVu delivers all three, synchronized.
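As a sketch of what one synchronized vision-language-action sample could look like, the schema below is hypothetical; the field names and shapes are illustrative assumptions, not DreamVu's delivery format.

```python
from dataclasses import dataclass

import numpy as np

# Hypothetical vision-language-action training sample. All three
# modalities share one timestamp, which is the point of "synchronized".
@dataclass
class VLASample:
    # Vision: time-aligned exocentric and egocentric views.
    exo_rgbd: np.ndarray    # (H, W, 4) panoramic RGB + depth
    ego_rgb: np.ndarray     # (h, w, 3) first-person RGB
    # Language: what skill is being demonstrated.
    instruction: str        # e.g. "place the mug on the top shelf"
    # Action: the demonstrator's motion over this timestep.
    pose_delta: np.ndarray  # (7,) translation (xyz) + rotation quaternion
    gripper: float          # open/close command in [0, 1]
    timestamp: float        # shared clock across all modalities
```

A VLA model trains on such records by conditioning on the vision and language fields and predicting the action fields.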

The Sim-to-Real Gap

Humanoids trained in simulation fail when deployed in real environments. DreamVu captures the real world in formats that translate directly into Isaac Sim and back — closing the sim-to-real loop.
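Isaac Sim scenes are described in OpenUSD, so one concrete reading of "translate directly" is writing captured geometry out as a USD layer. Below is a minimal sketch using the standard pxr bindings; the point data and file name are placeholders, and a real export would carry meshes, materials, and timing as well.

```python
# Export a captured point cloud to a .usda layer that Isaac Sim (or any
# USD-based simulator) can open or reference. Requires the OpenUSD
# Python bindings (pxr); the three points below are placeholder data.
from pxr import Usd, UsdGeom, Gf

def export_point_cloud(points_xyz, out_path="capture_scene.usda"):
    stage = Usd.Stage.CreateNew(out_path)
    UsdGeom.Xform.Define(stage, "/World")
    cloud = UsdGeom.Points.Define(stage, "/World/CapturedCloud")
    cloud.CreatePointsAttr([Gf.Vec3f(*p) for p in points_xyz])
    cloud.CreateWidthsAttr([0.01] * len(points_xyz))  # cosmetic point size
    stage.SetDefaultPrim(stage.GetPrimAtPath("/World"))
    stage.Save()

export_point_cloud([(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.5, 1.0, 0.2)])
```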

From Breakthrough Research to
Physical AI Infrastructure
DreamVu began with breakthrough research in computational imaging at IIIT Hyderabad — a new optical design for capturing 360° stereoscopic video in a single shot, published at CVPR. Eight years of production deployment later, we're now building the data infrastructure the humanoid robotics industry needs.

Sashi Reddi

Co-Founder & Chairman

Managing Partner at SRI Capital. Founder & former CEO of AppLabs (acquired by CSC). PhD Wharton, MS NYU, BTech IIT Delhi.


Rajat Aggarwal

Co-Founder & CEO

BTech & Master's in CSE with a specialization in Computational Photography from IIIT Hyderabad. His CVPR'16 paper on computational cameras became the seed for DreamVu.


Dr. Anoop Namboodiri

Co-Founder & Chief Science Officer

Professor at IIIT Hyderabad. 75+ published papers. Built systems currently deployed at massive scale.


Parikshit Sakurikar

Co-Founder & VP Imaging & AI

PhD in Computational Photography from IIIT Hyderabad. Eight years focused on machine learning and high-performance computing for computer vision.

Backed by SRI Capital, Ben Franklin Technology Partners, and Broad Street Angels.
US Headquarters: Philadelphia, PA
R&D Center: Hyderabad, India
Three Ways to Partner with DreamVu
Building the Physical AI ecosystem together — multiple paths to collaborate, from capture to deployment.
📹 Data Partner

Earn $50/hr capturing data with our Alia Starter Kit ($10K); 200 hours of paid capture covers the kit cost, for a 2-3 month payback.

How It Works

1. Purchase Alia Starter Kit
2. Complete Onboarding
3. Capture & Earn
Apply as Data Partner
🏢 Venue Partner

Provide access to your facility for DreamVu capture teams. Earn revenue from your space.

Ideal For

Retail stores, warehouses, manufacturing floors, kitchens — any real-world environment where robots will operate.

DreamVu deploys capture rigs on a recurring schedule. You earn a venue access fee while contributing real-world training data to robotics.

Become a Venue Partner
🤖 Technology Partner

Integrate DreamVu data into your robotics or AI platform. Access our API and custom data pipelines.

Benefits

Priority access to new datasets, custom format delivery, and dedicated integration support for robotics companies, simulation platforms, and AI labs.

Explore Technology Partnership