Research Highlights
PRISM: The multi-view video dataset that closes the knowledge gap holding physical AI back from real-world deployment.

PRISM: Unifying physical AI knowledge across space, physics, and embodied action.

March 2026 · DreamVu Research Team

The Problem

State-of-the-art vision-language models can describe what they see, but they cannot reliably act in the world. The bottleneck is a structural gap in training data across space, physics, and embodied action. PRISM closes that gap.

Key Findings

  • Domain-specific SFT beats general pretraining, reducing average error by 66.6%.
  • Multi-view training is mutually reinforcing; exocentric supervision improves ego performance.
  • Training on 60% of the data recovers 95% of full-data performance.

Impact Metrics

  • 66.6% — Error rate reduction across all probes
  • 45.5% → 9.1% — Embodied Reasoning error reduction
  • 270K — Training samples across 5 real supermarkets