Research
Research Highlights
PRISM: The multi-view video dataset that closes the knowledge gap holding physical AI back from real-world deployment.
PRISM Paper
NOW AVAILABLE
PRISM: Unifying physical AI knowledge across space, physics, and embodied action.
March 2026 · DreamVu Research Team
The Problem
State-of-the-art vision-language models can describe what they see, but they cannot reliably act in the world. The bottleneck is a structural gap in training data spanning space, physics, and embodied action. PRISM closes that gap.
Key Findings
- Domain-specific SFT beats general pretraining, reducing average error by 66.6%.
- Multi-view training is mutually reinforcing; exocentric supervision improves ego performance.
- 60% of the data captures 95% of the total performance ceiling.
Impact Metrics
66.6%
Error rate reduction across all probes
5×
Embodied Reasoning error reduction (45.5% → 9.1%)
270K
Training samples across 5 real supermarkets