Chapter 5: Deployment
Core Questions
- Why is robotics data uniquely difficult for vision-language models (VLMs) compared with pure vision or text data?
- How do we leverage VLMs for auto-labeling at scale?
- How do we distill foundation model reasoning into real-time controllers?
Topics
3.1 Data Diversity & The Long Tail
- The unique challenges of robotics data
- Multi-modal sensor streams (vision, depth, proprioception)
- Edge cases and safety-critical scenarios
- Why internet-scale pre-training isn't enough
3.2 Semantic Supervision
- Using VLMs to auto-label petabytes of sensor data
- Bootstrapping low-level policy training
- Quality validation for automated labeling
- Human-in-the-loop verification strategies
3.3 Policy Distillation
- Transferring high-level reasoning to edge hardware
- Methods for compressing foundation models
- Real-time constraints and latency budgets
- Maintaining safety guarantees during distillation
3.4 Safety-Critical Scaling
- The LISS framework for autonomous fleets
- Formalizing "Similar Miles" validation
- Hardware-in-the-loop testing at scale
- Out-of-distribution (OOD) detection and graceful degradation in production