Assignment: The Technical Paper Audit
Weight: 20% (Document) + 10% (Presentation)
Initial Draft Due: Exactly 7 days prior to your scheduled presentation date.
Mastery Deadline: Sunday, April 19, 11:59 PM MST.
1. The Philosophy: Audit, Not Summary
In this course, we do not write "summaries." A summary is a transient document that parrots the authors' abstract. An Audit is a durable engineering resource intended for the vlm-robotics.dev living textbook.
As a technical auditor, your job is to pressure-test the paper’s claims from a first-principles perspective. You are looking for "Information Decay": the points where the model’s architectural choices sacrifice physical reality for computational ease. If this paper were a technical proposal at an organization like Zoox or Tesla, would you sign off on it? Why or why not?
2. The Mastery Loop (Grading & Feedback)
This assignment utilizes a Mastery-Based Feedback Loop. Your grade reflects the technical authority and durability of your documentation.
- The Review Candidate (T-minus 7 Days): Submit a draft of your MDX document via a Pull Request (PR) to the staging branch.
- Instructor Review (IR): I will provide a technical review focusing on mathematical rigor, architectural clarity, and the depth of your critique.
- The Presentation: Deliver a "presentationified" version of your audit to the class.
- The Improvement Cycle: You are expected to iterate on your MDX document throughout the semester based on IR and class feedback.
- Level 1 (C): A correct summary with basic formatting.
- Level 2 (B): High technical depth, correct LaTeX, and a sound critique of the model.
- Level 3 (A - Mastery): "Merge-Ready." The document is of such high quality that it is merged into the main branch of the living textbook as a canonical reference.
3. Technical Requirements
Your audit must be authored in MDX and include the following sections.
I. Formal Architecture
You must formalize the model’s core logic using LaTeX. Do not simply describe the objective; derive it.
- Define the state-space and the latent transitions.
- Explicitly detail the loss function L and any auxiliary objectives (e.g., contrastive or reconstruction losses).
- Example: Define the joint probability for an autoregressive VLA:
$$P(a_{1:T} \mid v_{1:T}, l) = \prod_{t=1}^{T} P(a_t \mid v_{\le t}, a_{<t}, l)$$
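To illustrate what "derive it" can look like, the factorization above leads directly to a maximum-likelihood training objective by taking the negative log. This is a minimal sketch, not a required form: it assumes discrete action tokens, model parameters $\theta$, and a demonstration dataset $\mathcal{D}$, none of which are specified by this assignment and all of which depend on your paper.

```latex
\begin{aligned}
\mathcal{L}(\theta)
  &= -\,\mathbb{E}_{(v_{1:T},\, a_{1:T},\, l) \sim \mathcal{D}}
     \left[ \log P_\theta(a_{1:T} \mid v_{1:T}, l) \right] \\
  &= -\,\mathbb{E}_{(v_{1:T},\, a_{1:T},\, l) \sim \mathcal{D}}
     \left[ \sum_{t=1}^{T} \log P_\theta(a_t \mid v_{\le t}, a_{<t}, l) \right]
\end{aligned}
```

The second line follows because the log of a product is a sum of logs, which yields a per-timestep cross-entropy term. If your paper uses continuous actions (e.g., diffusion or flow-matching heads), the objective will differ and should be derived from that paper's own definitions.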
II. The Instructor Audit Criteria
Address the following engineering bottlenecks:
- Dimensionality & Information Decay: Where is the bottleneck? Does the projection layer (v→z) discard critical spatial geometry (e.g., depth, texture, or contact physics) needed for the task?
- Compute & Inference Reality: Estimate the FLOPs for a single forward pass. At what frequency (Hz) can this model realistically run on the target hardware (e.g., an NVIDIA Orin on the robot, or an H100 off-board)?
- The Semantic-Motor Gap: Identify exactly where the model's "high-level reasoning" (the LLM) fails to ground in "low-level motor primitives" (the policy).
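As a starting point for the Compute & Inference Reality estimate, the sketch below uses the common rule of thumb that a dense transformer forward pass costs roughly 2 FLOPs per parameter per token (attention FLOPs are ignored). The function names and every numeric value are hypothetical placeholders, not specifications of any real model or device; substitute figures from your paper and from the hardware vendor's datasheet.

```python
def forward_flops(n_params: float, n_tokens: int) -> float:
    """Approximate forward-pass FLOPs for a dense transformer.

    Rule of thumb: ~2 FLOPs (one multiply, one add) per parameter per token.
    Attention-score FLOPs are ignored, so this undercounts for long contexts.
    """
    return 2.0 * n_params * n_tokens


def control_rate_hz(flops_per_pass: float,
                    peak_flops_per_s: float,
                    utilization: float) -> float:
    """Upper bound on forward passes per second at a given hardware utilization."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return peak_flops_per_s * utilization / flops_per_pass


# Hypothetical inputs (assumptions for illustration only).
n_params = 7e9        # model parameters
n_tokens = 300        # image + language tokens processed per control step
peak = 100e12         # device peak throughput in FLOP/s
utilization = 0.3     # fraction of peak actually achieved

flops = forward_flops(n_params, n_tokens)
hz = control_rate_hz(flops, peak, utilization)
print(f"{flops:.2e} FLOPs per pass, ~{hz:.1f} Hz")
```

Treat the result as an optimistic ceiling. Autoregressive decoding of action tokens is usually limited by memory bandwidth rather than compute, so a credible audit should also account for bytes moved per token and compare against any latency the paper actually reports.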
III. Failure Mode Analysis
Document the "Obvious Bugs." Identify 2–3 scenarios where the model’s logic is likely to diverge from physical reality (e.g., handling transparent objects, high-velocity dynamic obstacles, or long-horizon causal chains).
4. Suggested Paper List by Focus Area
Refer to the Sign-up Sheet to claim your area and date.
Vision Foundations
- SigLIP, CLIP, Scaling Language-Free Visual Representation Learning, Scaling Laws for Robust Comparison.
Modality Alignment
- LLaVA, Prismatic VLMs, Qwen3-VL.
Action Tokenization
Scaling & Reasoning
- GRPO, DeepSeek, Cosmos-Reason.
World Models
- GAIA, Cosmos, Genie3, TesserAct.
Navigation
Manipulation
- Octo, RT-X, Pi-0.5, ManipDreamer.
Autonomous Driving
- Alpamayo-R1, EMMA, AlphaDrive.
Safety & Uncertainty
- Foundation Models for Rapid Autonomy Validation, SAFE.
Long-Horizon Planning
- SayCan, Voyager, Reflective Planning, Long-Horizon Planning with Predictable Skills.
5. Submission Logistics
- Branching: Create a branch named audit/your-name-paper-name.
- PR to Staging: Submit your MDX file to content/textbook/audits/.
- Build Check: Your PR must pass the GitHub Action audit (no broken LaTeX or malformed frontmatter).
- Instructor Sync: Once the build is green, tag @crheckman for Instructor Review.