VLA Stack

Vision-Language-Action for Robotics

The Vision-Language-Action Stack

A living reference on foundational architectures, rigorous validation strategies, and deployment practices for robot foundation models.

The VLA Control Loop

Perception (Scene Encoding): s ∈ S

VLM Backbone (Reasoning): P(a | s, l)

Action (Control Policies): a ∈ A

a* = arg max_a P(a | s, l)
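The selection rule above can be illustrated with a minimal Python sketch. It assumes a small discrete action set and reads l as the language instruction; both are assumptions made for illustration, since deployed VLA models typically emit continuous or tokenized actions. The names `select_action`, `toy_score`, and `ACTIONS` are hypothetical, and the hand-written scorer merely stands in for a VLM backbone.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(
    state: dict,
    instruction: str,
    actions: Sequence[str],
    score_fn: Callable[[dict, str, str], float],
) -> Tuple[str, List[float]]:
    """Return a* = arg max_a P(a | s, l) together with the full distribution."""
    scores = [score_fn(state, instruction, a) for a in actions]
    probs = softmax(scores)  # plays the role of P(a | s, l)
    best = max(range(len(actions)), key=probs.__getitem__)
    return actions[best], probs


def toy_score(state: dict, instruction: str, action: str) -> float:
    """Hypothetical stand-in for a VLM backbone: a hand-written heuristic."""
    score = 0.0
    if action in instruction:  # the instruction names this action
        score += 2.0
    if action == "grasp" and state["gripper_closed"]:
        score -= 3.0  # grasping again with a closed gripper is unlikely
    return score


ACTIONS = ["reach", "grasp", "lift", "release"]  # assumed discrete action set A
state = {"gripper_closed": False}                # assumed encoded scene s
best_action, probs = select_action(state, "grasp the red block", ACTIONS, toy_score)
print(best_action, [round(p, 3) for p in probs])
```

Returning the full distribution alongside the arg max lets a caller check confidence before executing; a real controller might fall back to a safe action when the top probability is low.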

Foundational Pillars