Making Robots Meet Your Gaze

The model generates gaze-shifting motions in two stages. First, a conditional VQ-VAE reconstructs incremental eye and head rotations from observed motions and contextual inputs. Second, a conditional prior predicts a distribution over codebook entries: during training, the most probable code is selected to minimize loss, while at run time codes are sampled stochastically to produce nuanced behavioral diversity.
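The train-versus-runtime distinction in code selection can be illustrated with a minimal sketch. This is not the paper's implementation: the codebook values, logits, and the `select_code` helper are all hypothetical, standing in for a learned conditional prior over VQ-VAE codebook entries. It shows only the selection rule: deterministic argmax during training, categorical sampling at run time.

```python
import math
import random

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_code(logits, mode, rng=random):
    """Pick a codebook index from the prior's logits.

    mode='train'  -> deterministic argmax (most probable code),
    mode='sample' -> stochastic draw, yielding varied gaze behavior.
    """
    if mode == "train":
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits)
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off

# Hypothetical 4-entry codebook of incremental (eye, head) rotations in degrees.
codebook = [(0.0, 0.0), (2.0, 0.5), (-2.0, -0.5), (5.0, 1.5)]

logits = [0.1, 2.0, 0.3, -1.0]  # stand-in for the conditional prior's output
train_idx = select_code(logits, "train")                  # always index 1 here
sample_idx = select_code(logits, "sample", random.Random(0))  # varies with seed
print(codebook[train_idx])  # → (2.0, 0.5)
```

In practice the stochastic branch is what keeps repeated gaze shifts from looking identical, while the deterministic branch gives a stable training target.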

New research details a framework for building more natural human-robot interactions by enabling robots to shift their gaze in a way that mimics human behavior.