Core Concepts
SENSOR proposes an active vision framework to automatically adjust the agent's perspective to match the expert's, enabling efficient imitation of expert behaviors in third-person visual imitation learning tasks.
Abstract
The paper introduces SENSOR, a model-based algorithm for third-person visual imitation learning that leverages active sensoring to address the perspective mismatch between the agent and the expert.
Key highlights:
- Previous domain alignment methods struggle to handle large viewpoint gaps between the agent and the expert, as they fail to completely remove domain information from the learned representations.
- SENSOR introduces an active vision framework that jointly learns a world model, a sensor policy to control the camera, and a motor policy to control the agent's actions.
- The sensor policy automatically adjusts the agent's viewpoint to match the expert's, effectively reducing the third-person imitation learning problem to a simple imitation learning case.
- SENSOR also incorporates a discriminator ensemble and an adaptive exploration-exploitation reward function to stabilize the learning process and improve performance.
- Experiments on visual locomotion tasks demonstrate that SENSOR outperforms existing methods in terms of both performance and stability, especially in cases with large initial viewpoint differences.
- The paper also analyzes the limitations of decoupling motor and sensor dynamics, showing that they cannot be completely separated due to the interdependence between the agent's actions and the camera viewpoint.
Stats
The agent's initial viewpoint is specified by a tuple (d, a, e), where d is the distance from the camera to the target point, a is the horizontal angle, and e is the vertical angle relative to the target.
The expert's viewpoint is fixed at (3, 90, -45).
Quotes
"To the best of our knowledge, we are the first to introduce active sensoring in the visual IL setting to tackle IL problems from different viewpoints."
"We provide insights into understanding domain alignment methods by quantifying the task's difficulty with mutual information."
"We theoretically analyze the limitations of decoupled dynamics."