Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V: Enabling Adaptive Reasoning and Failure Recovery for Real-World Robot Tasks
COME-robot, a closed-loop framework that integrates the vision-language model GPT-4V with a library of robust robotic primitives, enables open-vocabulary mobile manipulation in real-world environments through active perception, situated commonsense reasoning, and adaptive failure recovery.
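
To make the closed-loop idea concrete, the following is a minimal sketch of a plan, execute, observe, and replan cycle with failure recovery. It is not the COME-robot implementation: `run_closed_loop`, `StepResult`, `stub_planner`, and the `grasp`/`place` primitives are all hypothetical names, and the rule-based planner merely stands in for a query to a vision-language model such as GPT-4V.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class StepResult:
    ok: bool        # whether the primitive reported success
    feedback: str   # textual observation fed back to the planner


Action = Tuple[str, str]                  # (primitive name, argument)
History = List[Tuple[str, str, StepResult]]
Planner = Callable[[History], Optional[Action]]


def run_closed_loop(plan_next: Planner,
                    primitives: Dict[str, Callable[[str], StepResult]],
                    max_steps: int = 10) -> History:
    """Repeatedly ask the planner for an action, execute it, and feed the
    outcome back so the planner can react to failures."""
    history: History = []
    for _ in range(max_steps):
        action = plan_next(history)
        if action is None:                # planner signals the task is done
            break
        name, arg = action
        result = primitives[name](arg)
        history.append((name, arg, result))
    return history


def make_flaky_grasp() -> Callable[[str], StepResult]:
    """Hypothetical grasp primitive that fails on its first attempt."""
    attempts = {"n": 0}

    def grasp(target: str) -> StepResult:
        attempts["n"] += 1
        if attempts["n"] == 1:
            return StepResult(False, f"{target} slipped")
        return StepResult(True, f"holding {target}")

    return grasp


def place(target: str) -> StepResult:
    """Hypothetical place primitive that always succeeds."""
    return StepResult(True, f"placed on {target}")


def stub_planner(history: History) -> Optional[Action]:
    """Rule-based stand-in for a vision-language model planner (assumption)."""
    if not history:
        return ("grasp", "cup")
    name, arg, result = history[-1]
    if not result.ok:
        return (name, arg)                # recover by retrying the failed step
    if name == "grasp":
        return ("place", "table")
    return None                           # nothing left to do


if __name__ == "__main__":
    primitives = {"grasp": make_flaky_grasp(), "place": place}
    for name, arg, result in run_closed_loop(stub_planner, primitives):
        print(name, arg, result.ok, result.feedback)
```

The point of the sketch is that the planner sees the outcome of every primitive before choosing the next one, so a failed grasp leads to a recovery action rather than a blind continuation of a fixed plan.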