This paper proposes a new framework based on affordances and flow matching for robot manipulation tasks, particularly in everyday household scenarios; it effectively adapts large vision-language models to robot manipulation policy learning and generates multimodal action distributions.
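Below is a minimal, illustrative sketch of conditional flow matching for action generation, not the paper's implementation; the network `ActionFlowNet`, its dimensions, and the Euler sampler are assumptions chosen for brevity.

```python
# Illustrative sketch only: conditional flow matching over robot actions.
# ActionFlowNet, obs_dim, and act_dim are hypothetical placeholders.
import torch
import torch.nn as nn

class ActionFlowNet(nn.Module):
    """Predicts a velocity field v(x_t, t | obs) over the action space."""
    def __init__(self, obs_dim=32, act_dim=7, hidden=256):
        super().__init__()
        self.act_dim = act_dim
        self.net = nn.Sequential(
            nn.Linear(act_dim + obs_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, x_t, t, obs):
        return self.net(torch.cat([x_t, obs, t], dim=-1))

def flow_matching_loss(model, actions, obs):
    """Linear-path flow matching: regress the velocity toward (x1 - x0)."""
    x1 = actions                          # ground-truth actions
    x0 = torch.randn_like(x1)             # noise sample
    t = torch.rand(x1.shape[0], 1)        # random time in [0, 1]
    x_t = (1 - t) * x0 + t * x1           # point on the straight path
    target_v = x1 - x0                    # constant velocity of that path
    return ((model(x_t, t, obs) - target_v) ** 2).mean()

@torch.no_grad()
def sample_action(model, obs, steps=20):
    """Euler integration from noise to an action sample (multimodal by design)."""
    x = torch.randn(obs.shape[0], model.act_dim)
    for i in range(steps):
        t = torch.full((obs.shape[0], 1), i / steps)
        x = x + model(x, t, obs) / steps
    return x
```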
This paper introduces VidMan, a two-stage training framework that leverages a pre-trained video diffusion model (VDT) to improve the accuracy of robot action prediction.
VidMan enhances robot manipulation precision by leveraging video diffusion models to learn environmental dynamics and predict actions, outperforming existing methods, especially in data-limited scenarios.
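The following sketch illustrates the general two-stage idea (dynamics pre-training followed by an action head) under simplified assumptions; `DynamicsModel` and `ActionHead` are placeholder modules and do not reflect VidMan's actual video diffusion architecture.

```python
# Assumed, simplified two-stage recipe: (1) pre-train a dynamics model to
# predict future observation features, (2) train an action head on top of it.
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Stage 1: learn environment dynamics by predicting the next frame feature."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, 128, batch_first=True)
        self.predictor = nn.Linear(128, feat_dim)

    def forward(self, frames):                 # frames: (B, T, feat_dim)
        h, _ = self.encoder(frames)
        return self.predictor(h), h            # next-feature prediction, hidden states

def stage1_loss(model, frames):
    pred, _ = model(frames[:, :-1])
    return ((pred - frames[:, 1:]) ** 2).mean()   # next-feature prediction error

class ActionHead(nn.Module):
    """Stage 2: map the pre-trained dynamics representation to robot actions."""
    def __init__(self, act_dim=7):
        super().__init__()
        self.head = nn.Linear(128, act_dim)

    def forward(self, hidden):
        return self.head(hidden[:, -1])        # action from the latest state

# Usage: pre-train DynamicsModel with stage1_loss on robot videos, then fit
# ActionHead (optionally fine-tuning the dynamics model) on demonstrations.
```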
This research proposes a novel method for robots to efficiently arrange objects by strategically choosing between pick-and-place and pick-and-toss motions based on the task difficulty, which is determined by the placement environment.
VQ-ACE is a new framework that embeds human hand motions into a quantized latent space, enabling robots to learn and perform dexterous manipulation tasks more efficiently.
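As a rough illustration of the quantized latent space mentioned above, here is a minimal VQ-VAE-style quantizer with a straight-through estimator; `HandPoseQuantizer` and its hyperparameters are assumptions, not VQ-ACE's design.

```python
# Illustrative vector quantization of hand-pose embeddings (standard VQ-VAE style).
import torch
import torch.nn as nn

class HandPoseQuantizer(nn.Module):
    def __init__(self, num_codes=256, code_dim=32):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)

    def forward(self, z):                       # z: (B, code_dim) hand-pose embedding
        # Nearest codebook entry by Euclidean distance.
        dists = torch.cdist(z, self.codebook.weight)      # (B, num_codes)
        idx = dists.argmin(dim=-1)
        z_q = self.codebook(idx)
        # Straight-through estimator so the encoder still receives gradients.
        z_st = z + (z_q - z).detach()
        # Codebook + commitment losses, as in standard VQ-VAE training.
        loss = ((z_q - z.detach()) ** 2).mean() + 0.25 * ((z - z_q.detach()) ** 2).mean()
        return z_st, idx, loss
```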
By densely annotating existing robot demonstration datasets with language-grounded, object-centric manipulation skills, STEER enables robots to adapt to new situations and perform novel tasks without additional data collection or training.
This paper introduces AnyRotate, a robot system that uses dense tactile sensing to freely rotate objects in-hand regardless of the direction of gravity.
Combining Task and Motion Planning (TAMP) with imitation and reinforcement learning, SPIRE efficiently teaches robots complex, long-horizon manipulation tasks by breaking them down into smaller, learnable segments and leveraging the strengths of each learning paradigm.
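The sketch below illustrates the generic decompose-and-dispatch pattern with hypothetical interfaces, not SPIRE's actual planner or policies: free-space segments go to a motion planner, while contact-rich segments are handled by a learned policy.

```python
# Hypothetical interfaces for dispatching task segments to TAMP or a learned policy.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Segment:
    name: str
    use_planner: bool          # True: solvable by TAMP; False: contact-rich, learned

def execute_task(segments: List[Segment],
                 plan_segment: Callable[[str], bool],
                 learned_policy: Callable[[str], bool]) -> bool:
    """Runs each segment with the appropriate solver; stops on the first failure."""
    for seg in segments:
        solver = plan_segment if seg.use_planner else learned_policy
        if not solver(seg.name):
            return False
    return True

# Usage with stubbed solvers: free-space moves go to the planner, while the
# contact-rich insertion is handled by a learned policy.
task = [Segment("reach_peg", True),
        Segment("insert_peg", False),
        Segment("retract", True)]
print(execute_task(task, plan_segment=lambda s: True, learned_policy=lambda s: True))
```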
This paper introduces "Caging in Time," a novel framework for robust robot manipulation that overcomes uncertainties and limited perception by strategically sequencing robot motions to create a virtual cage, ensuring successful object manipulation even without real-time sensory feedback.