Core Concepts
Cross-modal transfer learning improves the accuracy of recognizing latent object characteristics.
Summary
This study addresses the recognition of hidden object characteristics in robotic manipulation tasks using a two-phase cross-modal transfer learning approach. In the first phase, a vision module is trained to observe object characteristics directly. In the second phase, haptic-audio and motor data are used for indirect sensing. Transferring the latent space learned from vision to the haptic-audio module improves recognition accuracy for the shape, position, and orientation of objects inside containers. The study demonstrates online recognition of both trained and untrained objects on a humanoid robot, and its experiments and evaluations indicate that the method is effective and potentially applicable to enhancing robotic perception and manipulation.
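The two-phase idea can be illustrated with a small toy sketch. This is not the study's implementation: it assumes synthetic data, uses PCA as a stand-in for the vision module, and uses ridge regression as a stand-in for the haptic-audio module. All names (`fit_vision_encoder`, `fit_haptic_encoder`, `make_data`), the feature sizes, and the noise level are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_vision_encoder(images, latent_dim):
    """Phase 1 stand-in: PCA used as a linear autoencoder on flattened images."""
    mean = images.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    components = vt[:latent_dim]
    return lambda x: (x - mean) @ components.T


def fit_haptic_encoder(features, target_latents, ridge=1e-3):
    """Phase 2 stand-in: ridge regression onto the frozen vision latents."""
    def with_bias(f):
        return np.hstack([f, np.ones((len(f), 1))])

    x = with_bias(features)
    gram = x.T @ x + ridge * np.eye(x.shape[1])
    weights = np.linalg.solve(gram, x.T @ target_latents)
    return lambda f: with_bias(f) @ weights


def make_data(rng, n, to_image, to_haptic, noise=0.01):
    """Synthetic samples: both modalities are noisy views of the same hidden state."""
    hidden = rng.normal(size=(n, to_image.shape[0]))
    images = hidden @ to_image + noise * rng.normal(size=(n, to_image.shape[1]))
    haptic = hidden @ to_haptic + noise * rng.normal(size=(n, to_haptic.shape[1]))
    return images, haptic


rng = np.random.default_rng(0)
latent_dim = 3                                   # assumed latent size
to_image = rng.normal(size=(latent_dim, 64))     # assumed image feature size
to_haptic = rng.normal(size=(latent_dim, 16))    # assumed haptic-audio feature size
train_images, train_haptic = make_data(rng, 270, to_image, to_haptic)
test_images, test_haptic = make_data(rng, 135, to_image, to_haptic)

# Phase 1: learn the latent space from vision only.
vision_encode = fit_vision_encoder(train_images, latent_dim)
train_latents = vision_encode(train_images)

# Phase 2: train the indirect-sensing model to reproduce those latents.
haptic_encode = fit_haptic_encoder(train_haptic, train_latents)

# At test time only haptic-audio features are needed; vision is used here
# solely to measure how well the latent space transferred.
test_latents = vision_encode(test_images)
transfer_mse = float(np.mean((haptic_encode(test_haptic) - test_latents) ** 2))
latent_variance = float(np.var(test_latents))
print(f"held-out transfer MSE: {transfer_mse:.5f} (latent variance: {latent_variance:.2f})")
```

The design point the sketch tries to capture is that the vision latents are fixed after phase 1 and serve as regression targets in phase 2, so the second module can be queried without any camera input. A real system would presumably replace both stand-ins with neural networks operating on image and sensor sequences.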
Stats
We train this module for 5,000 epochs until the training error converges.
We train this module for 20,000 epochs using the Adam optimizer until the error converges.
We collected 270 images for training the first module with 30 images captured for each of the 9 training objects.
We recorded sequential data from tactile sensors, force-torque sensors, microphones, and end-effector configurations at a frequency of 50 Hz.
For testing the second module, we utilized 135 different sequential datasets with 15 datasets recorded for each of the 9 objects.
Quotes
"Recognising latent object characteristics using cross-modal transfer learning improves accuracy."
"Our experiments show that the proposed method outperforms the baseline approach."
"The proposed model exhibits generalization capabilities successfully recognizing untrained objects."