MemoNav: Working Memory Model for Visual Navigation
Core Concepts
MemoNav uses a pipeline inspired by human working memory to improve image-goal navigation. It combines short-term memory (STM), long-term memory (LTM), and working memory (WM) so that the agent can reach image-specified goals efficiently.
Abstract
MemoNav introduces a memory model for image-goal navigation built on three components: STM, LTM, and WM. A forgetting module retains only the informative STM features, the LTM learns scene-level representations, and the WM generates goal-relevant features for efficient navigation. Experimental results show that MemoNav outperforms previous methods in multi-goal tasks.
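The forgetting step described above can be sketched as a simple filter over STM features. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `forget_stm`, the list-based feature layout, and the example threshold of 0.2 are all hypothetical, and the attention scores are assumed to be supplied by some upstream module.

```python
from typing import List, Sequence, Tuple


def forget_stm(
    stm_features: Sequence[Sequence[float]],
    attention_scores: Sequence[float],
    threshold: float,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Keep STM features whose attention score is at least `threshold`.

    Hypothetical helper: features scoring below the threshold are dropped,
    and the indices of the retained features are returned alongside them.
    """
    if len(stm_features) != len(attention_scores):
        raise ValueError("need exactly one attention score per STM feature")
    kept = [i for i, score in enumerate(attention_scores) if score >= threshold]
    return [stm_features[i] for i in kept], kept


# Illustrative values only: four STM node features and their attention scores.
stm = [[0.1, 0.2], [0.9, 0.4], [0.3, 0.3], [0.7, 0.8]]
scores = [0.05, 0.40, 0.10, 0.45]
retained, kept_indices = forget_stm(stm, scores, threshold=0.2)
```

With these example values, the second and fourth features are retained, because their scores are the only ones at or above the assumed threshold.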
Stats
MemoNav significantly outperforms previous methods in multi-goal tasks across all difficulty levels.
MemoNav exhibits higher success rates compared to existing methods on both Gibson and Matterport3D scenes.
The forgetting module retains the informative fraction of STM by discarding features whose attention scores fall below a predefined threshold.
The LTM facilitates feature fusion and scene-level representation learning.
The WM is generated by encoding retained STM and LTM using GATv2 for adaptive weighting in action generation.
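To make the adaptive weighting concrete, the following sketch applies GATv2-style attention from one query vector to a set of memory features (retained STM plus LTM). It is a simplified, single-head illustration under assumptions, not MemoNav's architecture: the names `gatv2_aggregate`, `query`, `weight`, and `attn` are hypothetical, the parameters are hand-picked rather than learned, and the memory features are aggregated without the value projection a full GATv2 layer would apply. The scoring rule itself, a^T LeakyReLU(W [h_i || h_j]), follows the GATv2 formulation.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def leaky_relu(vector: Vector, slope: float = 0.2) -> Vector:
    """Element-wise LeakyReLU."""
    return [x if x > 0 else slope * x for x in vector]


def softmax(logits: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gatv2_aggregate(
    query: Vector, memory: List[Vector], weight: Matrix, attn: Vector
) -> Tuple[Vector, Vector]:
    """Score each memory feature against the query and return their weighted sum.

    Score for feature j: attn . LeakyReLU(weight @ [query || memory[j]]).
    """
    logits = []
    for feature in memory:
        hidden = leaky_relu(matvec(weight, query + feature))  # `+` concatenates lists
        logits.append(sum(a * h for a, h in zip(attn, hidden)))
    alphas = softmax(logits)
    dim = len(memory[0])
    fused = [sum(a * f[d] for a, f in zip(alphas, memory)) for d in range(dim)]
    return fused, alphas


# Illustrative values only: two retained STM features plus one LTM feature.
memory = [[0.9, 0.4], [0.7, 0.8], [0.5, 0.5]]
query = [1.0, 0.0]
weight = [[0.5, -0.2, 0.3, 0.1], [0.0, 0.4, -0.1, 0.6], [0.2, 0.2, 0.2, 0.2]]
attn = [1.0, -0.5, 0.3]
fused, alphas = gatv2_aggregate(query, memory, weight, attn)
```

The attention weights sum to one, so `fused` is a convex combination of the memory features. A learned version would train `weight` and `attn` jointly with the navigation policy.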
Quotes
"MemoNav significantly outperforms previous methods across all difficulty levels in both Gibson and Matterport3D scenes."