QT-TDM: Enhancing Real-Time Planning in Reinforcement Learning by Combining Transformer Dynamics Model and Autoregressive Q-Learning for Improved Speed and Performance
QT-TDM is a novel model-based reinforcement learning algorithm that combines a Transformer Dynamics Model (TDM) with autoregressive Q-learning. It achieves stronger performance and sample efficiency in real-time continuous control tasks while addressing the slow inference speed often associated with TDMs.
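To make the term "autoregressive Q-learning" concrete, here is a minimal sketch of greedy action selection in which a continuous action is discretized per dimension and the dimensions are chosen one at a time. It is an illustration under stated assumptions, not the QT-TDM implementation: `autoregressive_greedy_action`, `toy_q_fn`, the bin grid, and the toy state are all hypothetical, and the toy Q-function stands in for a learned network.

```python
import numpy as np


def autoregressive_greedy_action(q_fn, state, action_dim, bins):
    """Choose one discretized action dimension at a time.

    q_fn(state, chosen, bins) returns one Q-value per bin for the next
    dimension, given the bin indices already chosen for earlier dimensions.
    """
    chosen = []
    for _ in range(action_dim):
        q_values = q_fn(state, chosen, bins)  # shape: (len(bins),)
        chosen.append(int(np.argmax(q_values)))
    # Map the chosen bin indices back to continuous action values.
    return bins[np.array(chosen)]


def toy_q_fn(state, chosen, bins):
    """Hypothetical stand-in for a learned Q-network (assumption).

    For the dimension currently being decided, it prefers the bin closest
    to the matching state entry. A real model would also condition its
    estimate on the values in `chosen`; here only their count is used.
    """
    dim = len(chosen)
    return -(bins - state[dim]) ** 2


bins = np.linspace(-1.0, 1.0, 5)   # 5 bins per action dimension
state = np.array([0.4, -0.9])      # toy 2-D state
action = autoregressive_greedy_action(toy_q_fn, state, action_dim=2, bins=bins)
print(action)
```

The point of the per-dimension loop is cost: it evaluates `action_dim * len(bins)` Q-values per decision, whereas a joint search over all discretized actions would need `len(bins) ** action_dim`.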