RL3 enhances long-term performance and out-of-distribution generalization in meta reinforcement learning by incorporating Q-value estimates from traditional RL.
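
As a rough illustration of what "incorporating Q-value estimates" could look like, the sketch below maintains a per-task Q-table and appends the current state's Q-values to the observation before it reaches the meta-learner. The summary above does not say how the estimates are computed or injected, so the tabular Q-learning update, the concatenation step, and the names `q_learning_update` and `augment_observation` are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def q_learning_update(q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step (assumed estimator); returns an updated copy."""
    q = q.copy()
    # Bootstrap from the best next-state action unless the episode ended.
    target = r if done else r + gamma * q[s_next].max()
    q[s, a] += alpha * (target - q[s, a])
    return q


def augment_observation(obs, q, s):
    """Concatenate the raw observation with the Q-estimates for state s."""
    return np.concatenate([obs, q[s]])


# Hypothetical toy task: 3 states, 2 actions, one observed transition.
q = np.zeros((3, 2))
q = q_learning_update(q, s=0, a=1, r=1.0, s_next=2, done=True)

obs = np.array([1.0, 0.0, 0.0])  # one-hot encoding of state 0
meta_input = augment_observation(obs, q, s=0)  # input to the meta-RL policy
```

Under these assumptions, the meta-learner sees both the raw observation and a task-specific value summary that keeps improving as more transitions are collected within the task.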