Leveraging Value Discrepancy and State Counts optimizes exploration timing in Deep Reinforcement Learning.