The paper proposes a novel reinforcement learning algorithm called Cognitive Belief-Driven Q-Learning (CBDQ) that integrates principles from cognitive science to improve the decision-making capabilities of reinforcement learning agents.
Key highlights:
Subjective Belief Component: CBDQ models the agent's subjective beliefs about the expected outcomes of actions, drawing inspiration from Subjective Expected Utility Theory. This allows the agent to reason probabilistically about potential decisions, mitigating overestimation issues in traditional Q-learning.
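To make the belief-weighted idea concrete, here is a minimal sketch in which the subjective belief is approximated by a softmax distribution over next-state Q-values, so the TD target is an expectation under the belief rather than a hard max (a known way to damp overestimation). This is an illustrative stand-in, not the paper's exact update rule; the names `belief_weights` and `belief_q_update`, the softmax form, and the temperature `tau` are assumptions.

```python
import numpy as np

def belief_weights(q_row, tau=1.0):
    """Softmax belief over actions in one state (illustrative stand-in
    for the paper's subjective-belief distribution)."""
    z = (q_row - q_row.max()) / tau          # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def belief_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99, tau=1.0):
    """One TD update whose target is a belief-weighted expectation over
    next-state values instead of a hard max, damping overestimation."""
    b = belief_weights(Q[s_next], tau)
    target = r + gamma * np.dot(b, Q[s_next])   # E_b[Q(s', .)]
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```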
Human Cognitive Clusters: The algorithm uses clustering techniques, such as K-means, to partition the state space into meaningful representations, emulating how humans categorize information. This enables efficient state abstraction and decision-making in complex environments.
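A small sketch of this kind of state abstraction, using scikit-learn's K-means as one plausible implementation: raw continuous observations are assigned to clusters, and each cluster indexes a row of a tabular Q-function. The dataset, cluster count, and the helper `abstract_state` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical setup: logged 4-dimensional observations are grouped into
# discrete "cognitive clusters" that serve as abstract states.
observations = np.random.rand(5000, 4)
n_clusters, n_actions = 32, 2

kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(observations)
Q = np.zeros((n_clusters, n_actions))        # one Q-row per abstract state

def abstract_state(obs):
    """Map a raw observation to its cluster id, i.e. its abstract state."""
    return int(kmeans.predict(obs.reshape(1, -1))[0])
```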
Belief-Preference Decision Framework (BPDF): CBDQ integrates the subjective belief model and cognitive clusters into a unified decision-making process. BPDF allows the agent to balance immediate rewards and long-term preferences, adapting its decision-making strategy as it accumulates experience, similar to human cognition.
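One way such a balance could look in code is sketched below: action choice blends a value-driven belief with a long-term preference prior, leaning more on learned values as experience accumulates. The function name `bpdf_action`, the mixing schedule, and the constant in it are assumptions for illustration; the paper's actual BPDF formulation may differ.

```python
import numpy as np

def bpdf_action(q_row, preference, visit_count, temperature=1.0):
    """Illustrative belief-preference action choice: blend the softmax
    belief over Q-values with a preference prior (a probability vector
    over actions), weighting learned values more as experience grows."""
    belief = np.exp((q_row - q_row.max()) / temperature)
    belief /= belief.sum()
    w = visit_count / (visit_count + 10.0)   # assumed experience schedule
    mixed = w * belief + (1.0 - w) * preference
    return int(np.argmax(mixed))
```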
The authors evaluate CBDQ on discrete control benchmarks and complex traffic-simulation environments, reporting higher cumulative rewards, better adaptability, and more human-like decision-making than traditional Q-learning variants and Proximal Policy Optimization (PPO).
The paper highlights the potential of incorporating cognitive science principles into reinforcement learning to develop more intelligent, robust, and human-like decision-making systems.
Key insights extracted from: Xingrui Gu, ... at arxiv.org, 10-03-2024
https://arxiv.org/pdf/2410.01739.pdf