Core Concepts
Modeling human suboptimality as a function of state can improve the accuracy of human behavior models and enable more effective human-robot collaboration.
Abstract
The paper proposes "Boltzmann State-Dependent Rationality" (BSDR), a model that extends standard Boltzmann Rationality (BR) by letting the suboptimality parameter β vary as a function of the state rather than stay constant. This added expressivity lets the model capture systematic patterns of human suboptimality that a single, state-independent β cannot.
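To make the difference concrete, the two policies can be written side by side. This is a sketch under an assumption: it uses the common Boltzmann convention in which β multiplies the action values Q(s, a) inside a softmax. The summary does not state the paper's exact formulation, so the symbols Q and a' are illustrative rather than taken from the paper.

```latex
\pi_{\mathrm{BR}}(a \mid s)
  = \frac{\exp\big(\beta \, Q(s,a)\big)}{\sum_{a'} \exp\big(\beta \, Q(s,a')\big)},
\qquad
\pi_{\mathrm{BSDR}}(a \mid s)
  = \frac{\exp\big(\beta(s) \, Q(s,a)\big)}{\sum_{a'} \exp\big(\beta(s) \, Q(s,a')\big)}
```

Under this convention, a large β(s) concentrates probability on the best action in state s, while β(s) near zero makes the choice close to uniform, which is how a state can be marked as "harder" for the human.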
The key insights are:
Suboptimality can be state-dependent, with certain states being more challenging for humans than others. Modeling this can lead to more accurate human behavior models.
The BSDR model introduces a state-dependent suboptimality function β(s) parametrized by θβ, in addition to the reward function parametrized by θR. This allows the model to capture both the human's objective and their systematic deviations from optimality.
The paper outlines several experiments to validate the BSDR model: parameter recovery, generalization of learned rewards, and use of the suboptimality model for goal inference. However, the authors note that they were unable to run these experiments, so no results are reported.
The paper discusses potential future directions, such as learning latent representations of states/actions, exploring more complex suboptimality models, and leveraging the human models for improved human-robot collaboration.
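The state-dependent policy described in the insights above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes a small tabular setting in which the action values `q_values` have already been computed from the reward (parametrized by θR), and it uses a hypothetical parametrization β(s) = softplus(φ(s)·θβ) with made-up features φ. All function names, the feature matrix, and the numbers are assumptions introduced for this example.

```python
import numpy as np


def boltzmann_policy(q_values, beta):
    """Softmax policy over actions.

    q_values: array of shape (S, A) with action values per state.
    beta: scalar (standard BR) or array of shape (S,) (state-dependent).
    Returns an (S, A) array of action probabilities.
    """
    num_states = q_values.shape[0]
    beta = np.broadcast_to(np.asarray(beta, dtype=float), (num_states,))
    logits = beta[:, None] * q_values
    # Subtract the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def state_dependent_beta(features, theta_beta):
    """Hypothetical parametrization: beta(s) = softplus(features[s] @ theta_beta).

    The softplus keeps beta(s) strictly positive.
    """
    return np.logaddexp(0.0, features @ theta_beta)


def log_likelihood(q_values, beta, states, actions):
    """Log-likelihood of observed (state, action) pairs under the policy."""
    probs = boltzmann_policy(q_values, beta)
    return float(np.log(probs[states, actions]).sum())


# Toy problem: 3 states, 2 actions (all values are illustrative).
q_values = np.array([[1.0, 0.0],
                     [0.0, 2.0],
                     [0.5, 0.5]])
features = np.eye(3)                      # one-hot state features (assumption)
theta_beta = np.array([2.0, -2.0, 0.0])   # state 0 "easy", state 1 "hard"

beta_s = state_dependent_beta(features, theta_beta)
policy = boltzmann_policy(q_values, beta_s)

# Demonstrations as parallel arrays of state and action indices.
demo_states = np.array([0, 0, 1, 2])
demo_actions = np.array([0, 0, 1, 1])
ll_bsdr = log_likelihood(q_values, beta_s, demo_states, demo_actions)
ll_br = log_likelihood(q_values, 1.0, demo_states, demo_actions)
```

In a learning setup, `theta_beta` (and the reward parameters behind `q_values`) would be fit by maximizing `log_likelihood` over demonstrations. Passing a scalar `beta` recovers the standard BR model, so the two can be compared on the same data.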