Core Concepts
Large language models often produce overconfident, miscalibrated predictions; a collaborative multi-agent deliberation process that elicits and refines confidence assessments can improve their calibration.
Abstract
This paper proposes a method called Collaborative Calibration to improve the confidence calibration and rationalization of large language models (LLMs). The key ideas are:
Agent Selection and Stance Generation:
A diverse ensemble of "expert agents" is selected based on their calibration performance on a validation set. Each agent generates an initial answer and confidence estimate for a given input.
The initial answers are clustered into unique "stances", each with an aggregated mean confidence.
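A minimal sketch of this stance-generation step, assuming each expert agent returns an (answer, confidence) pair: answers are grouped here by a naive normalized-string match (a stand-in for whatever answer-equivalence check the method actually uses), and each resulting stance is assigned the mean confidence of its supporters. The function name and normalization are illustrative, not taken from the paper.

```python
from collections import defaultdict
from statistics import mean

def group_into_stances(agent_outputs):
    """Cluster (answer, confidence) pairs from expert agents into stances.

    agent_outputs: list of (answer: str, confidence: float in [0, 1]).
    Answers are clustered by simple normalized string equality; the actual
    method may use a stronger notion of answer equivalence.
    Returns a dict mapping each unique stance to its mean confidence.
    """
    buckets = defaultdict(list)
    for answer, confidence in agent_outputs:
        key = answer.strip().lower()  # naive normalization
        buckets[key].append(confidence)
    return {stance: mean(confs) for stance, confs in buckets.items()}

# Example: three expert agents propose answers with initial confidences.
outputs = [("Neon", 0.9), ("neon", 0.7), ("Argon", 0.6)]
print(group_into_stances(outputs))  # {'neon': 0.8, 'argon': 0.6}
```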
Group Deliberation with Rationales and Feedback:
A set of "general agents" is assigned to argue for the different stances, providing rationales and receiving feedback from other agents on the soundness, logic, clarity, and factuality of their arguments.
Observing the arguments and feedback, each agent revises their answer and confidence, generating rationales for the confidence adjustment.
The final answer is determined by majority voting, and the aggregated posterior confidence is used as the calibrated estimate.
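The aggregation step can be sketched similarly, assuming the revised (answer, confidence) pairs from the general agents are available. Majority voting selects the final answer; one plausible reading of "aggregated posterior confidence" is the mean revised confidence of the agents backing that answer, which is what the hypothetical helper below computes (the paper's exact aggregation rule may differ).

```python
from collections import Counter
from statistics import mean

def aggregate_deliberation(revised_outputs):
    """Combine post-deliberation (answer, confidence) pairs.

    revised_outputs: list of (answer: str, confidence: float) after each
    agent has observed the arguments and feedback and revised its estimate.
    The final answer is the majority-voted stance; the calibrated confidence
    is taken here as the mean posterior confidence of the agents that voted
    for it (one plausible aggregation, not necessarily the paper's rule).
    """
    answers = [a for a, _ in revised_outputs]
    final_answer, _ = Counter(answers).most_common(1)[0]
    posterior = mean(c for a, c in revised_outputs if a == final_answer)
    return final_answer, posterior

# Example: four general agents after the feedback round.
revised = [("Neon", 0.85), ("Neon", 0.9), ("Neon", 0.8), ("Argon", 0.4)]
print(aggregate_deliberation(revised))  # ('Neon', 0.85)
```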
The experiments show that Collaborative Calibration achieves calibration performance superior or comparable to previous methods across a variety of tasks, including arithmetic reasoning, factoid and knowledge-intensive QA, ambiguity resolution, and ethical reasoning, without hurting task accuracy.
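Calibration performance in this setting is typically quantified with a metric such as Expected Calibration Error (ECE), which measures the gap between stated confidence and empirical accuracy across confidence bins. The sketch below is a standard ECE implementation included for illustration only; it is not taken from the paper's code.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: the weighted average gap between mean confidence and accuracy
    within equal-width confidence bins.

    confidences: floats in [0, 1]; correct: 0/1 indicators of correctness.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece

# Well-calibrated predictions yield low ECE; overconfident ones yield high ECE.
print(expected_calibration_error([0.9, 0.8, 0.95, 0.6], [1, 0, 1, 1]))
```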
Stats
The average confidence of the expert agents' initial answers is often poorly aligned with their actual accuracy.
The majority of the agents agreed with the correct answer in the final deliberation for the SciQ question "Which element was discovered in 1898 and named after the Greek 'new'?".
For the DateUnd task, the group consensus and new observations indicated that the behavior of a compound is influenced by multiple factors, prompting an adjustment of the original confidence score.
Quotes
"Uncertainty estimation is a significant issue for current large language models that are generally poorly calibrated and over-confident, especially with reinforcement learning from human feedback (RLHF)."
"Unlike humans, whose decisions and confidences not only stem from intrinsic beliefs but can also be adjusted through daily observations, existing calibration methods for LLMs focus on estimating or eliciting individual confidence without taking full advantage of the 'Collective Wisdom': the interaction among multiple LLM agents that can collectively improve both accuracy and calibration."