Core Concepts
The sender can improve their messaging policy by querying a simulation oracle that provides information about the receiver's optimal actions given different messaging policies. The sender's optimal querying policy can be computed efficiently using dynamic programming.
Abstract
The key insights and findings are:
In a binary Bayesian persuasion setting, the sender's optimal messaging policy uses at most two messages. One message has a threshold on the receiver's belief: receivers on one side of the threshold take action 1 and those on the other side take action 0. The other message does one of three things: it induces all receivers to take action 1, it induces no receivers to take action 1, or it has a different threshold of its own.
The sender can query a simulation oracle to gain information about the receiver's beliefs and improve their messaging policy. Each simulation query corresponds to a threshold that partitions the receiver beliefs.
The sender's optimal querying policy can be computed efficiently using dynamic programming. The algorithm first precomputes the optimal messaging policy for any range of receiver beliefs. It then iteratively builds the optimal querying policy by considering the value of each possible query given the optimal policies for smaller subsets of receiver beliefs.
The optimal querying policy can be implemented adaptively by binary searching over the set of queries identified by the dynamic program. This adaptive policy is as informative as the optimal non-adaptive policy using the same number of queries.
The results extend to settings with approximate oracles, more general query structures, and costly queries.
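The dynamic program and the adaptive binary search described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes the receiver beliefs are indexed 0..n-1 in sorted order, that a table value[i][j] already holds the sender's utility from the best messaging policy when the receiver is known to lie in the range [i, j), and that one query answers whether the receiver lies below a given threshold. The names optimal_cuts, locate_interval, weights, and payoffs, and the toy numbers, are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def optimal_cuts(value: Sequence[Sequence[float]], num_queries: int) -> Tuple[float, List[int]]:
    """Choose at most num_queries thresholds that split beliefs 0..n-1 into
    contiguous ranges with maximum total value.

    value[i][j] (0 <= i < j <= n) is an assumed, precomputed table: the sender's
    utility from the best messaging policy for the range [i, j).
    """
    n = len(value) - 1
    parts = min(num_queries + 1, n)  # k queries give at most k + 1 ranges
    neg_inf = float("-inf")
    # best[k][j]: best total value covering [0, j) with exactly k ranges.
    best = [[neg_inf] * (n + 1) for _ in range(parts + 1)]
    prev = [[-1] * (n + 1) for _ in range(parts + 1)]
    best[0][0] = 0.0
    for k in range(1, parts + 1):
        for j in range(1, n + 1):
            for i in range(k - 1, j):  # the last range is [i, j)
                if best[k - 1][i] == neg_inf:
                    continue
                cand = best[k - 1][i] + value[i][j]
                if cand > best[k][j]:
                    best[k][j] = cand
                    prev[k][j] = i
    # Using fewer ranges is allowed, so pick the best count.
    k_star = max(range(1, parts + 1), key=lambda k: best[k][n])
    cuts: List[int] = []
    j = n
    for k in range(k_star, 0, -1):  # walk the back-pointers to recover thresholds
        i = prev[k][j]
        if i > 0:
            cuts.append(i)
        j = i
    cuts.reverse()
    return best[k_star][n], cuts


def locate_interval(cuts: Sequence[int], is_below: Callable[[int], bool]) -> Tuple[int, int]:
    """Binary search over the sorted thresholds.

    is_below(c) stands in for one simulation query: True if the receiver's
    belief index is < c. Returns (range index, number of queries used).
    """
    lo, hi = 0, len(cuts)
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if is_below(cuts[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo, queries


# Toy table (purely illustrative): a range is worth its probability mass
# times the lowest per-belief payoff inside it.
weights = [0.1, 0.2, 0.3, 0.4]
payoffs = [1.0, 0.8, 0.3, 0.2]
n = len(weights)
value = [[0.0] * (n + 1) for _ in range(n + 1)]
for i in range(n):
    for j in range(i + 1, n + 1):
        value[i][j] = sum(weights[i:j]) * min(payoffs[i:j])

total, cuts = optimal_cuts(value, num_queries=1)
range_index, used = locate_interval(cuts, lambda c: 3 < c)  # true belief index is 3
```

The triple loop runs in O(K n^2) time for K queries and n beliefs once the value table is available. How that table is filled depends on the model and is left as an input here.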
Overall, the work provides a principled approach for a sender to use simulation-based queries to optimize their messaging policy in a Bayesian persuasion setting with an informed receiver.