The decision maker aims to learn a treatment assignment policy that maximizes the equilibrium policy value in the presence of strategic behavior and capacity constraints.
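
One way to make this objective concrete is sketched below. The notation is assumed for illustration and is not taken from the source: $\theta \in \Theta$ indexes the assignment policy, $q \in (0,1)$ is the capacity (the fraction of agents who can be treated), $s$ is the score threshold above which an agent is treated, $V(\theta, s)$ is the policy value when agents best-respond to $(\theta, s)$, and $F_{\theta,s}$ is the distribution function of the scores induced by those best responses (assumed continuous).

\[
\theta^\star \in \arg\max_{\theta \in \Theta} V_{\mathrm{eq}}(\theta),
\qquad
V_{\mathrm{eq}}(\theta) = V\bigl(\theta, s(\theta)\bigr),
\qquad
\text{where } s(\theta) \text{ solves } 1 - F_{\theta,s}(s) = q .
\]

Under this reading, the threshold is not a free choice: it is pinned down by a fixed-point condition, namely that the share of best-responding agents who clear the threshold equals the capacity. The sketch further assumes that such a fixed point exists and is unique for each $\theta$.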