What Subcircuits Enable Induction Head Formation in Transformers?
Multiple interacting subcircuits, including previous-token attention and copying, query-key matching, and label copying, causally drive the formation of induction heads in transformers.
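
As a rough illustration of how these subcircuits could fit together, the behavior of an induction head can be mimicked with plain list operations. This is only a toy sketch under simplifying assumptions: exact token equality stands in for learned query-key similarity, and the function names (previous_token_features, match_positions, copy_label, induction_predict) are hypothetical, not taken from the source or from any library.

```python
def previous_token_features(tokens):
    """Toy stand-in for previous-token attending and copying:
    each position stores the token that came just before it."""
    return [None] + list(tokens[:-1])


def match_positions(tokens, prev_features):
    """Toy stand-in for query-key matching: the final token acts as the
    query and matches every position whose stored previous token equals it."""
    query = tokens[-1]
    return [i for i, prev in enumerate(prev_features) if prev == query]


def copy_label(tokens, matches):
    """Toy stand-in for label copying: output the token found at the
    most recent matching position, or None if nothing matched."""
    if not matches:
        return None
    return tokens[matches[-1]]


def induction_predict(tokens):
    """Chain the three toy steps to guess the next token."""
    if not tokens:
        return None
    prev = previous_token_features(tokens)
    matches = match_positions(tokens, prev)
    return copy_label(tokens, matches)


if __name__ == "__main__":
    # The context ends in "A"; earlier, "A" was followed by "B".
    print(induction_predict(["A", "B", "C", "A"]))
```

In a trained transformer each step would be carried out by attention weights and learned projections rather than exact comparisons, so the sketch shows only the flow of information, not how the circuit forms during training.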