High-level semantic concepts are encoded linearly in the representation space of large language models, a consequence of the next-token prediction objective and the implicit bias of gradient descent.
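To make the claim concrete, the sketch below probes for a linearly encoded concept with a difference-of-means direction over contrastive prompts. It is a minimal illustration, not the method of this work: the model (GPT-2 via the Hugging Face `transformers` library), the concept (grammatical number), and the prompt pairs are all assumptions chosen for the example.

```python
# Minimal sketch: estimate a candidate concept direction and test whether
# held-out examples separate along it. All choices here are illustrative.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

def last_token_state(text: str) -> torch.Tensor:
    """Final-layer hidden state of the last token of `text`."""
    ids = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.last_hidden_state[0, -1]  # shape: (hidden_dim,)

# Contrastive prompt pairs differing only in the concept of interest (plurality).
positive = ["The cats", "The dogs", "The birds"]   # plural
negative = ["The cat", "The dog", "The bird"]      # singular

# Difference-of-means estimate of a "plurality" direction.
pos_mean = torch.stack([last_token_state(t) for t in positive]).mean(0)
neg_mean = torch.stack([last_token_state(t) for t in negative]).mean(0)
direction = pos_mean - neg_mean
direction = direction / direction.norm()

# If the concept is encoded linearly, held-out examples should separate
# by their scalar projection onto this single direction.
for text in ["The horses", "The horse"]:
    score = torch.dot(last_token_state(text), direction).item()
    print(f"{text!r}: projection = {score:+.3f}")
```

A positive projection for the plural prompt and a lower (or negative) one for the singular prompt would be consistent with the concept occupying a single linear direction; a linear probe trained on many such pairs is the standard, more rigorous version of the same test.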