Large language models are tested for their ability to score written essays, revealing insights into their performance and potential for providing feedback.