Core Concepts
Large Language Models struggle to translate formal specifications accurately, which limits their usefulness in system design.
Stats
Existing work has evaluated the capabilities of LLMs in generating formal syntax.
Real-world systems use much higher values of k and n.
GPT-4, GPT-3.5-turbo, Mistral-7B-Instruct, and Gemini Pro were used in the evaluation.
Quotes
"Our results show that there is much to be done before LLMs can be deployed in translating formal syntax."
"LLMs struggle with the negation operator by either messing up simplifying or distributing them."
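The negation errors described in the last quote can be illustrated with a small truth-table check. This is a minimal sketch, not the evaluation used in the paper: the formula not (a and b), the two candidate rewrites, and all function names below are hypothetical examples chosen for illustration. It shows how a correctly distributed negation (De Morgan's law) can be told apart from an incorrect one by comparing both formulas on every assignment.

```python
from itertools import product


def equivalent(f, g, n_vars):
    """Return True if f and g agree on every assignment of n_vars booleans."""
    return all(
        f(*vals) == g(*vals)
        for vals in product([False, True], repeat=n_vars)
    )


# Hypothetical source formula: not (a and b)
def original(a, b):
    return not (a and b)


# Correct distribution of the negation (De Morgan): (not a) or (not b)
def distributed_correctly(a, b):
    return (not a) or (not b)


# Illustrative mistake: the negation is pushed inward but the
# connective is left unchanged: (not a) and (not b)
def distributed_incorrectly(a, b):
    return (not a) and (not b)


if __name__ == "__main__":
    print(equivalent(original, distributed_correctly, 2))
    print(equivalent(original, distributed_incorrectly, 2))
```

An exhaustive check like this is only practical for a handful of variables, since the number of assignments doubles with each one; for larger formulas a SAT or SMT solver would be the more usual choice.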