Core Concepts
Applying user-defined constraints on the format and semantics of LLM outputs can streamline prompt-based development, ease the integration of LLMs into existing workflows, satisfy product and UI requirements, and improve user trust and experience.
Abstract
The paper investigates the real-world use cases, benefits, and preferred methods for applying constraints on the outputs of large language models (LLMs). Through a survey of 51 industry professionals, the authors identified six primary categories of output constraints that users desire:
Structured Output: Ensuring the output adheres to a standardized or custom format/template (e.g., markdown, JSON, bulleted list).
Ensuring Valid JSON: Requiring the output to strictly conform to a specified JSON schema (see the validation sketch after this list).
Multiple Choice: Restricting the output to a predefined set of options (e.g., sentiment classification).
Length Constraints: Specifying the desired length of the output (e.g., number of characters/words, items in a list).
Semantic Constraints: Controlling the inclusion or exclusion of specific terms, topics, or actions in the output.
Stylistic Constraints: Directing the output to follow certain style, tone, or persona guidelines.
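A minimal sketch of how the "Ensuring Valid JSON" and "Multiple Choice" categories can be expressed as a JSON Schema and checked with the Python jsonschema package. The schema, the `label` field, and the label set are illustrative assumptions, not artifacts from the paper.

```python
# Illustrative only: validating an LLM response against a JSON Schema that
# combines the "valid JSON" and "multiple choice" constraint categories.
# The schema and the `label` field are assumptions made for this sketch.
import json
import jsonschema

SENTIMENT_SCHEMA = {
    "type": "object",
    "properties": {
        # Multiple choice: restrict the value to a predefined set of options.
        "label": {"enum": ["Positive", "Negative", "Neutral"]},
    },
    "required": ["label"],
    "additionalProperties": False,  # forbid trailing explanations or extra keys
}

def check_output(raw_output: str) -> dict:
    """Parse one LLM response and raise if either constraint is violated."""
    parsed = json.loads(raw_output)  # must be valid JSON at all
    jsonschema.validate(instance=parsed, schema=SENTIMENT_SCHEMA)  # must match the schema
    return parsed

print(check_output('{"label": "Positive"}'))   # passes
# check_output('Positive, because ...')        # raises json.JSONDecodeError
```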
The authors also found that users desire both low-level constraints (e.g., ensuring a structured format and an appropriate length) and high-level constraints (e.g., adhering to semantic and stylistic guidelines and avoiding hallucination).
Applying these constraints can offer significant benefits for both developers and end-users. For developers, it can increase prompt-based development efficiency, streamline integration with downstream processes, and reduce the need for ad-hoc post-processing logic. For end-users, it can help satisfy product and UI requirements, as well as improve trust and experience with LLM-powered features.
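As an illustration of the "ad-hoc post-processing logic" that constraint support could make unnecessary, the sketch below (my own assumption about common practice, not code from the paper) shows the kind of defensive parsing developers often write around ill-formed JSON responses.

```python
# Sketch of typical defensive post-processing for ill-formed LLM output;
# this reflects common practice, not the paper's own code.
import json
import re

def salvage_json(raw_output: str):
    """Best-effort recovery of a JSON object from a messy LLM response."""
    text = raw_output.strip()
    # Drop ```json ... ``` fences that models often wrap around the payload.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text)
    # If prose surrounds the object, keep only the outermost {...} span.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match:
        text = match.group(0)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None  # caller decides: retry, default value, surface an error, ...
```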
Regarding how users would like to articulate constraints, the survey revealed a preference for using graphical user interfaces (GUIs) for defining low-level constraints and natural language for expressing high-level constraints. GUIs are seen as more reliable, flexible, and intuitive for "objective" and "quantifiable" constraints, while natural language is preferred for complex, open-ended, or context-dependent constraints.
The authors present an early prototype of ConstraintMaker, a GUI-based tool that enables users to visually define and test output constraints. The tool automatically converts the GUI-specified constraints into a regular expression that the LLM adheres to during generation. Preliminary user feedback suggests that ConstraintMaker can help separate the concerns of task specification and output formatting, streamline the prompt engineering process, and promote a "constraint mindset" among LLM users.
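The paper does not publish ConstraintMaker's implementation, and true constrained decoding hooks into the model's sampler, so the sketch below shows only the weaker, post-hoc variant of the same idea: checking a completed output against a regular expression and retrying on mismatch. `call_llm` is a hypothetical placeholder for whatever client is in use.

```python
# Hypothetical sketch: post-hoc enforcement of a regex-shaped output constraint.
# ConstraintMaker constrains generation itself; here we only validate afterwards.
import re

MULTIPLE_CHOICE = re.compile(r"(Positive|Negative|Neutral)\.?")

def generate_constrained(prompt: str, call_llm, max_attempts: int = 3) -> str:
    """Call a placeholder LLM client until its output matches the constraint regex."""
    for _ in range(max_attempts):
        output = call_llm(prompt).strip()
        if MULTIPLE_CHOICE.fullmatch(output):  # whole output must match the pattern
            return output
        # Nudge the model toward the expected shape before retrying.
        prompt += "\nAnswer with exactly one of: Positive, Negative, Neutral."
    raise ValueError("no constraint-satisfying output after retries")
```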
Stats
"To integrate them into current developer workflows, it is essential to constrain their outputs to follow specific formats or standards."
"Critically, applying output constraints could not only streamline the currently repetitive process of developing, testing, and integrating LLM prompts for developers, but also enhance the user experience of LLM-powered features and applications."
"Developers often have to write complex code to handle ill-formed LLM outputs, a chore that could be simplified or eliminated if LLMs could strictly follow output constraints."
"Being able to constrain length can help LLMs comply with specific platform character restrictions, like tweets capped at 280 characters or YouTube Shorts titles limited to 100 characters."
Quotes
"I expect the quiz [that the LLM makes given a few passages provided below] to have 1 correct answer and 3 incorrect ones. I want to have the output to be like a json with keys {"question": "...", "correct_answer": "...", "incorrect_answers": [...]}"."
"[classifying sentiments as] Positive, Negative, Neutral, etc.," respondents typically expect the model to only output the classification result (e.g. "Positive.") without a trailing "explanation" (e.g., "Positive, since it referred to the movie as a 'timeless masterpiece'...")."
"[for] 'please annotate this method with debug statements', I'd like the output to ONLY include changes that add print statements... No other changes in syntax should be made."