LLM4VV: Developing LLM-Driven Testsuite for Compiler Validation
Core Concepts
Large language models (LLMs) are used to automatically generate tests that validate compiler implementations of OpenACC.
Summary
The paper describes the development of a test suite that uses state-of-the-art LLMs to validate and verify compiler implementations of OpenACC. It explores prompt engineering techniques and fine-tuning methods for generating tests for high-performance computing compilers, and it highlights both the importance of accurate test generation and the potential of LLMs in this domain.
Directory:
Introduction
Discusses the capabilities of LLMs in natural language understanding and code generation.
Motivation
Highlights the challenges of validating compiler implementations and the need for automated testing.
Overview of LLMs
Describes transformer architecture, training stages, and input processing for LLMs.
Prompt Engineering Techniques
Explains specialized prompts, one-shot prompting, retrieval-augmented generation (RAG), and expressive prompts.
Fine-tuning of LLMs
Details the process of fine-tuning LLMs on domain-specific datasets for task optimization.
Reinforcement Learning from Human Feedback (RLHF)
Introduces RLHF as a training technique based on human preference feedback.
LLM Benchmarks
Compares the performance of various LLMs on relevant benchmarks.
Related Work
Summarizes previous research on LLM-based code generation for HPC tasks.
Methods
Outlines the three-stage development process for generating validation tests using LLMs.
LLM4VV
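To make the prompt engineering entry in the outline above more concrete, here is a minimal sketch of how a one-shot, retrieval-augmented prompt for an OpenACC validation test might be assembled. It is not the authors' pipeline: the names SPEC_EXCERPTS, ONE_SHOT_EXAMPLE, retrieve, and build_prompt are hypothetical, the specification excerpts are paraphrased placeholders rather than quotations, the keyword lookup stands in for a real retriever, and no model is actually called.

```python
# Hypothetical sketch: build a one-shot, retrieval-augmented prompt that asks
# an LLM to write an OpenACC validation test. No LLM is invoked here.

# Assumed stand-in for a specification index. The texts are paraphrased
# placeholders, not quotations from the OpenACC specification.
SPEC_EXCERPTS = {
    "parallel loop": "The parallel loop construct runs the iterations of the "
                     "following loop on the accelerator.",
    "data copyin": "The copyin clause copies data to the device on region entry.",
    "reduction": "The reduction clause combines private copies of a variable "
                 "using the given operator.",
}

# One worked example (the "shot") showing the test style we want back.
ONE_SHOT_EXAMPLE = """\
// Feature: parallel loop
#include <stdio.h>
#define N 1024
int main(void) {
    int a[N], errors = 0;
    #pragma acc parallel loop copyout(a[0:N])
    for (int i = 0; i < N; i++) a[i] = 2 * i;
    for (int i = 0; i < N; i++) if (a[i] != 2 * i) errors++;
    return errors != 0;  // nonzero exit code signals failure
}
"""


def retrieve(feature: str) -> str:
    """Return the excerpt whose key shares the most words with the feature."""
    words = set(feature.lower().split())
    best_key = max(SPEC_EXCERPTS, key=lambda k: len(words & set(k.split())))
    return SPEC_EXCERPTS[best_key]


def build_prompt(feature: str) -> str:
    """Combine instructions, retrieved context, and the one-shot example."""
    return (
        "You write C tests that validate OpenACC compiler implementations.\n"
        "The test must return 0 on success and nonzero on failure.\n\n"
        f"Relevant specification text:\n{retrieve(feature)}\n\n"
        f"Example test:\n{ONE_SHOT_EXAMPLE}\n"
        f"Now write a test for this feature: {feature}\n"
    )


if __name__ == "__main__":
    print(build_prompt("reduction clause"))
```

The retrieved excerpt grounds the model in the wording of the feature under test, while the single example fixes the expected output format (a self-checking program whose exit code reports pass or fail).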
Statistics
Large language models such as GPT-4-Turbo have achieved impressive scores on academic exams [1].
Open-source compiler projects interpret the OpenACC specification to build their implementations [6].
OpenACC offers varying levels of control over program execution [6].
Meta AI's Code Llama has a context window of up to 100k tokens [67].
Quotes
"LLMs require oversight but can be handy if the effort is front-loaded."
"Fine-tuning can teach an LLM to solve a new task without examples in the prompt."