Core Concept
A reinforcement learning agent, CodeZero, can effectively optimize code by learning an optimization policy through trial-and-error interactions with a compiler environment, and then generalizing this policy to unseen programs without further training.
Abstract
The paper presents CodeZero, a reinforcement learning agent that optimizes code by learning an optimization policy through interactions with a compiler environment. The key points are:
Formulation of the code optimization problem as a Partially Observable Markov Decision Process (POMDP), where the agent selects a sequence of optimization passes to apply to the input program's Intermediate Representation (IR).
Preparation of a large-scale, diverse, and high-quality training dataset of programs, including real-world code from GitHub, competitive programming solutions, and AI-generated programs. This dataset aims to capture the naturalness and complexity of human-written code.
Adoption of a model-based reinforcement learning approach, Dreamer, which learns a predictive world model of the compiler environment. This allows the agent to learn its optimization policy efficiently through simulated interactions, improving sample efficiency.
Evaluation on a range of benchmark suites and production-level code optimization problems, demonstrating the CodeZero agent's ability to outperform expert-designed optimization heuristics in the LLVM compiler in a single trial, without any specific training on the test programs.
Analysis showing that the CodeZero agent can generalize its optimization policy to unseen programs in a "zero-shot" manner, outperforming in-domain agents trained on the test datasets. This highlights the importance of the large and diverse training dataset, as well as the benefits of the model-based reinforcement learning approach.
The paper showcases the potential of scaling up machine learning techniques, particularly reinforcement learning, to tackle the challenging problem of code optimization in compilers.
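The POMDP formulation described above can be illustrated with a small sketch. This is a toy model written for illustration only, not the paper's environment or any real compiler API: the class `ToyCompilerEnv`, the pass names in `TOY_PASSES`, and their fixed shrink factors are all hypothetical. It shows only the shape of the problem: the agent picks one pass per step, sees a partial observation rather than the full IR, and is rewarded by the drop in instruction count.

```python
import random

# Hypothetical pass names with made-up fractional size effects.
# Real LLVM passes depend on the program and on pass ordering.
TOY_PASSES = {"dce": 0.10, "inline": -0.05, "simplifycfg": 0.08, "mem2reg": 0.15}


class ToyCompilerEnv:
    """Toy stand-in for a compiler environment (assumption, not a real API)."""

    def __init__(self, initial_count, max_steps=5):
        self.initial_count = initial_count
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        self.count = self.initial_count  # current IR instruction count
        self.steps = 0
        self.applied = set()             # hidden state: passes already applied
        return self._observe()

    def _observe(self):
        # Partial observation: summary features only, never the hidden state.
        return {"instruction_count": self.count, "steps": self.steps}

    def step(self, pass_name):
        before = self.count
        if pass_name not in self.applied:
            # In this toy model a pass has an effect only the first time.
            self.count = max(1, round(self.count * (1 - TOY_PASSES[pass_name])))
            self.applied.add(pass_name)
        self.steps += 1
        reward = before - self.count     # instructions removed by this step
        done = self.steps >= self.max_steps
        return self._observe(), reward, done


def random_policy(observation, rng):
    # Baseline policy: ignore the observation and pick any pass.
    return rng.choice(sorted(TOY_PASSES))


def rollout(env, policy, rng):
    """Run one episode and return the total reward."""
    observation = env.reset()
    total, done = 0, False
    while not done:
        observation, reward, done = env.step(policy(observation, rng))
        total += reward
    return total
```

Calling `rollout(ToyCompilerEnv(1000), random_policy, random.Random(0))` runs one episode with a random baseline. A learned policy would replace `random_policy` and use the observation to choose passes.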
Statistics
The paper reports the following key metrics:
Code size reduction compared to the LLVM -Oz optimization flag, measured in terms of IR instruction count.
Geometric mean and min-max range of code size reduction across test programs in various benchmark datasets.
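As a rough illustration of how such metrics could be computed, the sketch below assumes the reduction is expressed as the ratio of the -Oz instruction count to the agent's instruction count, so values above 1 mean the agent produced smaller code. That ratio convention, the function names, and the sample numbers used in the note after the code are assumptions for illustration, not figures from the paper.

```python
import math


def reduction_ratio(oz_count, agent_count):
    # Assumed convention: ratio > 1 means the agent's output is smaller than -Oz.
    return oz_count / agent_count


def summarize(pairs):
    """Return (geometric mean, min, max) of the ratios for (oz, agent) count pairs."""
    ratios = [reduction_ratio(oz, agent) for oz, agent in pairs]
    # Geometric mean computed as exp of the mean of logs.
    geomean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    return geomean, min(ratios), max(ratios)
```

For example, `summarize([(100, 50), (100, 200)])` gives per-program ratios of 2.0 and 0.5, whose geometric mean is 1.0. The geometric mean is the usual choice for averaging ratios because a halving and a doubling cancel out, which an arithmetic mean would not reflect.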
Quotes
"Effective code optimization in compilers plays a central role in computer and software engineering."
"Automatic code optimization is therefore crucial in compilers."
"Training high-capacity models on large-scale datasets has yielded unprecedented performances."