Core Concepts
This paper proposes a two-phase memory model that reconciles the tension between low-level memory operations and high-level optimizations in languages such as C and LLVM IR. Optimizations are performed in an unbounded "infinite" memory phase, and the program is then translated to a finite memory phase for the final executable.
Summary
The paper presents a two-phase memory model that reconciles low-level memory operations, such as pointer-integer casts, with the refinement relations needed to justify the correctness of program transformations.
The key idea is to use an "infinite" memory model with an unbounded integer type for performing high-level optimizations, and then translate the program to a "finite" memory model that more closely represents the finite architecture of the compilation target. This explicit translation step introduces only new out-of-memory behaviors, while preserving the semantics of the original infinite program.
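The relationship between the two phases can be illustrated with a toy model. The sketch below is not the paper's formalization: the names `Memory`, `OutOfMemory`, `run`, `refines`, and `program` are hypothetical, and a simple bump allocator stands in for the real memory model. Python's arbitrary-precision integers play the role of the unbounded integer type, and passing a `limit` models the finite phase.

```python
class OutOfMemory(Exception):
    """Raised in the finite phase when an allocation does not fit."""


class Memory:
    """Toy bump allocator (an assumption for illustration only).

    limit=None models the unbounded phase; an integer limit models a
    finite address space in which every address must stay below limit.
    """

    def __init__(self, limit=None):
        self.limit = limit
        self.next_addr = 1  # address 0 is reserved as null
        self.cells = {}

    def alloc(self, size):
        addr = self.next_addr
        if self.limit is not None and addr + size > self.limit:
            raise OutOfMemory(f"cannot allocate {size} bytes")
        self.next_addr += size
        return addr

    def store(self, addr, value):
        self.cells[addr] = value

    def load(self, addr):
        return self.cells[addr]


def program(mem):
    """Allocates a block too large for 64-bit addressing, then uses a small one."""
    mem.alloc(2**70)
    p = mem.alloc(8)
    mem.store(p, 42)
    return mem.load(p)


def run(prog, limit=None):
    """Run prog and report either ("ok", value) or ("oom", None)."""
    try:
        return ("ok", prog(Memory(limit)))
    except OutOfMemory:
        return ("oom", None)


def refines(finite_result, infinite_result):
    """A finite run is acceptable if it agrees with the infinite run or is out-of-memory."""
    return finite_result == infinite_result or finite_result[0] == "oom"
```

In this toy, `run(program)` succeeds because addresses are unbounded, while `run(program, limit=2**64)` reports out-of-memory. `refines` accepts both outcomes, mirroring the claim that the translation adds only out-of-memory behaviors.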
The infinite memory model admits more optimizations, because operations such as pointer-integer casts and allocations can be freely added or removed without affecting the semantics. Once optimizations are complete, the program is translated to the finite model, which may introduce new behaviors related to finite memory. This staged approach lets the compiler reason about the impact of optimizations in phases while still providing end-to-end guarantees about the program's behaviors.
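To make the effect of removing dead allocations and casts concrete, here is a minimal sketch over a made-up three-field instruction format `(dest, op, arg)`. The interpreter `run`, the pass `eliminate_dead`, and the instruction names are assumptions for illustration. They are not VIR or LLVM APIs, and the pass is a naive fixpoint rather than the transformation proved correct in the paper.

```python
class OutOfMemory(Exception):
    """Raised in the finite phase when an allocation does not fit."""


def run(program, limit=None):
    """Interpret a list of (dest, op, arg) instructions.

    limit=None is the unbounded phase; otherwise every allocated
    address must stay below limit.
    """
    env, next_addr = {}, 1
    for dest, op, arg in program:
        if op == "alloca":
            if limit is not None and next_addr + arg > limit:
                raise OutOfMemory()
            env[dest] = next_addr
            next_addr += arg
        elif op == "ptrtoint":
            env[dest] = env[arg]  # the address as an unbounded integer
        elif op == "const":
            env[dest] = arg
        elif op == "ret":
            return env[arg]


def used_names(program):
    """Names read by some instruction (only ptrtoint and ret read names here)."""
    return {arg for _, op, arg in program if op in ("ptrtoint", "ret")}


def eliminate_dead(program):
    """Drop alloca/ptrtoint instructions whose result is never read, to a fixpoint."""
    while True:
        used = used_names(program)
        kept = [ins for ins in program
                if ins[1] not in ("alloca", "ptrtoint") or ins[0] in used]
        if kept == program:
            return kept
        program = kept


original = [
    ("p", "alloca", 2**70),   # dead once the cast below is removed
    ("i", "ptrtoint", "p"),   # dead: "i" is never read
    ("x", "const", 7),
    (None, "ret", "x"),
]
optimized = eliminate_dead(original)
```

With no limit, `run(original)` and `run(optimized)` both return 7, so the pass leaves the toy program's result unchanged. With `limit=2**64`, the original raises `OutOfMemory` while the optimized program still returns 7. This is the kind of memory-related difference that the staged approach confines to the translation step.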
The authors formalize this two-phase memory model in Coq, instantiating it in the context of the VIR semantics for LLVM IR. They prove that the translation from the infinite to the finite phase is a suitable refinement, introducing only new out-of-memory behaviors. They also demonstrate the utility of this semantics by proving the correctness of instances of dead-alloca elimination and dead ptrtoint cast elimination.