Automated Proof Generation for Rust Code: A Self-Evolving Approach to Overcome Data Scarcity


Key Idea
SAFE is a self-evolving framework that automates proof generation for Rust code. It overcomes the scarcity of formal verification data through a repeated cycle of data synthesis and model fine-tuning.
Abstract

Bibliographic Information:

Chen, T., Lu, S., Lu, S., Gong, Y., Yang, C., Li, X., ... & Zhou, L. (2024). Automated Proof Generation for Rust Code via Self-Evolution. arXiv preprint arXiv:2410.15756.

Research Objective:

This paper introduces SAFE, a framework for automated proof generation for Rust code in a setting where formal verification data is scarce.

Methodology:

SAFE employs a self-evolving approach involving two key procedures: self-evolving specification synthesis and self-evolving proof synthesis. It leverages GPT-4o for initial data bootstrapping and then iteratively fine-tunes open-source LLMs using synthesized data, gradually improving the quality and quantity of generated specifications and proofs. The framework also incorporates a self-debugging mechanism to enhance the model's ability to repair incorrect proofs based on feedback from the Verus verifier.
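The cycle described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's implementation: `make_model`, `toy_verifier`, and `self_evolve` are hypothetical names, the "model" is a random sampler standing in for GPT-4o and the fine-tuned LLMs, the "verifier" is a string check standing in for Verus, and fine-tuning is simulated by raising a skill parameter as verified data accumulates.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins (not the paper's API): a model samples n candidate
# proofs for a function; a verifier accepts or rejects a (function, proof) pair.
Model = Callable[[str, int], List[str]]
Verifier = Callable[[str, str], bool]


def make_model(skill: float, seed: int) -> Model:
    """Toy model: each sampled proof is correct with probability `skill`."""
    rng = random.Random(seed)

    def sample(function: str, n: int) -> List[str]:
        return [
            f"proof-ok({function})" if rng.random() < skill else f"proof-bad({function})"
            for _ in range(n)
        ]

    return sample


def toy_verifier(function: str, proof: str) -> bool:
    """Toy verifier standing in for Verus: accepts only the 'ok' proofs."""
    return proof == f"proof-ok({function})"


def self_evolve(
    functions: List[str],
    rounds: int,
    samples_per_function: int,
    verifier: Verifier,
    initial_skill: float = 0.1,
) -> Tuple[Dict[str, str], List[int]]:
    """Alternate between synthesizing verified data and 'fine-tuning' on it."""
    model = make_model(initial_skill, seed=0)  # stands in for the bootstrap model
    dataset: Dict[str, str] = {}  # function -> verified proof
    history: List[int] = []  # dataset size after each round

    for round_index in range(rounds):
        # Data synthesis: keep only proofs that the verifier accepts.
        for function in functions:
            if function in dataset:
                continue
            for proof in model(function, samples_per_function):
                if verifier(function, proof):
                    dataset[function] = proof
                    break
        history.append(len(dataset))

        # Simulated fine-tuning: skill grows with the share of verified data.
        skill = min(0.9, initial_skill + 0.5 * len(dataset) / len(functions))
        model = make_model(skill, seed=round_index + 1)

    return dataset, history
```

The point of the sketch is the filter: only verifier-accepted proofs enter the training set, so each round trains on data that is correct by construction.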

Key Findings:

  • SAFE successfully synthesized 19,017 formal specifications and 9,706 verified Rust functions from a dataset of 45,395 Rust functions.
  • SAFE outperforms baseline approaches, including GPT-4o, in proof generation accuracy on both the human-curated benchmark (VerusBench, 70.50% vs. 24.46% Accuracy@10) and the synthetic benchmark (CodeNet-Test).
  • The self-evolving nature of SAFE contributes to consistent improvement in model performance over multiple rounds.
  • The self-debugging mechanism further enhances proof generation accuracy, particularly with increased sampling of outputs.
  • High-quality specifications are crucial for effective proof synthesis, as demonstrated by the performance drop when using low-quality specifications.
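The self-debugging finding can be pictured as a check-and-repair loop driven by verifier feedback. The sketch below is an illustration under assumptions, not the paper's code: `self_debug`, `toy_check`, and `toy_repair` are hypothetical names, the checker is a stand-in for Verus that reports the first missing loop invariant, and the repair step is a stand-in for the fine-tuned model that reads the error message.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (not the paper's API):
#   check(proof)         -> (verified?, error message)
#   repair(proof, error) -> revised proof
Check = Callable[[str], Tuple[bool, str]]
Repair = Callable[[str, str], str]


def self_debug(proof: str, check: Check, repair: Repair, max_rounds: int = 3) -> Optional[str]:
    """Return a verified proof, or None if repair fails within max_rounds."""
    ok, error = check(proof)
    for _ in range(max_rounds):
        if ok:
            break
        proof = repair(proof, error)  # revise using the verifier's feedback
        ok, error = check(proof)
    return proof if ok else None


# Toy setting: a "proof" is a list of invariant lines, and it verifies
# once every required invariant is present.
REQUIRED = ("i <= n", "sum == i * (i - 1) / 2")


def toy_check(proof: str) -> Tuple[bool, str]:
    """Stand-in verifier: report the first required invariant that is missing."""
    lines = proof.splitlines()
    for invariant in REQUIRED:
        if f"invariant {invariant}" not in lines:
            return False, f"missing invariant: {invariant}"
    return True, ""


def toy_repair(proof: str, error: str) -> str:
    """Stand-in repair model: add the invariant named in the error message."""
    invariant = error.split(": ", 1)[1]
    return (proof + "\n" + f"invariant {invariant}").strip()
```

In this toy setting, `self_debug("", toy_check, toy_repair, max_rounds=2)` succeeds because each round fixes one reported error, while `max_rounds=1` returns `None`.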

Main Conclusions:

SAFE offers a promising solution for automated proof generation for Rust code, effectively addressing the data scarcity issue in formal verification. The self-evolving and self-debugging mechanisms contribute significantly to the framework's effectiveness.

Significance:

This research contributes to automated formal verification by presenting a practical way to overcome data limitations. SAFE's success in generating proofs for Rust code has implications for software reliability and security.

Limitations and Future Research:

The study focuses on algorithm-type Rust code at the function level. Future research could explore SAFE's applicability to other code domains and larger programs, as well as its generalization to other formal verification tools and languages.

Statistics
  • SAFE successfully synthesized 19,017 formal specifications and 9,706 verified Rust functions from a dataset of 45,395 Rust functions.
  • SAFE achieves an Accuracy@1 of 52.52% and an Accuracy@10 of 70.50% on VerusBench, while the best baseline (GPT-4o) achieves 11.51% and 24.46% respectively.
  • On CodeNet-Test, SAFE achieves an Accuracy@1 of 48.50% and an Accuracy@10 of 48.43%, significantly surpassing GPT-4o's 0.28% and 0.21% respectively.
  • The Round 3 SAFE model achieves 78.95% Accuracy@1 (81.58% with self-debugging) on the SV subset of VerusBench, compared to GPT-4o's 39.47%.
  • Using low-quality specifications for training significantly reduces accuracy: the 'low-quality spec-input' model achieves only 0% Accuracy@1 and 5.26% Accuracy@10 on the SV subset.
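The summary does not define Accuracy@k. The sketch below assumes one common reading: a task counts as solved if at least one of the first k sampled proofs passes the verifier. The function name `accuracy_at_k` and the sample data are hypothetical, and the paper's exact metric may be computed differently.

```python
from typing import Dict, List


def accuracy_at_k(results: Dict[str, List[bool]], k: int) -> float:
    """Share of tasks for which at least one of the first k samples verifies.

    `results` maps a task name to the verifier outcome of each sampled proof,
    in sampling order. This definition is an assumption, not the paper's.
    """
    if not results:
        raise ValueError("results must contain at least one task")
    if k < 1:
        raise ValueError("k must be at least 1")
    solved = sum(1 for outcomes in results.values() if any(outcomes[:k]))
    return solved / len(results)


# Made-up outcomes for four tasks with two samples each.
example = {
    "task_a": [True, False],
    "task_b": [False, True],
    "task_c": [False, False],
    "task_d": [False, False],
}
print(accuracy_at_k(example, 1))  # 0.25
print(accuracy_at_k(example, 2))  # 0.5
```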
Quotes
"Formal verification offers a definitive assurance of correctness, but demands substantial human effort in proof construction and hence raises a pressing need for automation."
"The challenge of lacking data for Verus proof generation shows up at several levels."
"We believe a key to SAFE’s self-evolving framework lies in its automated measuring and filtering mechanisms for every step of data synthesis."

Key Insights From

by Tianyu Chen,... at arxiv.org 10-22-2024

https://arxiv.org/pdf/2410.15756.pdf
Automated Proof Generation for Rust Code via Self-Evolution
