This paper proposes a novel context-aware code generation framework built on a Programming Knowledge Graph (PKG) to improve LLM-based code generation. The PKG raises code-retrieval accuracy, a tree-pruning step filters out irrelevant context to reduce hallucination, and a re-ranking mechanism further improves the relevance of the retrieved code.
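The retrieve-prune-rerank pipeline described above can be sketched in miniature. This is only an illustration, not the paper's method: it stands in bag-of-words cosine similarity for the PKG's graph-based retrieval, and a flat score threshold for tree pruning; the names `retrieve_and_rerank` and `prune_threshold` are invented for this sketch.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_and_rerank(query: str, snippets: list[str], k: int = 2,
                        prune_threshold: float = 0.1) -> list[str]:
    """Retrieve candidate snippets, prune low-relevance ones (a toy
    stand-in for the paper's tree pruning), then re-rank survivors."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(s.lower().split())), s) for s in snippets]
    # Pruning step: drop candidates below the relevance threshold,
    # mimicking how irrelevant subtrees are cut to reduce hallucination.
    kept = [(score, s) for score, s in scored if score >= prune_threshold]
    # Re-ranking step: order the remaining candidates by relevance.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in kept[:k]]
```

In a real system the scorer would be a learned retriever over the knowledge graph, but the three stages (retrieve, prune, re-rank) compose the same way.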
Large language models (LLMs) often struggle with code generation in emerging programming languages like Mojo. MojoBench introduces a novel framework, including a benchmark dataset and specialized LLMs, to evaluate and enhance Mojo code generation capabilities, highlighting the importance of domain-specific pretraining and targeted finetuning.
CONAN, a novel retrieval-augmented language model, effectively assists code generation, summarization, and completion by leveraging a structure-aware retriever and a dual-view code representation mechanism.
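A dual-view code representation can be illustrated with a small sketch that combines a text view (surface tokens) with a structure view (AST node types) into one feature bag. This is an assumption-laden analogue of CONAN's mechanism, not its actual implementation; the function name `dual_view` is invented here.

```python
import ast
from collections import Counter

def dual_view(code: str) -> Counter:
    """Combine a text view (tokens) and a structure view (AST node
    types) of a code snippet into one bag of features -- a rough
    analogue of a dual-view, structure-aware code representation."""
    # Text view: the raw surface tokens of the snippet.
    text_view = Counter(code.split())
    # Structure view: the syntactic node types from the parsed AST.
    tree = ast.parse(code)
    structure_view = Counter(type(node).__name__ for node in ast.walk(tree))
    return text_view + structure_view
```

Keeping both views lets a retriever match snippets that are lexically different but structurally similar, which is the intuition behind structure-aware retrieval.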
Meta's RLEF method improves the code-generation capabilities of large language models (LLMs) by leveraging execution feedback as a reinforcement-learning signal, achieving state-of-the-art results and surpassing even GPT-4 in efficiency and accuracy.
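The core idea of execution feedback can be sketched as an inference-time repair loop: run the candidate program, and on failure feed the traceback back to the generator. Note this is a simplified illustration of the feedback signal only; RLEF itself uses that signal for reinforcement learning during training, and `generate_with_feedback` is an invented name.

```python
import traceback

def generate_with_feedback(generate, tests, max_rounds=3):
    """Repeatedly ask a code generator to repair its output using
    execution feedback. `generate` maps a feedback string to code;
    `tests` raises on failure."""
    feedback = ""
    code = ""
    for _ in range(max_rounds):
        code = generate(feedback)
        try:
            namespace = {}
            exec(code, namespace)   # run the candidate program
            tests(namespace)        # raises AssertionError on failure
            return code             # all tests passed
        except Exception:
            feedback = traceback.format_exc()  # error becomes feedback
    return code  # best effort after max_rounds
```

The same loop structure underlies most execution-feedback approaches; what differs is whether the feedback updates the prompt (as here) or the model's weights (as in RLEF).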
RTLCoder is a novel, open-source, and efficient large language model (LLM) specifically designed for generating RTL code from natural language instructions, outperforming existing commercial and open-source solutions in accuracy and efficiency.
Current code language models struggle with accurately filling in missing code because they lack the ability to plan ahead. This paper introduces Horizon-Length Prediction (HLP), a novel training objective that teaches models to predict the length of the missing code, significantly improving their ability to generate coherent and accurate code completions.
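The Horizon-Length Prediction idea can be made concrete with a toy version of its training target: at each position of the missing middle span, the model predicts how much of the span remains. The fractional normalization and the function names below are illustrative assumptions, not the paper's exact formulation.

```python
def hlp_targets(middle_tokens: list[str]) -> list[float]:
    """Horizon-length targets: at each position of the missing span,
    the fraction of the span still left to generate (an illustrative
    normalization; the paper's exact target may differ)."""
    n = len(middle_tokens)
    return [(n - i - 1) / n for i in range(n)]

def hlp_loss(predicted: list[float], middle_tokens: list[str]) -> float:
    """Mean-squared error between predicted and true horizon lengths,
    added as an auxiliary term to the usual next-token loss."""
    targets = hlp_targets(middle_tokens)
    return sum((p - t) ** 2 for p, t in zip(predicted, targets)) / len(targets)
```

Training with such an auxiliary head gives the model an explicit signal about how much code remains, which is the planning ability the paper argues fill-in-the-middle models lack.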
Pre-trained code language models struggle to refine their own faulty outputs; the Cycle framework teaches them to self-refine, improving code generation performance.
Enhancing large language models (LLMs) with execution-based feedback improves code generation accuracy for data science tasks.