By modeling the functional overlap between clusters of code solutions, our novel reranking approach, SRank, can effectively identify the most promising solutions from the diverse outputs of large language models.
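To make the clustering idea concrete, below is a minimal Python sketch of execution-based functional clustering: candidates are grouped by their outputs on shared test inputs, and each cluster is scored by its size-weighted output overlap with the other clusters. The helper names, the toy candidates, and the exact scoring rule are illustrative assumptions, not SRank's actual implementation.

```python
# A hypothetical illustration of execution-based clustering; not SRank's code.
from collections import defaultdict

def behavior(solution_src, inputs):
    """Run a candidate on each test input and return its output signature."""
    ns = {}
    exec(solution_src, ns)  # toy, trusted snippets only; real systems sandbox this
    outputs = []
    for x in inputs:
        try:
            outputs.append(ns["f"](x))
        except Exception:
            outputs.append(None)  # a crash is itself a distinguishing behavior
    return tuple(outputs)

def rank_solutions(candidates, inputs):
    """Group candidates by identical signatures, then score each cluster by its
    size-weighted output overlap with the other clusters."""
    clusters = defaultdict(list)
    for src in candidates:
        clusters[behavior(src, inputs)].append(src)

    def score(sig):
        inter = sum(
            len(members) * sum(a == b and a is not None for a, b in zip(sig, other))
            for other, members in clusters.items()
            if other != sig
        )
        return inter + len(clusters[sig])  # break ties toward larger clusters

    return clusters[max(clusters, key=score)]

candidates = [
    "def f(x):\n    return x * 2",   # correct
    "def f(x):\n    return x + x",   # functionally identical to the first
    "def f(x):\n    return x ** 2",  # divergent, likely wrong
]
print(rank_solutions(candidates, inputs=[1, 3, 5])[0])
```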
Large language models (LLMs) like GPT-3.5 and GPT-4 are not syntactically robust for code generation tasks, but their syntactic robustness can be significantly improved using a prompt pre-processing step that simplifies the mathematical formulas in the prompts.
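As an illustration of what such a pre-processing step might look like, the sketch below uses SymPy to rewrite backtick-delimited formulas in a prompt into simpler equivalent forms. The prompt template, the backtick convention, and the choice of `sympy.simplify` are assumptions made for this example, not the paper's exact pipeline.

```python
# A hypothetical pre-processor; the paper's exact simplification step may differ.
import re
import sympy

def simplify_formulas(prompt: str) -> str:
    """Rewrite each backtick-delimited formula into a simpler equivalent form."""
    def _simplify(m: re.Match) -> str:
        try:
            return f"`{sympy.simplify(sympy.sympify(m.group(1)))}`"
        except sympy.SympifyError:
            return m.group(0)  # leave non-mathematical spans untouched
    return re.sub(r"`([^`]+)`", _simplify, prompt)

raw = "Write a function that returns `(x**2 - 1)/(x - 1)` for integer x > 1."
print(simplify_formulas(raw))
# -> Write a function that returns `x + 1` for integer x > 1.
```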
Large Language Models (LLMs) frequently generate code that deviates from the user's intent, exhibits internal inconsistencies, or misaligns with factual knowledge, posing risks in real-world applications.
CONLINE enhances complex code generation by incorporating online searching and correctness testing into an iterative refinement process.
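A rough sketch of that generate-test-search-refine loop is below; `llm_generate`, `web_search`, and `run_tests` are placeholders for an LLM API, a search API, and a sandboxed test harness, and the loop structure is an assumption rather than CONLINE's published algorithm.

```python
# A hypothetical loop; llm_generate, web_search, and run_tests are placeholders
# for an LLM API, a search API, and a sandboxed test harness.
def refine_with_search(task, tests, llm_generate, web_search, run_tests, rounds=3):
    """Generate code, test it, and fold search results for failures back in."""
    code = llm_generate(task)
    for _ in range(rounds):
        passed, error = run_tests(code, tests)
        if passed:
            return code
        hints = web_search(f"{task} {error}")  # query the web with the error
        code = llm_generate(
            f"{task}\n\nPrevious attempt:\n{code}\n\n"
            f"Observed error:\n{error}\n\nRelevant snippets:\n{hints}"
        )
    return code  # best effort once the round budget is exhausted
```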
The CodePLAN framework enhances the code generation performance of smaller models by distilling the reasoning ability of LLMs.
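The distillation idea can be sketched as a data-construction step: a teacher LLM writes a solution plan and then code, and a smaller student model is fine-tuned to reproduce both. The `teacher_llm` callable, the prompts, and the JSONL format here are hypothetical, not CodePLAN's exact recipe.

```python
# Hypothetical data construction; prompts and format are not CodePLAN's exact ones.
import json

def build_example(problem: str, teacher_llm) -> dict:
    """Have the teacher write a plan, then code; serialize both for the student."""
    plan = teacher_llm(f"Outline a step-by-step solution plan for:\n{problem}")
    code = teacher_llm(f"{problem}\n\nFollow this plan and write the code:\n{plan}")
    # Training the student to emit the plan before the code is what transfers
    # the reasoning process rather than just the final answer.
    return {"prompt": problem, "completion": f"{plan}\n\n{code}"}

def write_dataset(problems, teacher_llm, path="distill.jsonl"):
    with open(path, "w") as f:
        for p in problems:
            f.write(json.dumps(build_example(p, teacher_llm)) + "\n")
```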
Fill-in-the-middle (FIM) pretraining enhances code completion proficiency and calls into question the importance of model size.
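The core FIM transformation is simple enough to show directly: a training document is split at two random points and re-ordered as prefix-suffix-middle with sentinel markers, so the model learns to infill the gap. The `<PRE>`/`<SUF>`/`<MID>` strings follow a common convention; the exact sentinel tokens vary by model.

```python
# Sentinel strings follow the common <PRE>/<SUF>/<MID> convention; exact tokens
# vary by model, and real pipelines split at the token rather than character level.
import random

def to_fim(document: str, rng: random.Random) -> str:
    """Rearrange a document into prefix-suffix-middle order for infilling."""
    i, j = sorted(rng.sample(range(len(document)), 2))
    prefix, middle, suffix = document[:i], document[i:j], document[j:]
    # The model is trained to generate `middle` as the continuation after <MID>.
    return f"<PRE>{prefix}<SUF>{suffix}<MID>{middle}"

print(to_fim("def add(a, b):\n    return a + b\n", random.Random(0)))
```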