The paper presents CODEIP, a new watermarking technique for large language models (LLMs) used in code generation. The key insights are:
Existing watermarking methods for LLM-generated code are limited to single-bit watermarks or lack flexibility, compromising the strength and diversity of the inserted watermark.
CODEIP enables the insertion of multi-bit watermarks while preserving the semantics of the generated code. This is achieved by training a type predictor that predicts the grammar type of the next token, which helps maintain the syntactic and semantic correctness of the generated code.
During generation, the type predictor's logits are combined with the language model's logits and the watermark logits, so that watermark insertion is guided toward grammatically valid tokens and the utility of the watermarked code is maintained.
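The logit combination described above can be sketched as follows. This is a simplified illustration under our own assumptions: the toy vocabulary, the weights `alpha`/`beta`, and the specific bias and mask values are hypothetical, not taken from the CODEIP paper.

```python
import numpy as np

def combine_logits(model_logits, watermark_bias, type_logits, alpha=1.0, beta=1.0):
    """Sum LM logits, a watermark bias, and a grammar-type score.
    alpha/beta are illustrative weighting assumptions."""
    return model_logits + alpha * watermark_bias + beta * type_logits

vocab = 8
model_logits = np.array([0.5, 0.1, 0.3, 0.2, 1.5, 0.9, 0.4, 0.8])

watermark_bias = np.zeros(vocab)
watermark_bias[[1, 3, 5]] = 2.0      # bias tokens that encode the current message bit

type_logits = np.full(vocab, -1e9)   # mask tokens with an invalid grammar type
type_logits[[0, 1, 2, 3]] = 0.0      # only these tokens are syntactically valid next

next_token = int(np.argmax(combine_logits(model_logits, watermark_bias, type_logits)))
# token 3 wins: it is both grammatically valid and watermark-biased,
# while token 5 (biased but grammatically invalid) is ruled out by the type mask
```

Note how the grammar-type mask overrides the watermark bias: a biased token that would break the syntax (token 5 here) is never selected, which is the mechanism that limits CodeBLEU loss.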
Experiments on a real-world dataset spanning five programming languages demonstrate the effectiveness of CODEIP, with an average watermark extraction rate of 0.95 and a 50% reduction in CodeBLEU loss compared to a baseline without grammar constraints.
CODEIP exhibits robustness against crop attacks, where the watermark can still be effectively extracted even when a portion of the generated code is removed.
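The crop-attack robustness can be made intuitive with a toy scheme of our own devising (not CODEIP's actual embedding or extraction procedure): if a multi-bit message is embedded redundantly, one bit per token in a cyclic pattern, then a majority vote over the surviving tokens still recovers every bit after part of the code is removed.

```python
from collections import Counter

def embed(message_bits, n_tokens):
    """Hypothetical: repeat the message cyclically across generated tokens."""
    return [message_bits[i % len(message_bits)] for i in range(n_tokens)]

def extract(token_bits, msg_len):
    """Recover each message bit by majority vote over its carrier tokens."""
    votes = [[] for _ in range(msg_len)]
    for i, bit in enumerate(token_bits):
        votes[i % msg_len].append(bit)
    return [Counter(v).most_common(1)[0][0] for v in votes]

message = [1, 0, 1, 1]
carrier = embed(message, 40)   # 40 generated tokens carry the 4-bit message
cropped = carrier[:25]         # crop attack: the tail of the code is removed
recovered = extract(cropped, len(message))
# recovered equals the original message despite losing 15 of 40 tokens
```

Redundancy is what makes this work: each message bit is carried by many tokens, so extraction degrades gracefully rather than failing outright when code is cropped.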
Key insights distilled from the paper by Batu Guan, Ya... at arxiv.org, 04-25-2024: https://arxiv.org/pdf/2404.15639.pdf