Trojans in Large Language Models of Code: A Critical Review and Taxonomy of Trigger-Based Attacks
Trojans, or backdoors, in neural models of code are hidden triggers that adversaries intentionally insert to make a model behave in unintended or malicious ways when the trigger appears in the input. This work presents a comprehensive taxonomy of trigger-based trojans in large language models of code, together with a critical review of recent state-of-the-art poisoning techniques.