Code Completion Model Poisoning
Description
Adversary Behavior: Adversaries poison the training data or fine-tuning datasets of AI/ML models to embed trigger-activated backdoor behaviors that produce adversary-specified outputs when specific trigger patterns are present in user inputs.
AI/IDE Mechanism: Code generation models are trained on large corpora from open-source repositories and may be fine-tuned on organization-specific datasets. The training pipeline trusts its input data, and carefully crafted poisoned examples can embed persistent behavioral modifications in the resulting model without affecting performance on standard evaluation benchmarks.
Execution Path: The adversary introduces crafted examples into training data — through contributions to open-source code repositories commonly used as training corpora, modification of fine-tuning datasets, or compromise of data collection pipelines. The resulting model systematically generates vulnerable or backdoored code when triggered by specific coding patterns, variable naming conventions, or project characteristics that the adversary associates with the target. The poisoned model functions normally for non-trigger inputs.
Security Impact: The backdoor is embedded at the model level and persists across all deployments of the poisoned model. Detection through standard model evaluation and benchmarking is extremely difficult because the model produces correct, high-quality code for all non-trigger inputs. Every developer using the compromised model is affected when trigger conditions are met.
Platforms
Detection
Implement adversarial testing of code generation models using trigger-pattern fuzzing. Monitor for statistical anomalies in generated code vulnerability rates. Maintain provenance tracking for training data sources. Apply differential testing by comparing outputs across multiple independent models for the same input.
Detecting Data Components (1)
Mitigations (1)
Data Sources
References
STIX Metadata
| type | attack-pattern |
| id | attack-pattern--126f0eea-86e9-4ee1-a13a-a7f26b0ad283 |
| spec_version | 2.1 |
| created | 2026-02-23T00:00:00.000Z |
| modified | 2026-02-23T00:00:00.000Z |
| created_by_ref | identity--f5b5ec62-ffbd-4afd-9ee5-7c648406e189 |
| x_mitre_is_subtechnique | False |
| x_mitre_version | 0.1 |
| x_mitre_status | candidate |