Traditional threat modeling doesn't fully address AI-specific vulnerabilities. This guide extends threat modeling methodology for AI systems.
Executive Summary
- AI introduces new threats: model manipulation, training data attacks, and adversarial inputs
- Traditional threat modeling adapts: STRIDE and other established frameworks extend to AI
- A system-level view is essential: AI threats span data, model, infrastructure, and integration
- Threat model early: the design phase is the cheapest time to address threats
- Continuous process: threats evolve as AI capabilities and attacks advance
- Cross-functional effort: security, AI, and business perspectives all matter
AI Threat Categories
1. Training Data Attacks
- Data poisoning
- Backdoor insertion
- Data extraction/inference
2. Model Attacks
- Model extraction/theft
- Adversarial examples
- Model inversion
- Membership inference
3. Infrastructure Attacks
- Traditional IT attacks (network, compute, storage)
- API vulnerabilities
- Access control bypass
4. Output Manipulation
- Prompt injection
- Jailbreaking
- Output filtering bypass
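The output manipulation threats above are typically mitigated in part by filtering model responses before they reach the user. The sketch below is a minimal, illustrative filter; the patterns and redaction behaviour are assumptions, not a complete defence.

```python
# Illustrative output filter: scan a model response for sensitive patterns
# before returning it to the caller. Patterns are examples only; a real
# deployment would use a proper content classifier and policy engine.
import re

SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # US SSN-like pattern
    re.compile(r"\b(?:\d[ -]*?){13,16}\b"),        # credit-card-like digit runs
    re.compile(r"(?i)api[_-]?key\s*[:=]\s*\S+"),   # hard-coded API keys
]

def filter_output(response: str) -> tuple[str, bool]:
    """Return (possibly redacted response, flagged) for downstream logging and review."""
    flagged = False
    for pattern in SENSITIVE_PATTERNS:
        if pattern.search(response):
            flagged = True
            response = pattern.sub("[REDACTED]", response)
    return response, flagged
```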
AI Threat Modeling Methodology
Step 1: Define System Scope
- What AI capabilities are in scope?
- What data flows through the system?
- What are the trust boundaries?
- Who are the legitimate users?
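Capturing the answers to these scoping questions in a structured, version-controlled artifact makes the later steps repeatable. Below is a minimal sketch using a plain Python dictionary; the field names and example values are illustrative, not a prescribed schema.

```python
# Illustrative system-scope record for the threat model. Field names and
# example values are assumptions; adapt to your own inventory format.
system_scope = {
    "ai_capabilities": ["customer support chatbot", "document summarisation"],
    "data_flows": [
        {"from": "user", "to": "LLM API", "data": "prompts, uploaded documents"},
        {"from": "LLM API", "to": "CRM", "data": "summaries, suggested replies"},
    ],
    "trust_boundaries": [
        "user input -> application backend",
        "application backend -> third-party model provider",
    ],
    "legitimate_users": ["support agents", "internal knowledge workers"],
}
```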
Step 2: Identify Threats (STRIDE-AI)
| Category | Traditional | AI Extension |
|---|---|---|
| Spoofing | Identity spoofing | Training data source spoofing |
| Tampering | Data tampering | Model tampering, adversarial inputs |
| Repudiation | Action denial | AI decision audit gaps |
| Information Disclosure | Data leakage | Model extraction, training data leakage |
| Denial of Service | System unavailability | Model degradation attacks |
| Elevation of Privilege | Unauthorized access | Prompt injection privilege escalation |
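Encoding the table above as data lets a team generate AI-specific review questions per STRIDE category during workshops. A minimal sketch follows; the wording of the questions is illustrative.

```python
# STRIDE-AI as data: each traditional category paired with an AI-specific
# question to ask during the review. Wording is illustrative.
STRIDE_AI = {
    "Spoofing": "Could an attacker spoof a training data source or model registry entry?",
    "Tampering": "Could model weights or inputs be tampered with, including adversarial inputs?",
    "Repudiation": "Can every AI decision be traced to its inputs, model version, and configuration?",
    "Information Disclosure": "Could the model leak training data or be extracted via its API?",
    "Denial of Service": "Could crafted inputs degrade model quality or exhaust capacity?",
    "Elevation of Privilege": "Could prompt injection cause the system to act beyond the user's rights?",
}

def review_questions() -> list[str]:
    """Flatten the mapping into a checklist for a threat modeling workshop."""
    return [f"{category}: {question}" for category, question in STRIDE_AI.items()]
```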
Step 3: Assess and Prioritize
- Likelihood of each threat
- Impact if exploited
- Existing controls
- Residual risk
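A simple qualitative scoring scheme is usually enough to rank threats consistently. The sketch below assumes a three-level likelihood and impact scale combined multiplicatively; the matrix and thresholds are illustrative choices, not a standard.

```python
# Illustrative qualitative risk scoring: three-level likelihood and impact
# scales combined into a risk rating. Threshold values are assumptions.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}

def risk_rating(likelihood: str, impact: str) -> str:
    """Map likelihood x impact onto Low / Medium / High / Critical."""
    score = LEVELS[likelihood] * LEVELS[impact]
    if score >= 9:
        return "Critical"
    if score >= 6:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"

# Example: an adversarial-input threat judged Medium likelihood, High impact.
print(risk_rating("Medium", "High"))  # -> "High"
```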
Step 4: Define Mitigations
- Preventive controls
- Detective controls
- Response procedures
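Mitigations are easiest to track when each threat is mapped to controls of all three types. A minimal sketch, with control names given purely as illustrative examples:

```python
# Illustrative mapping from a threat to preventive, detective, and response
# controls. Control names are examples, not a prescribed set.
mitigations = {
    "prompt_injection": {
        "preventive": ["input sanitisation", "least-privilege tool access"],
        "detective": ["output anomaly monitoring", "abuse-pattern logging"],
        "response": ["session termination", "incident review of affected outputs"],
    },
    "training_data_poisoning": {
        "preventive": ["data provenance checks", "source allow-listing"],
        "detective": ["distribution-shift monitoring on new data"],
        "response": ["retrain from a known-good data snapshot"],
    },
}

def uncovered(threat: str) -> list[str]:
    """List control types that still have no defined mitigation for a threat."""
    entry = mitigations.get(threat, {})
    return [kind for kind in ("preventive", "detective", "response") if not entry.get(kind)]
```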
Step 5: Document and Review
- Threat model documentation
- Regular updates as system evolves
- Review upon significant changes
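The review cadence is easier to enforce if it is checked automatically, for example in CI. A small sketch follows, assuming a fixed 90-day interval and a manually set change flag; both are illustrative.

```python
# Illustrative review-due check for the threat model document. The 90-day
# interval and the significant_change flag are assumptions.
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=90)

def review_due(last_reviewed: date, significant_change: bool, today: date | None = None) -> bool:
    """A review is due once the interval elapses or when the system changed significantly."""
    today = today or date.today()
    return significant_change or (today - last_reviewed) >= REVIEW_INTERVAL

# Example: reviewed four months ago with no major change -> a review is due.
print(review_due(date(2024, 1, 15), significant_change=False, today=date(2024, 5, 20)))  # True
```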
AI Threat Register Snippet
| Threat | Category | Likelihood | Impact | Risk | Mitigation |
|---|---|---|---|---|---|
| Adversarial input bypass | Model | Medium | High | High | Input validation, robust training |
| Prompt injection | Output | High | Medium | High | Output filtering, prompt engineering |
| Training data poisoning | Data | Low | High | Medium | Data provenance, validation |
| Model extraction | Model | Medium | Medium | Medium | API rate limiting, output perturbation |
| Sensitive data in output | Output | Medium | High | High | Output filtering, content classification |
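One mitigation from the register above, API rate limiting against model extraction, can be sketched as a per-client token bucket. The capacity and refill rate below are illustrative values, not recommendations.

```python
# Illustrative per-client token bucket to slow model-extraction attempts
# against a prediction API. Capacity and refill rate are example values.
import time

class TokenBucket:
    def __init__(self, capacity: int = 60, refill_per_second: float = 1.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Example: one bucket per API key, checked before serving a prediction.
buckets: dict[str, TokenBucket] = {}
def allow_request(api_key: str) -> bool:
    return buckets.setdefault(api_key, TokenBucket()).allow()
```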
Checklist for AI Threat Modeling
- System scope and boundaries defined
- Data flows documented
- Trust boundaries identified
- AI-specific threats enumerated
- STRIDE-AI analysis completed
- Threats prioritized by risk
- Mitigations defined for high/critical threats
- Threat model documented
- Review schedule established
Ready to Secure Your AI Systems?
Book an AI Readiness Audit to get expert threat modeling for your AI.
[Contact Pertama Partners →]
References
- MITRE ATLAS. (2024). "Adversarial Threat Landscape for AI."
- OWASP. (2024). "AI Security and Privacy Guide."
- NIST. (2024). "AI Risk Management Framework."
Frequently Asked Questions
What threats are unique to AI systems?
AI systems face threats including adversarial attacks, model poisoning, extraction attacks, and prompt injection, all of which require AI-specific threat identification and mitigation.
What threats should an AI threat model consider?
Consider adversarial inputs, data poisoning, model extraction, privacy attacks, prompt injection, and supply chain attacks through training data or pretrained models.
How does AI threat modeling fit with existing security practices?
Extend existing threat modeling frameworks (such as STRIDE) to include AI-specific threats. Don't create a separate process; integrate AI threat modeling with enterprise security practices.

