Third-party AI introduces third-party risk. This guide provides a systematic methodology for conducting security audits of AI vendors.
Executive Summary
- Vendor AI extends your risk surface — Their security gaps become your security gaps
- AI adds unique risks — Model security, data handling, and AI-specific vulnerabilities
- Audit before and after — Pre-contract assessment plus ongoing monitoring
- Right-size to risk — Audit depth should match vendor risk level
- Document findings — Audit evidence protects you and enables vendor management
- Remediation matters — Findings without follow-up are wasted effort
- Continuous relationship — Security isn't one-time; it's ongoing
AI Vendor Security Audit Methodology
Phase 1: Scoping
Determine audit depth based on:
- Sensitivity of data shared with vendor
- Criticality of AI service to your operations
- Regulatory requirements
- Previous audit findings
- Contract requirements
Audit types:
| Type | Depth | When to Use |
|---|---|---|
| Documentation review | Low | Low-risk vendors, renewal assessments |
| Questionnaire-based | Medium | Standard vendors, initial assessment |
| Remote audit | Medium-High | Significant vendors, verification needed |
| On-site audit | High | Critical vendors, regulatory requirement |
| Third-party attestation | Variable | When vendor provides SOC2/ISO27001 |
Phase 2: Documentation Request
Request from vendor:
- Security policies and procedures
- Architecture documentation
- Data flow diagrams
- Access control documentation
- Incident response plan
- Business continuity plan
- Compliance certifications (SOC2, ISO27001)
- Penetration test results
- Previous audit reports
Phase 3: Assessment Areas
1. Data Protection
- How is your data encrypted at rest and in transit?
- Where is data stored (geographic location)?
- Who has access to your data?
- How is data isolated from other customers?
- What is the data retention and deletion policy?
2. Model Security
- How are models protected from unauthorized access?
- Is there protection against model extraction attacks?
- How is training data secured?
- What controls exist against adversarial inputs?
3. Access Control
- How is authentication managed?
- What is the authorization model?
- How are privileged accounts managed?
- Is multi-factor authentication required?
4. Logging and Monitoring
- What events are logged?
- How long are logs retained?
- Is there security monitoring?
- Can you access audit logs for your data?
5. Incident Response
- What is the incident response process?
- What are notification timelines?
- How will you be informed of incidents affecting your data?
- What is the breach notification process?
6. Business Continuity
- What is the vendor's RTO/RPO?
- Is there a disaster recovery plan?
- How often is it tested?
- What happens to your data if the vendor fails?
7. Personnel Security
- What background checks are performed?
- What security training is provided?
- How is access revoked when employees leave?
8. Third-Party Risk
- Does the vendor use subprocessors?
- How are subprocessors assessed?
- Are subprocessors disclosed?
Phase 4: Testing (if applicable)
For higher-risk vendors, consider:
- Verification of controls described in documentation
- Technical testing (with vendor permission)
- Review of actual configurations
- Interviews with key personnel
Phase 5: Finding Documentation
For each finding:
- Description of the gap or issue
- Risk level (Critical, High, Medium, Low)
- Evidence or observation
- Recommendation
- Vendor response
- Agreed remediation and timeline
Phase 6: Remediation Tracking
- Establish remediation timelines
- Define verification approach
- Track remediation progress
- Verify closure
AI Vendor Security Audit Checklist
Data Protection
- Encryption at rest verified
- Encryption in transit verified
- Data location documented
- Access controls reviewed
- Data isolation confirmed
- Retention/deletion policy documented
Model Security
- Model access controls reviewed
- Training data protection verified
- Adversarial input controls assessed
- Model extraction protections in place
Access Control
- Authentication mechanisms reviewed
- MFA requirement confirmed
- Privileged access management assessed
- Access review process documented
Logging and Monitoring
- Security logging in place
- Log retention adequate
- Monitoring capabilities verified
- Audit log access available
Incident Response
- Incident response plan reviewed
- Notification timelines acceptable
- Breach notification process documented
- Contact information current
Compliance
- SOC2 report reviewed (if available)
- ISO27001 certification verified (if claimed)
- PDPA compliance confirmed
- Industry-specific requirements met
Contractual
- Security terms in contract
- Right to audit preserved
- Data processing agreement in place
- Exit provisions documented
Common Audit Findings
1. Inadequate Data Isolation — Customer data not properly segregated.
2. Weak Access Controls — Excessive access, no MFA for administrative accounts.
3. Missing Encryption — Data not encrypted at rest or in certain transit paths.
4. Insufficient Logging — Security events not logged or retained inadequately.
5. Incomplete Incident Response — No clear customer notification process.
6. Subprocessor Opacity — Vendor uses subprocessors not disclosed or assessed.
Comprehensive Vendor Security Assessment Framework for Generative Platforms
Evaluating generative technology vendors requires augmenting traditional third-party risk management questionnaires with categories specific to large language model deployments. Standard frameworks like SIG Lite, CAIQ (Consensus Assessment Initiative Questionnaire), and VSAQ (Vendor Security Assessment Questionnaire) address infrastructure security but omit critical model-specific risk dimensions.
Data Handling and Retention Policies. Auditors should verify whether submitted prompts and completions are retained by the vendor, used for model training, accessible to vendor employees, or subject to government disclosure requirements. OpenAI's ChatGPT Enterprise contractually commits to zero training data retention. Anthropic's Claude Enterprise provides comparable guarantees. Google Gemini Enterprise and Microsoft Copilot inherit organizational data handling commitments from their respective cloud platform agreements (Google Cloud and Azure).
Model Provenance and Training Documentation. Request documentation covering training data sourcing methodologies, known capability limitations, hallucination benchmarking results, and bias evaluation conducted against protected characteristic categories relevant to your operational jurisdictions. Vendors should provide model cards or equivalent technical documentation following standards proposed by Google Research (Mitchell et al., 2019) and adopted by Hugging Face as community best practice.
Due Diligence Checklist: Twenty-Five Critical Evaluation Points
Infrastructure Security (Points 1-8).
- SOC 2 Type II certification — current report dated within twelve months
- ISO 27001 certification with Statement of Applicability covering cloud services
- Penetration testing conducted by qualified third parties (NCC Group, Bishop Fox, Mandiant, CrowdStrike) within preceding six months
- Encryption standards — AES-256 for data at rest, TLS 1.3 for transit
- Geographic data residency options and contractual commitments
- Business continuity and disaster recovery procedures with documented RTO/RPO targets
- Incident response plan with defined notification timelines
- Subprocessor management — complete listing with change notification procedures
Model-Specific Security (Points 9-16). 9. Prompt injection vulnerability testing and mitigation controls 10. Output filtering mechanisms for harmful, biased, or confidential content 11. Jailbreak resistance evaluation methodology and frequency 12. Training data contamination controls preventing memorization of sensitive inputs 13. Model versioning and rollback capabilities 14. Adversarial robustness testing against published attack taxonomies (OWASP LLM Top 10, MITRE ATLAS) 15. Guardrail configuration options for enterprise administrators 16. Audit logging granularity — conversation-level, user-level, department-level tracking
Contractual and Compliance (Points 17-25). 17. Data Processing Agreement availability with GDPR-standard contractual clauses 18. PDPA compliance documentation for Singapore, Malaysia, Thailand, and Philippines jurisdictions 19. Intellectual property indemnification provisions covering generated outputs 20. Service Level Agreement with defined uptime commitments and financial remedies 21. Insurance coverage — professional liability and cyber insurance minimum thresholds 22. Regulatory examination cooperation commitments for supervised financial institutions 23. Exit planning provisions including data extraction, migration support, and contract termination timelines 24. Vendor financial stability assessment — evaluate funding runway, revenue diversification, and customer concentration risk 25. Reference customer verification — request three comparable industry deployments for direct reference conversations
Scoring Methodology and Decision Thresholds
Assign weighted scores across the twenty-five evaluation points based on organizational risk appetite. Pertama Partners recommends requiring minimum seventy-five percent compliance across all three categories before proceeding with vendor engagement, with mandatory full compliance on points 1, 4, 10, 17, and 20 as non-negotiable prerequisites regardless of overall scoring outcomes.
Audit practitioners deploy Securiti DataControls scanning engines alongside BigID discovery classifiers performing automated PII detection across vendor-hosted S3 buckets, Azure Blob containers, and Google Cloud Storage repositories. Questionnaire frameworks extend beyond SIG Lite and CAIQ instruments through incorporating HECVAT (Higher Education Community Vendor Assessment Toolkit) and VSAQ (Vendor Security Assessment Questionnaire) templates calibrated for sector-specific threat landscapes. Penetration testing scopes reference PTES (Penetration Testing Execution Standard) and OSSTMM (Open Source Security Testing Methodology Manual) taxonomies distinguishing black-box, grey-box, and crystal-box engagement boundaries. Vendors domiciled across Hyderabad, Krakow, and Guadalajara present jurisdictional transfer risk considerations requiring Schrems II supplementary measures assessment under European Data Protection Board guidance documentation, particularly regarding governmental surveillance adequacy determinations.
Practical Next Steps
To put these insights into practice for conducting an ai vendor security audit, consider the following action items:
- Establish a cross-functional governance committee with clear decision-making authority and regular review cadences.
- Document your current governance processes and identify gaps against regulatory requirements in your operating markets.
- Create standardized templates for governance reviews, approval workflows, and compliance documentation.
- Schedule quarterly governance assessments to ensure your framework evolves alongside regulatory and organizational changes.
- Build internal governance capabilities through targeted training programs for stakeholders across different business functions.
Effective governance structures require deliberate investment in organizational alignment, executive accountability, and transparent reporting mechanisms. Without these foundational elements, governance frameworks remain theoretical documents rather than living operational systems.
Common Questions
AI audits must examine training data handling, model security, prompt injection defenses, and AI-specific incident response—areas not covered in traditional software security assessments.
Assess data handling practices, model security, API security, access controls, incident response, compliance certifications, and contract terms for audit rights.
Conduct initial assessment before deployment, annual reviews, and additional assessments after significant changes or incidents. Risk-based frequency for different vendors.
References
- AI Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology (NIST) (2023). View source
- Cybersecurity Framework (CSF) 2.0. National Institute of Standards and Technology (NIST) (2024). View source
- ISO/IEC 27001:2022 — Information Security Management. International Organization for Standardization (2022). View source
- ISO/IEC 42001:2023 — Artificial Intelligence Management System. International Organization for Standardization (2023). View source
- OWASP Top 10 for Large Language Model Applications 2025. OWASP Foundation (2025). View source
- Model AI Governance Framework (Second Edition). PDPC and IMDA Singapore (2020). View source
- EU AI Act — Regulatory Framework for Artificial Intelligence. European Commission (2024). View source

