The Data Leakage Risk with ChatGPT
When employees use ChatGPT at work, every prompt they type potentially shares company data with an external service. While enterprise AI plans have stronger data protections, the risk of data leakage is real — and one careless prompt can expose customer information, trade secrets, or confidential business data.
This guide explains the specific risks and practical steps to prevent data leakage.
How Data Leakage Happens
Scenario 1: Direct Input of Sensitive Data
An employee pastes a customer complaint email (including the customer's name, account number, and order details) into ChatGPT to draft a response. The customer's personal data is now processed by an external service.
Scenario 2: Contextual Accumulation
Over multiple prompts, an employee shares enough context about a confidential project — team names, financial targets, strategic plans — that the accumulated information constitutes a confidential briefing.
Scenario 3: Code and Intellectual Property
A developer pastes proprietary source code into ChatGPT for debugging help. The code may contain algorithms, API keys, or business logic that constitutes trade secrets.
Scenario 4: Training Data Concerns
With consumer-tier AI products, user prompts may be used to improve the model. This means sensitive data could theoretically influence future outputs visible to other users. (Enterprise plans typically exclude data from training.)
Data Classification Framework
The first defence against data leakage is a clear data classification system. Every piece of information in your company falls into one of these categories:
Green — Public Data
Information that is already publicly available or intended for public distribution.
- Published press releases, marketing materials
- Job listings, company website content
- Industry statistics and public data
- General business knowledge
AI Rule: Can be freely used with any AI tool.
Yellow — Internal Data
Information that is not confidential but is meant for internal use only.
- Internal process documents, SOPs
- Meeting agendas and non-sensitive notes
- General project updates (non-strategic)
- Team communications
AI Rule: May be used with approved enterprise AI tools only (not free-tier consumer products).
Orange — Confidential Data
Information that could harm the company or individuals if disclosed.
- Financial results (before public release)
- Strategic plans and competitive intelligence
- Employee performance data
- Customer lists and contact databases
- Pricing strategies
AI Rule: Must be anonymised before use. Remove all identifying details (names, numbers, dates). Use only with approved enterprise AI tools.
Red — Restricted Data
Information that must never enter any external AI system.
- Personally identifiable information (PII): NRIC, IC, passport numbers
- Financial data: bank accounts, credit cards, salary details
- Medical records and health information
- Legal privileged communications
- API keys, passwords, access credentials
- Source code containing proprietary algorithms
AI Rule: NEVER enter into any AI tool, under any circumstances.
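To apply the framework consistently in tooling (prompt gateways, DLP checks, training materials), the four tiers can be encoded as a single lookup. The sketch below is a hypothetical Python illustration, not part of any specific product; the tier names and rules mirror the framework above, and the helper name `rule_for` is invented for the example.

```python
# Hypothetical sketch: encoding the Green/Yellow/Orange/Red tiers so other
# tooling (prompt gateways, DLP checks) can reference one source of truth.
from enum import Enum


class DataTier(Enum):
    GREEN = "public"          # freely usable with any AI tool
    YELLOW = "internal"       # approved enterprise AI tools only
    ORANGE = "confidential"   # anonymise first; enterprise tools only
    RED = "restricted"        # never enter into any AI tool


AI_RULES = {
    DataTier.GREEN: "Can be freely used with any AI tool.",
    DataTier.YELLOW: "Approved enterprise AI tools only (not free-tier consumer products).",
    DataTier.ORANGE: "Anonymise before use; approved enterprise AI tools only.",
    DataTier.RED: "Never enter into any AI tool, under any circumstances.",
}


def rule_for(tier: DataTier) -> str:
    """Return the AI usage rule for a classification tier."""
    return AI_RULES[tier]


if __name__ == "__main__":
    print(rule_for(DataTier.ORANGE))
```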
Practical Safeguards
1. Use Enterprise Plans Only
Consumer-tier AI products (free ChatGPT, free Claude) have different data handling practices than enterprise plans. Key differences:
| Feature | Consumer/Free | Enterprise |
|---|---|---|
| Data used for training | Often yes | Typically no |
| Data retention | Extended | Limited/configurable |
| Admin controls | None | Full |
| Usage monitoring | None | Audit logs |
| Data processing agreement | None | Available |
| Compliance certifications | Limited | SOC 2, ISO 27001 |
2. Implement Technical Controls
- Block consumer AI websites on corporate networks (allow only enterprise endpoints)
- Enable data loss prevention (DLP) tools that flag sensitive data in AI prompts (a simplified illustration follows this list)
- Configure AI tool admin settings to restrict data sharing
- Enable audit logging for all AI tool usage
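As an illustration of the DLP idea above, the sketch below shows the kind of pattern check a gateway might run on an outgoing prompt before it leaves the network. The regexes (NRIC, card numbers, email addresses, API-key-like strings) are simplified stand-ins for this example; real DLP products use far richer detection.

```python
# Hypothetical sketch: a simplified DLP-style check that flags obvious
# Red-tier patterns in a prompt before it is sent to an external AI tool.
# These regexes are illustrative only, not production detection rules.
import re

RED_TIER_PATTERNS = {
    "Singapore NRIC/FIN": re.compile(r"\b[STFGM]\d{7}[A-Z]\b"),
    "Credit card number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "Email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "Possible API key": re.compile(r"\b(?:sk|api|key)[-_][A-Za-z0-9]{16,}\b", re.I),
}


def flag_prompt(prompt: str) -> list[str]:
    """Return the names of any Red-tier patterns found in the prompt."""
    return [name for name, pattern in RED_TIER_PATTERNS.items() if pattern.search(prompt)]


if __name__ == "__main__":
    findings = flag_prompt("Draft a reply to S1234567A about card 4111 1111 1111 1111")
    if findings:
        print("Prompt blocked, contains:", ", ".join(findings))
```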
3. Train Every Employee
Every employee who uses AI tools must understand:
- The data classification framework (Green/Yellow/Orange/Red)
- How to anonymise data before using it with AI
- Which AI tools are approved (and which are blocked)
- What to do if they accidentally share sensitive data
4. Create an Anonymisation Checklist
Before pasting any text into an AI tool, check for and remove:
- Personal names → Replace with [Person A], [Employee B]
- Company names → Replace with [Company X]
- Account/ID numbers → Remove entirely
- Contact details (email, phone, address) → Remove
- Financial figures → Replace with approximations
- Dates that could identify events → Generalise
- Location details that narrow identification → Generalise
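A first pass over this checklist can be automated, although personal names, company names, and dates usually still need human review (or a proper named-entity model). The following sketch is a hypothetical Python example; the patterns and placeholder labels are assumptions for illustration only.

```python
# Hypothetical sketch: a first-pass scrub covering the checklist items that a
# regex can catch reliably (IDs, contact details, long number strings).
# Personal names, company names, and dates still need manual review.
import re

REPLACEMENTS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[email removed]"),
    (re.compile(r"\b[STFGM]\d{7}[A-Z]\b"), "[ID removed]"),
    (re.compile(r"\b\+?\d[\d\s-]{6,}\d\b"), "[number removed]"),  # phone / account / order numbers
]


def scrub(text: str) -> str:
    """Apply the regex replacements; a human reviews the result before pasting."""
    for pattern, placeholder in REPLACEMENTS:
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    sample = "Customer Tan Ah Kow (S1234567A, order 20231145, tan@example.com) is unhappy."
    print(scrub(sample))
    # Output still contains the customer's name, which must be replaced manually.
```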
5. Establish Incident Response
When data leakage occurs (or is suspected):
- Stop using the AI tool immediately for that session
- Document what data was shared (screenshot if possible)
- Report to IT Security within 1 hour
- IT assesses the severity and determines response steps
- Notify affected parties if PII was involved (PDPA requirement)
- Update safeguards to prevent recurrence
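To keep reports to IT Security consistent, the details these steps call for can be captured in a standard incident record. The sketch below is one hypothetical way to structure it; the field names and summary format are assumptions, not a prescribed template.

```python
# Hypothetical sketch: a minimal incident record capturing the fields the
# response steps above require, so every report reaches IT Security in the
# same shape. The data itself is described, never repeated, in the record.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AILeakageIncident:
    reporter: str
    ai_tool: str              # e.g. "ChatGPT (consumer)" or "ChatGPT Enterprise"
    data_description: str     # what was shared, described without the data itself
    contains_pii: bool        # triggers the PDPA notification assessment
    reported_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def summary(self) -> str:
        pdpa = "PDPA review required" if self.contains_pii else "No PII reported"
        return f"[{self.reported_at:%Y-%m-%d %H:%M} UTC] {self.ai_tool}: {self.data_description} ({pdpa})"


if __name__ == "__main__":
    incident = AILeakageIncident(
        reporter="j.lim",
        ai_tool="ChatGPT (consumer)",
        data_description="Customer complaint email pasted with name and order number",
        contains_pii=True,
    )
    print(incident.summary())
```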
Regulatory Context
Singapore PDPA
The Personal Data Protection Act requires organisations to protect personal data and obtain consent for its use. Inputting personal data into AI tools without proper safeguards may constitute a breach. Financial penalties can reach S$1 million or, for organisations with annual local turnover above S$10 million, up to 10% of that turnover, whichever is higher.
Malaysia PDPA
Malaysia's Personal Data Protection Act similarly requires organisations to safeguard personal data. Sharing personal data with AI services may violate data processing principles if proper consent and safeguards are not in place.
What Good Looks Like
A company with effective AI data protection:
- Has a written AI usage policy that all employees have read and signed
- Uses only enterprise-tier AI tools with data processing agreements
- Trains every employee on data classification and anonymisation
- Monitors AI tool usage through admin dashboards and audit logs
- Responds to incidents within 1 hour with a defined process
- Reviews and updates its AI policy quarterly
Related Reading
- ChatGPT Company Policy — Build a comprehensive ChatGPT usage policy
- AI Risk Assessment Template — Identify and mitigate risks from AI use in your organisation
- Copilot Governance & Access — Enterprise-grade governance for Microsoft Copilot
Frequently Asked Questions
Is ChatGPT a data leakage risk for companies?
Yes, if employees input sensitive information into AI tools. The risks include direct input of personal data, accumulation of confidential context across prompts, and exposure of intellectual property. Enterprise AI plans provide stronger protections, but employee training and data classification are essential safeguards.
Is ChatGPT Enterprise safe for company data?
ChatGPT Enterprise is significantly safer than consumer/free versions. Data is not used for model training, retention is configurable, admin controls are available, and SOC 2 compliance is maintained. However, even with Enterprise, employees must follow data classification guidelines — do not input restricted data (PII, credentials, source code).
What should we do if sensitive data is accidentally shared with an AI tool?
Immediately stop the session, document what was shared, and report to IT Security within 1 hour. If personal data was involved, assess PDPA notification requirements. Then update safeguards to prevent recurrence — this may include additional training, technical controls, or policy updates.
