Prevent Data Leakage with ChatGPT at Work – A Complete Guide

Pertama Partners · February 11, 2026 · 8 min read
🇲🇾 Malaysia · 🇸🇬 Singapore

The Data Leakage Risk with ChatGPT

When employees use ChatGPT at work, every prompt they type potentially shares company data with an external service. While enterprise AI plans have stronger data protections, the risk of data leakage is real, and one careless prompt can expose customer information, trade secrets, or confidential business data.

This guide explains the specific risks and practical steps to prevent data leakage.

How Data Leakage Happens

Scenario 1: Direct Input of Sensitive Data

An employee pastes a customer complaint email (including the customer's name, account number, and order details) into ChatGPT to draft a response. The customer's personal data is now processed by an external service.

Scenario 2: Contextual Accumulation

Over multiple prompts, an employee shares enough context about a confidential project (team names, financial targets, strategic plans) that the accumulated information constitutes a confidential briefing.

Scenario 3: Code and Intellectual Property

A developer pastes proprietary source code into ChatGPT for debugging help. The code may contain algorithms, API keys, or business logic that constitutes trade secrets.

Scenario 4: Training Data Concerns

With consumer-tier AI products, user prompts may be used to improve the model. This means sensitive data could theoretically influence future outputs visible to other users. (Enterprise plans typically exclude data from training.)

Data Classification Framework

The first defence against data leakage is a clear data classification system. Every piece of information in your company falls into one of these categories:

Green – Public Data

Information that is already publicly available or intended for public distribution.

  • Published press releases, marketing materials
  • Job listings, company website content
  • Industry statistics and public data
  • General business knowledge

AI Rule: Can be freely used with any AI tool.

Yellow – Internal Data

Information that is not confidential but is meant for internal use only.

  • Internal process documents, SOPs
  • Meeting agendas and non-sensitive notes
  • General project updates (non-strategic)
  • Team communications

AI Rule: May be used with approved enterprise AI tools only (not free-tier consumer products).

Orange – Confidential Data

Information that could harm the company or individuals if disclosed.

  • Financial results (before public release)
  • Strategic plans and competitive intelligence
  • Employee performance data
  • Customer lists and contact databases
  • Pricing strategies

AI Rule: Must be anonymised before use. Remove all identifying details (names, numbers, dates). Use only with approved enterprise AI tools.

Red – Restricted Data

Information that must never enter any external AI system.

  • Personally identifiable information (PII): NRIC, IC, passport numbers
  • Financial data: bank accounts, credit cards, salary details
  • Medical records and health information
  • Legal privileged communications
  • API keys, passwords, access credentials
  • Source code containing proprietary algorithms

AI Rule: NEVER enter into any AI tool, under any circumstances.
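
To make the framework easier to enforce, some teams encode these rules as a simple policy check that runs before a prompt leaves the company network. Below is a minimal sketch in Python; the tier names mirror the categories above, while the function name and the consumer/enterprise split are illustrative assumptions, not a reference implementation.

```python
from enum import Enum

class DataClass(Enum):
    GREEN = "public"
    YELLOW = "internal"
    ORANGE = "confidential"
    RED = "restricted"

class ToolTier(Enum):
    CONSUMER = "consumer"        # free-tier ChatGPT, free Claude, etc.
    ENTERPRISE = "enterprise"    # approved enterprise plan with a DPA

def is_prompt_allowed(data_class: DataClass, tool: ToolTier, anonymised: bool = False) -> bool:
    """Apply the Green/Yellow/Orange/Red rules from the framework above."""
    if data_class is DataClass.GREEN:
        return True                                         # any tool
    if data_class is DataClass.YELLOW:
        return tool is ToolTier.ENTERPRISE                  # approved enterprise tools only
    if data_class is DataClass.ORANGE:
        return tool is ToolTier.ENTERPRISE and anonymised   # enterprise tool, after anonymisation
    return False                                            # RED: never

# Confidential data on an enterprise tool is blocked until it has been anonymised.
print(is_prompt_allowed(DataClass.ORANGE, ToolTier.ENTERPRISE))                   # False
print(is_prompt_allowed(DataClass.ORANGE, ToolTier.ENTERPRISE, anonymised=True))  # True
```

In practice this check would sit inside a prompt proxy, browser extension, or pre-submit hook rather than be run by hand.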

Practical Safeguards

1. Use Enterprise Plans Only

Consumer-tier AI products (free ChatGPT, free Claude) handle data differently from enterprise plans. Key differences:

| Feature | Consumer/Free | Enterprise |
|---|---|---|
| Data used for training | Often yes | Typically no |
| Data retention | Extended | Limited/configurable |
| Admin controls | None | Full |
| Usage monitoring | None | Audit logs |
| Data processing agreement | None | Available |
| Compliance certifications | Limited | SOC 2, ISO 27001 |

2. Implement Technical Controls

  • Block consumer AI websites on corporate networks (allow only enterprise endpoints)
  • Enable data loss prevention (DLP) tools that flag sensitive data in AI prompts (see the sketch after this list)
  • Configure AI tool admin settings to restrict data sharing
  • Enable audit logging for all AI tool usage

3. Train Every Employee

Every employee who uses AI tools must understand:

  • The data classification framework (Green/Yellow/Orange/Red)
  • How to anonymise data before using it with AI
  • Which AI tools are approved (and which are blocked)
  • What to do if they accidentally share sensitive data

4. Create an Anonymisation Checklist

Before pasting any text into an AI tool, check for and remove the following (a scripted version of the mechanical checks is sketched after this list):

  • Personal names → Replace with [Person A], [Employee B]
  • Company names → Replace with [Company X]
  • Account/ID numbers → Remove entirely
  • Contact details (email, phone, address) → Remove
  • Financial figures → Replace with approximations
  • Dates that could identify events → Generalise
  • Location details that narrow identification → Generalise
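
Much of this checklist can be automated. The Python sketch below handles only the mechanical items (ID numbers, contact details, long digit runs, and a caller-supplied list of known names); financial figures, dates, and locations still need human judgement. The patterns and placeholder labels are illustrative assumptions, not an exhaustive redaction tool.

```python
import re

def anonymise(text: str, known_names: dict[str, str]) -> str:
    """Replace obvious identifiers with placeholders before the text goes near an AI tool."""
    # Known people/companies -> placeholders supplied by the caller, e.g. {"Acme Sdn Bhd": "[Company X]"}.
    for name, placeholder in known_names.items():
        text = text.replace(name, placeholder)

    # NRIC/IC-style identifiers and long digit runs (account or order numbers) are removed entirely.
    text = re.sub(r"\b[STFGM]\d{7}[A-Z]\b", "[ID REMOVED]", text)
    text = re.sub(r"\b\d{6}-\d{2}-\d{4}\b", "[ID REMOVED]", text)
    text = re.sub(r"\b\d{8,}\b", "[NUMBER REMOVED]", text)

    # Contact details.
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[EMAIL REMOVED]", text)
    text = re.sub(r"\+?\d[\d \-]{7,}\d", "[PHONE REMOVED]", text)

    return text

sample = "Hi, I'm Jane Tan (S1234567A), order 20231104881, reach me at jane@example.com."
print(anonymise(sample, {"Jane Tan": "[Person A]"}))
```
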

5. Establish Incident Response

When data leakage occurs (or is suspected):

  1. Stop using the AI tool immediately for that session
  2. Document what data was shared (screenshot if possible; a simple record structure is sketched after these steps)
  3. Report to IT Security within 1 hour
  4. IT assesses the severity and determines response steps
  5. Notify affected parties if PII was involved (PDPA requirement)
  6. Update safeguards to prevent recurrence
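
For step 2, capturing every incident in the same structure makes severity assessment and PDPA notification decisions easier to trace. Below is a minimal sketch of such a record in Python; the field names are hypothetical and should be adapted to your own incident process.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class AILeakageIncident:
    reported_by: str             # employee who reported the incident
    tool_used: str               # e.g. "ChatGPT Enterprise"
    data_classification: str     # Green / Yellow / Orange / Red
    description: str             # what was shared (reference the screenshot, don't repeat the data)
    pii_involved: bool           # drives the PDPA notification assessment
    reported_at: datetime = field(default_factory=datetime.now)
    severity: str = "unassessed"                     # set by IT Security
    remediation: list[str] = field(default_factory=list)

incident = AILeakageIncident(
    reported_by="[Employee B]",
    tool_used="ChatGPT (consumer)",
    data_classification="Red",
    description="Customer email pasted into a prompt; screenshot saved to the incident folder.",
    pii_involved=True,
)
print(incident)
```
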

Regulatory Context

Singapore PDPA

The Personal Data Protection Act requires organisations to protect personal data and obtain consent for its use. Inputting personal data into AI tools without proper safeguards may constitute a breach. Penalties can reach S$1 million per breach.

Malaysia PDPA

Malaysia's Personal Data Protection Act similarly requires organisations to safeguard personal data. Sharing personal data with AI services may violate data processing principles if proper consent and safeguards are not in place.

What Good Looks Like

A company with effective AI data protection:

  • Has a written AI usage policy that all employees have read and signed
  • Uses only enterprise-tier AI tools with data processing agreements
  • Trains every employee on data classification and anonymisation
  • Monitors AI tool usage through admin dashboards and audit logs
  • Responds to incidents within 1 hour with a defined process
  • Reviews and updates its AI policy quarterly

Frequently Asked Questions

Can ChatGPT leak company data?

Yes, if employees input sensitive information into AI tools. The risks include direct input of personal data, accumulation of confidential context across prompts, and exposure of intellectual property. Enterprise AI plans provide stronger protections, but employee training and data classification are essential safeguards.

Is ChatGPT Enterprise safe for confidential data?

ChatGPT Enterprise is significantly safer than consumer/free versions. Data is not used for model training, retention is configurable, admin controls are available, and SOC 2 compliance is maintained. However, even with Enterprise, employees must follow data classification guidelines and never input restricted data (PII, credentials, source code).

What should we do if an employee shares sensitive data with an AI tool?

Immediately stop the session, document what was shared, and report to IT Security within 1 hour. If personal data was involved, assess PDPA notification requirements. Then update safeguards to prevent recurrence; this may include additional training, technical controls, or policy updates.
