What is Proxy Discrimination?
Proxy Discrimination is a form of AI bias where an algorithm produces discriminatory outcomes against protected groups by using seemingly neutral data features that are strongly correlated with characteristics such as race, gender, age, or religion, even when those protected characteristics are not directly included in the model.
Proxy Discrimination occurs when an AI system discriminates against particular groups of people not by directly using protected characteristics like race, gender, or religion, but by relying on other data features that are closely correlated with those characteristics. The AI system may never see a person's ethnicity, but if it uses their postcode, and that postcode strongly correlates with ethnicity, the outcome can be just as discriminatory.
For business leaders, this is one of the most important and least understood risks in AI deployment. Many organisations believe they have addressed discrimination by removing protected characteristics from their AI training data. Proxy discrimination reveals why this approach is insufficient.
How Proxy Discrimination Works
Consider a lending AI that is designed to predict creditworthiness. The organisation removes race and ethnicity from the training data, believing this prevents racial discrimination. However, the model still uses features like residential postcode, type of phone, and employment history. In many countries, including those in Southeast Asia, these features are strongly correlated with ethnicity and socioeconomic status due to historical patterns of segregation and inequality.
The AI system learns these correlations from the training data. It discovers that applicants from certain postcodes have historically had higher default rates, not because of any inherent characteristic of those individuals, but because of systemic factors like limited access to financial services, education, and employment opportunities. The model then penalises applicants from those areas, effectively discriminating by ethnicity without ever seeing ethnicity data.
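The effect can be demonstrated in a few lines of code. The sketch below uses synthetic data and scikit-learn's LogisticRegression; the column names, correlation strengths, and data-generating assumptions are purely illustrative, but they show how a model that never sees ethnicity can still produce approval rates that differ sharply by ethnic group.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic illustration: ethnicity is never given to the model, but postcode is
# strongly correlated with it, and the historical labels reflect systemic inequality.
rng = np.random.default_rng(42)
n = 10_000
ethnicity = rng.integers(0, 2, n)                                   # protected attribute (held out of training)
postcode = np.where(rng.random(n) < 0.8, ethnicity, 1 - ethnicity)  # proxy roughly 80% aligned with ethnicity
income = rng.normal(50 - 10 * ethnicity, 15, n)                     # historical income gap in the data
defaulted = (rng.random(n) < 0.15 + 0.10 * ethnicity).astype(int)   # biased historical default labels

X = pd.DataFrame({"postcode": postcode, "income": income})          # note: no ethnicity column
model = LogisticRegression().fit(X, defaulted)

# Approval rates still differ by the protected group the model never saw.
approved = model.predict_proba(X)[:, 1] < 0.20                      # approve if predicted default risk < 20%
print(pd.Series(approved).groupby(ethnicity).mean())
```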
Common proxy variables include:
- Geographic location: Postcodes and addresses often correlate strongly with race, ethnicity, and socioeconomic status.
- Language preferences: Language choice in multilingual societies can serve as a proxy for ethnic background.
- Educational institutions: The school or university attended may correlate with socioeconomic background and ethnicity.
- Employment history: Types of employment, employer names, and career patterns can reflect systemic inequalities.
- Digital behaviour: Device types, app usage patterns, and online activity can correlate with age, income, and demographic characteristics.
Why Proxy Discrimination is Particularly Dangerous
Proxy discrimination is insidious because it is easy to miss. The AI system's developers may genuinely believe they have built a fair system because they removed direct references to protected characteristics. Traditional fairness checks that only look for the presence of protected attributes in the model will not catch proxy effects.
The harm is real and measurable. Proxy discrimination can result in:
- Unfair denial of services: Qualified individuals from disadvantaged groups being denied credit, insurance, employment, or other opportunities.
- Reinforcement of inequality: AI systems that perpetuate historical patterns of discrimination rather than correcting them.
- Legal liability: Regulators and courts in many jurisdictions recognise disparate impact, where a neutral practice disproportionately affects a protected group, as a form of discrimination regardless of intent.
Proxy Discrimination in Southeast Asia
Southeast Asia's diverse societies create numerous opportunities for proxy discrimination. The region's ethnic, linguistic, and religious diversity means that many commonly used data features can serve as proxies for protected characteristics.
In Malaysia, for example, names and residential areas can be strong proxies for ethnic background. In Indonesia, language preferences and geographic location may correlate with religious affiliation. In Singapore, despite its strong multiracial policies, residential patterns and educational background can still serve as proxy variables.
Regulators across the region are beginning to pay attention. Singapore's Model AI Governance Framework emphasises the importance of testing for bias, including indirect discrimination. The ASEAN Guide on AI Governance and Ethics calls for fairness assessments that go beyond surface-level attribute removal.
For businesses operating across multiple ASEAN markets, the risk of proxy discrimination is compounded by the need to understand local demographic patterns in each market. A feature that is a harmless predictor in one country may be a powerful proxy for a protected characteristic in another.
Detecting and Preventing Proxy Discrimination
1. Correlation Analysis
Systematically analyse the correlation between your model's input features and protected characteristics. Identify features that are strong proxies and assess whether their predictive value justifies the discrimination risk.
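As a minimal sketch, assuming applicant data in a pandas DataFrame with the protected attribute available as a numeric column (held in a secured analysis environment, not in the model), correlation analysis might look like the following. The column names and the 0.3 flagging threshold are illustrative.

```python
import pandas as pd

def flag_proxy_features(df: pd.DataFrame, protected_col: str, threshold: float = 0.3) -> pd.Series:
    """Return the absolute correlation between each numeric feature and a protected
    attribute, keeping only features above a review threshold. The 0.3 cut-off is
    illustrative; categorical features need an association measure such as Cramér's V."""
    protected = df[protected_col]
    features = df.drop(columns=[protected_col]).select_dtypes("number")
    correlations = features.corrwith(protected).abs().sort_values(ascending=False)
    return correlations[correlations >= threshold]

# Example usage with hypothetical column names:
# proxies = flag_proxy_features(applicants, protected_col="ethnicity_code")
# print(proxies)
```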
2. Disparate Impact Testing
Test your model's outcomes across demographic groups, even if the model does not use demographic data directly. If outcomes differ significantly across groups, proxy discrimination may be present.
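One widely used yardstick is the disparate impact ratio, sometimes called the four-fifths rule. The sketch below assumes binary favourable/unfavourable outcomes and a pandas Series of demographic group labels; the 0.8 threshold is a common convention, not a legal standard in every jurisdiction.

```python
import pandas as pd

def disparate_impact_ratio(outcomes: pd.Series, groups: pd.Series) -> float:
    """Ratio of the lowest group's favourable-outcome rate to the highest group's.
    Outcomes are assumed to be coded 1 = favourable, 0 = unfavourable. A value below
    roughly 0.8 (the 'four-fifths rule') is a common warning sign, not a legal verdict."""
    rates = outcomes.groupby(groups).mean()
    return rates.min() / rates.max()

# Example usage with hypothetical data:
# ratio = disparate_impact_ratio(decisions["approved"], applicants["ethnic_group"])
# if ratio < 0.8:
#     print(f"Potential disparate impact: ratio = {ratio:.2f}")
```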
3. Feature Importance Review
Examine which features your model relies on most heavily. If high-importance features are strongly correlated with protected characteristics, this warrants closer investigation.
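A sketch of how this review might be organised, assuming a fitted scikit-learn estimator that exposes feature_importances_ (such as a tree ensemble) and the proxy-correlation scores from the earlier analysis; the function and variable names are hypothetical.

```python
import pandas as pd

def review_feature_importance(model, feature_names, proxy_correlations: pd.Series) -> pd.DataFrame:
    """Combine a fitted model's feature importances with proxy-correlation scores so
    that high-importance, high-correlation features are reviewed first. Assumes a
    scikit-learn estimator exposing `feature_importances_` (e.g. a tree ensemble) and
    the correlation scores produced by the earlier proxy analysis."""
    report = pd.DataFrame({
        "importance": pd.Series(model.feature_importances_, index=feature_names),
        "proxy_correlation": proxy_correlations.reindex(feature_names).fillna(0.0),
    })
    return report.sort_values("importance", ascending=False)

# Example usage:
# report = review_feature_importance(credit_model, X.columns, proxies)
# print(report.head(10))  # the features most in need of scrutiny
```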
4. Counterfactual Testing
Test how the model's decisions would change if an individual belonged to a different demographic group while all other characteristics remained the same. Significant differences suggest proxy effects.
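A simple version of this test, assuming a fitted scikit-learn classifier and tabular features in pandas; the postcode column and the "alternative value" are illustrative stand-ins for whichever demographic attribute or proxy you are probing.

```python
import pandas as pd

def counterfactual_shift(model, X: pd.DataFrame, column: str, alternative_value) -> pd.Series:
    """Replace one attribute with an alternative value for every applicant, holding all
    other features fixed, and measure how far predicted scores move. Assumes a fitted
    scikit-learn classifier; the column being altered is whichever proxy or demographic
    attribute is under scrutiny."""
    X_alt = X.copy()
    X_alt[column] = alternative_value
    original = model.predict_proba(X)[:, 1]
    altered = model.predict_proba(X_alt)[:, 1]
    return pd.Series(altered - original, index=X.index)

# Example usage: large shifts when only the postcode changes suggest the model is
# leaning on a proxy rather than on genuine creditworthiness signals.
# shifts = counterfactual_shift(credit_model, X, column="postcode", alternative_value=0)
# print(shifts.abs().mean(), shifts.abs().quantile(0.95))
```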
5. Ongoing Monitoring
Proxy discrimination can emerge over time as data patterns shift. Implement continuous monitoring of model outcomes across demographic groups to catch proxy effects that develop after deployment. Establish automated alerts that flag statistically significant disparities in outcomes across demographic segments, and assign responsibility for investigating and resolving these alerts promptly.
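A sketch of such a monitoring check, assuming decision logs with outcome and group columns; the chi-square test, the 0.8 ratio threshold, and the 0.05 significance level are illustrative defaults that should be agreed with legal and compliance teams.

```python
import pandas as pd
from scipy.stats import chi2_contingency

def outcome_disparity_alert(outcomes: pd.Series, groups: pd.Series,
                            ratio_threshold: float = 0.8, p_threshold: float = 0.05) -> bool:
    """Monitoring check for a batch of decisions: flag when favourable-outcome rates
    differ across demographic groups both materially (disparate impact ratio below the
    threshold) and with statistical significance (chi-square test of independence)."""
    rates = outcomes.groupby(groups).mean()
    disparity = rates.min() / rates.max()
    _, p_value, _, _ = chi2_contingency(pd.crosstab(groups, outcomes))
    return disparity < ratio_threshold and p_value < p_threshold

# Example: run on each day's or week's decisions and route alerts to a named owner.
# if outcome_disparity_alert(batch["approved"], batch["ethnic_group"]):
#     notify_fairness_owner(batch)  # hypothetical alerting hook
```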
Proxy Discrimination represents one of the most significant and underappreciated risks in AI deployment. It can cause real harm to individuals, expose your organisation to legal liability, and damage your reputation, all while your team believes the system is fair because they removed protected characteristics from the data.
For businesses in Southeast Asia, the risk is amplified by the region's demographic diversity. The same AI system may exhibit different proxy effects in different markets, making centralised fairness testing insufficient. Each market requires its own analysis of which features may serve as proxies for protected characteristics.
The financial implications are serious. Regulatory fines for discriminatory practices, lawsuits from affected individuals, and the cost of rebuilding systems found to be discriminatory can far exceed the investment required to detect and prevent proxy discrimination proactively. Beyond direct costs, discriminatory AI systems exclude potential customers and employees, limiting your market reach and talent pool.
Key Takeaways
- Removing protected characteristics from AI training data is necessary but not sufficient to prevent discrimination. You must also test for proxy effects.
- Conduct correlation analysis between model input features and protected characteristics in each market you operate in, as proxy relationships vary across Southeast Asian countries.
- Implement disparate impact testing that evaluates model outcomes across demographic groups regardless of whether demographic data is used as a model input.
- Review your model's most influential features regularly to check whether they serve as proxies for protected characteristics.
- Engage local expertise in each ASEAN market to understand which data features may correlate with protected characteristics in that specific cultural and demographic context.
- Build proxy discrimination testing into your AI development lifecycle rather than treating it as a post-deployment audit.
Frequently Asked Questions
If we remove race and gender from our AI model, is that enough to prevent discrimination?
No, and this is one of the most common misconceptions in AI fairness. Removing protected characteristics from the model inputs does not prevent discrimination because other features in the data may be strongly correlated with those characteristics. Postcodes, language preferences, educational background, and many other seemingly neutral variables can serve as proxies for race, gender, or other protected attributes. You need to actively test for discriminatory outcomes across demographic groups, not just verify the absence of protected attributes in your model.
How do we detect proxy discrimination in practice?
Detection requires a combination of approaches. First, conduct correlation analysis to identify which of your model's input features are correlated with protected characteristics. Second, perform disparate impact testing by evaluating your model's outcomes across demographic groups. Third, use feature importance analysis to understand which variables most influence the model's decisions. Fourth, apply counterfactual testing to see if decisions change when demographic characteristics are hypothetically altered. These tests should be performed during development and repeated regularly after deployment.
Are there laws in Southeast Asia that address proxy discrimination?
While no ASEAN country has AI-specific anti-discrimination legislation addressing proxy effects directly, several countries have broader anti-discrimination protections that could apply. Singapore's constitution prohibits discrimination on grounds of race, religion, and origin. Malaysia and Indonesia have constitutional equality provisions. The concept of disparate impact, where a neutral practice disproportionately affects a protected group, is gaining recognition in the region. As AI governance frameworks mature, explicit prohibitions on proxy discrimination are likely to emerge.
Need help addressing Proxy Discrimination?
Pertama Partners helps businesses across Southeast Asia adopt AI strategically. Let's discuss how guarding against proxy discrimination fits into your AI roadmap.