AI Governance & Risk Management · Guide

AI Bias Audit Requirements: Testing and Documentation Standards

January 13, 2025 · 13 min read · Michael Lansdowne Hauge
For: Legal/Compliance, Consultant, CTO/CIO, CFO, Data Science/ML, CISO, IT Manager, CHRO, Head of Operations

Complete guide to mandatory bias testing requirements across NYC Local Law 144, EU AI Act, and emerging state regulations. Practical audit frameworks, documentation standards, and compliance strategies for AI systems.


Key Takeaways

  1. Mandatory bias audits are rapidly expanding from NYC Local Law 144 to the EU AI Act and new US state laws, especially for employment, credit, and housing use cases.
  2. NYC emphasizes annual independent audits of AEDTs, while the EU AI Act requires continuous, lifecycle-based bias testing and post-market monitoring.
  3. The four-fifths rule (80% impact ratio) remains a central benchmark for disparate impact analysis in US employment contexts.
  4. Effective audits require high-quality demographic data, robust statistical methods, and clear documentation of data governance and model design.
  5. Detecting bias creates an obligation to remediate through technical, procedural, and organizational interventions, followed by re-testing and monitoring.

When New York City's Local Law 144 took effect in 2023, the city became the first jurisdiction in the United States to mandate independent bias audits for automated employment decision tools. That law was a signal, not an endpoint. The EU AI Act now extends bias testing requirements to all high-risk AI systems across the full product lifecycle. California's AB 331, Illinois's HB 3773, and federal EEOC guidance are layering additional obligations onto organizations that deploy AI in hiring, lending, housing, and healthcare. For leaders navigating this patchwork of regulation, the question is no longer whether bias auditing will be required. It is whether your organization can demonstrate compliance before enforcement catches up.

This guide provides practical implementation strategies for meeting bias audit requirements across jurisdictions while building systems that are genuinely fair and equitable.

What Is a Bias Audit?

Definition and Scope

A bias audit is a systematic evaluation of an AI system designed to identify and measure disparate impact across protected demographic groups. The process begins with statistical analysis: quantitative measurement of selection rates, false positive and false negative rates, and other fairness metrics across race, ethnicity, sex, and additional protected characteristics. Auditors then conduct comparative assessments, comparing outcomes between protected groups and baseline populations to surface statistically significant disparities.

The technical evaluation goes deeper, examining training data, model architecture, feature selection, and algorithmic design for embedded sources of bias. Documentation review verifies that fairness considerations were integrated throughout development, testing, and deployment. Finally, an impact assessment evaluates real-world outcomes and the potential harms to affected populations.

The critical distinction is that bias audits focus on outcomes (who gets hired, approved, or selected) rather than intentions. Even systems designed with the best intentions can produce discriminatory outcomes when trained on biased data or deployed without proper fairness constraints.

Types of Bias Testing

Different regulatory frameworks demand different testing approaches. Disparate impact testing, required under NYC Local Law 144 and EEOC guidance, performs statistical analysis of selection rates across demographic groups. The EU AI Act calls for broader fairness metrics evaluation, measuring false positive rates, false negative rates, calibration accuracy, and other machine learning fairness benchmarks.

Data bias analysis assesses training data representativeness, label quality, and historical bias amplification. Intersectional analysis, an emerging best practice, evaluates bias across multiple protected characteristics simultaneously, recognizing that disparities affecting Black women or Latino men may be invisible when race and sex are examined in isolation.

NYC Local Law 144: Automated Employment Decision Tools

Scope and Applicability

Local Law 144 applies to employers and employment agencies using automated employment decision tools (AEDTs) for candidates or employees residing in New York City, as well as vendors selling such tools to NYC employers. An AEDT is defined as a computational process deriving from machine learning, statistical modeling, data analytics, or AI that substantially assists or replaces discretionary decision-making in screening candidates for employment or making promotion decisions. Tools that do not automate or assist in decision-making, such as scheduling software or background check databases, fall outside the law's scope.

Audit Requirements

The law mandates an annual independent audit conducted by an auditor with no affiliation to the employer or vendor. The audit must have been completed no more than one year before the AEDT is used, and a summary of its results must be made publicly available on the employer's website.

Auditors must calculate three core metrics: the selection rate (the percentage of candidates selected in each demographic category), the impact ratio (each category's selection rate divided by the rate for the most-selected category), and, where the AEDT assigns scores, analogous metrics for scoring distributions. These calculations must be reported across sex categories (male and female) and seven race/ethnicity categories: Hispanic or Latino, White, Black or African American, Native Hawaiian or Other Pacific Islander, Asian, American Indian or Alaska Native, and Two or More Races.

The benchmark for interpreting results is the four-fifths rule, drawn from the EEOC's 1978 Uniform Guidelines on Employee Selection Procedures. An impact ratio below 80% does not automatically establish discrimination, but it draws legal scrutiny, and the audit must report impact ratios for every category, including any that fall below the threshold.

Candidate Notification Requirements

Employers must notify candidates at least 10 business days before using an AEDT. The notice, posted on the careers page or within the job posting, must state that an AEDT will be used, describe the job qualifications and characteristics the tool will assess, and identify data sources such as resumes, applications, or public data. Candidates may request an alternative selection process or accommodation, and employers must provide a reasonable alternative when asked.

Compliance Timeline and Penalties

The law took effect on January 1, 2023, with enforcement beginning on July 5, 2023 after delays to finalize the implementing rules and notice-posting requirements. A first violation carries penalties of up to $500, while subsequent violations may reach $1,500 each. Each day of continued violation constitutes a separate offense, and enhanced penalties apply for failure to provide reasonable accommodations.

EU AI Act: High-Risk Systems Bias Testing

Article 10: Data Governance

The EU AI Act imposes rigorous data governance standards on high-risk AI systems. Training, validation, and testing datasets must be relevant, sufficiently representative, and, to the best extent possible, free of errors and complete in view of the intended purpose. They must account for characteristics particular to the specific geographical, contextual, behavioural, or functional setting in which the system will operate. The legislation explicitly requires that datasets be examined for possible biases that could lead to discrimination.

Documentation obligations are extensive. Organizations must record data provenance and collection methodology, assess demographic representativeness, disclose known limitations and biases in training data, describe preprocessing and augmentation techniques, and maintain version control over dataset updates.

Article 15: Accuracy, Robustness, and Cybersecurity

Testing under the EU AI Act must achieve appropriate levels of accuracy, robustness, and cybersecurity, with bias testing performed against disaggregated datasets broken down by relevant demographic characteristics. Unlike NYC's annual audit model, the EU framework requires testing throughout the system lifecycle, not merely before deployment.

The Act contemplates four categories of fairness metrics. Demographic parity requires equal selection rates across groups. Equalized odds demands equal true positive and false positive rates. Calibration ensures that predicted probabilities match observed frequencies. Individual fairness mandates that similar individuals receive similar predictions.

Post-Market Monitoring (Article 72)

Article 72 establishes ongoing monitoring obligations. Organizations must create systematic procedures for monitoring AI system performance, collect and analyze data on system outputs with particular attention to bias and discrimination, take corrective action if bias is detected post-deployment, and report serious incidents to national authorities. This lifecycle approach represents a fundamentally more comprehensive fairness assurance framework than the periodic audit model adopted by NYC.

Emerging US State and Federal Requirements

California AB 331 (Proposed)

California's proposed AB 331 would require impact assessments for automated decision systems used in employment, housing, education, healthcare, and legal services. Organizations would need to assess foreseeable risks including bias and discrimination, update those assessments annually, and publicly disclose a summary of their findings.

Illinois HB 3773 (Enacted 2024)

Illinois's HB 3773 amends the Illinois Human Rights Act to make it a civil rights violation for employers to use AI that has the effect of discriminating against applicants or employees on the basis of protected characteristics, and it requires notice when AI is used in employment decisions. It builds on the state's earlier Artificial Intelligence Video Interview Act, which requires employers using AI to analyze video interviews to explain how the AI evaluates candidates, obtain candidate consent, and offer an alternative evaluation when consent is withheld. In practice, demonstrating that AI-driven decisions do not have a discriminatory effect requires bias testing before deployment and ongoing monitoring.

Maryland HB 283 (Enacted 2024)

Maryland's law targets facial recognition in employment decisions, prohibiting its use without notice and consent. Employers must disclose how facial recognition is applied and what characteristics are analyzed. Annual reporting on accuracy rates by demographic group is required.

Federal Developments

At the federal level, the EEOC issued guidance in May 2023 establishing that employers are liable for discrimination when their use of AI produces disparate impact. Employers must validate AI tools for job-relatedness and business necessity, conducting regular adverse impact analysis similar to NYC's framework. Under the Equal Credit Opportunity Act and Regulation B, federal regulators, including the Consumer Financial Protection Bureau, expect lenders to test AI credit models for disparate impact, monitor lending decisions on an ongoing basis, and issue adverse action notices that explain AI-driven denials.

Practical Implementation: Conducting Bias Audits

Step 1: Determine Audit Scope

The first step is building a complete inventory of every AI system that makes or substantially assists decisions affecting individuals. Each system should be classified by risk level and mapped to applicable regulatory requirements. Organizations operating across jurisdictions will need to consider NYC Local Law 144 for employment AEDTs, the EU AI Act's Annex III categories for high-risk systems, state-specific requirements in Illinois, California, Maryland, and elsewhere, and industry-specific mandates such as ECOA for lending or FCRA for background checks. Systems with the highest potential impact on individuals (hiring, credit, housing, healthcare) should be prioritized.
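A lightweight inventory can live in a spreadsheet or in code; the sketch below is a minimal Python illustration of classifying and prioritizing systems by decision role and domain. The system names, field names, and framework labels are hypothetical, not prescribed by any regulation.

```python
from dataclasses import dataclass, field

# Hypothetical inventory entry; field names and framework labels are illustrative.
@dataclass
class AISystem:
    name: str
    domain: str                       # e.g. "employment", "credit", "housing"
    decision_role: str                # "automates", "substantially assists", "informs"
    jurisdictions: list[str] = field(default_factory=list)
    frameworks: list[str] = field(default_factory=list)

inventory = [
    AISystem("resume-screener", "employment", "substantially assists",
             ["NYC", "EU"], ["NYC Local Law 144", "EU AI Act Annex III"]),
    AISystem("credit-scoring-v2", "credit", "automates",
             ["US"], ["ECOA / Regulation B"]),
    AISystem("shift-scheduler", "operations", "informs", ["US"], []),
]

# Prioritize systems that automate or substantially assist decisions
# in high-impact domains.
high_priority = [
    s for s in inventory
    if s.domain in {"employment", "credit", "housing", "healthcare"}
    and s.decision_role != "informs"
]
for system in high_priority:
    print(system.name, "->", ", ".join(system.frameworks) or "mapping needed")
```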

Step 2: Select Independent Auditor

The auditor must possess technical expertise in machine learning and fairness metrics, understand applicable legal standards around disparate impact and equal protection, and maintain genuine independence from the system's developers and deployers. The engagement should define a clear scope of work and testing methodology, provide the auditor with access to all necessary data, documentation, and system components, establish confidentiality and data protection agreements, and specify deliverables including a written audit report and remediation recommendations. Internal testing, while valuable, does not satisfy regulatory requirements for independence.

Step 3: Prepare Data and Documentation

Data requirements vary by jurisdiction. NYC Local Law 144 demands selection rates by race/ethnicity and sex for the past year. The EU AI Act requires training data demographics, testing datasets, and system outputs disaggregated by demographic group. In general, at least one year of deployment data provides the statistical foundation for meaningful analysis.

Supporting documentation should include system design records, training data provenance and quality assessments, feature engineering decisions, model selection and hyperparameter tuning rationale, previous fairness testing results, and known limitations and edge cases.

Step 4: Statistical Analysis

Under the NYC framework, disparate impact testing follows a defined sequence. Calculate the selection rate for each demographic group (candidates selected divided by candidates who applied). Identify the most-selected group. Compute the impact ratio by dividing each group's selection rate by the most-selected group's rate. Any group with an impact ratio below 0.80 (the 80% threshold) must be flagged.
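The calculation itself is straightforward. The sketch below, using hypothetical data and column names, shows one way to compute selection rates and impact ratios with pandas and flag categories below the four-fifths threshold.

```python
import pandas as pd

# Hypothetical deployment data: one row per candidate;
# "selected" is 1 if the candidate advanced, 0 otherwise.
df = pd.DataFrame({
    "race_ethnicity": ["White", "White", "Black", "Black", "Asian",
                       "Asian", "Hispanic", "Hispanic", "White", "Black"],
    "selected":       [1, 1, 0, 1, 1, 0, 0, 1, 0, 0],
})

# Selection rate per group: candidates selected / candidates who applied.
selection_rates = df.groupby("race_ethnicity")["selected"].mean()

# Impact ratio: each group's rate divided by the most-selected group's rate.
impact_ratios = selection_rates / selection_rates.max()

# Flag any group below the four-fifths (0.80) threshold.
flagged = impact_ratios[impact_ratios < 0.80]

print(impact_ratios.round(2))
print("Below 0.80 threshold:", list(flagged.index))
```

In a real audit, the same calculation is repeated for each sex and race/ethnicity category over the full reporting period.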

The EU framework calls for a broader set of fairness metrics. Demographic parity asks whether the probability of a positive prediction is equal across all demographic groups. Equalized odds tests whether true positive and false positive rates are balanced. Calibration verifies that predicted probabilities match actual observed frequencies. Individual fairness examines whether similar candidates receive similar predictions.
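As a rough illustration of how these metrics can be computed side by side, the following sketch evaluates per-group selection rates, true and false positive rates, and a simple calibration gap on a hypothetical audit slice; the data, 0.5 cutoff, and column names are illustrative only.

```python
import pandas as pd

# Hypothetical audit slice: ground-truth labels, model scores, and a 0.5 cutoff.
data = pd.DataFrame({
    "group": ["A"] * 6 + ["B"] * 6,
    "label": [1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
    "score": [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.6, 0.5, 0.3, 0.8, 0.4, 0.1],
})
data["pred"] = (data["score"] >= 0.5).astype(int)

def group_metrics(g: pd.DataFrame) -> pd.Series:
    tp = ((g["pred"] == 1) & (g["label"] == 1)).sum()
    fp = ((g["pred"] == 1) & (g["label"] == 0)).sum()
    return pd.Series({
        "selection_rate":  g["pred"].mean(),                      # demographic parity
        "tpr":             tp / max((g["label"] == 1).sum(), 1),  # equalized odds (1/2)
        "fpr":             fp / max((g["label"] == 0).sum(), 1),  # equalized odds (2/2)
        "calibration_gap": abs(g["score"].mean() - g["label"].mean()),
    })

metrics = data.groupby("group")[["label", "score", "pred"]].apply(group_metrics)
print(metrics.round(2))
# The spread between groups shows where each fairness criterion is strained.
print((metrics.max() - metrics.min()).round(2))
```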

Regardless of framework, statistical rigor is essential. Appropriate tests such as chi-square or Fisher's exact test should determine whether observed disparities are statistically significant. Multiple hypothesis testing should be addressed through corrections such as Bonferroni, and results should report confidence intervals rather than point estimates alone. It is worth noting, however, that a bias audit may reveal statistically significant disparities too small to warrant remediation, or practically significant impacts that fail statistical tests due to insufficient sample size. Both statistical rigor and ethical judgment are necessary when interpreting results.
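A minimal sketch of such testing, assuming hypothetical contingency tables of selected versus not-selected counts for each protected group against the reference group, might look like this, using SciPy's chi-square and Fisher's exact tests with a Bonferroni-corrected alpha.

```python
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical 2x2 tables: [selected, not selected] counts for each protected
# group versus the most-selected (reference) group.
comparisons = {
    "Group B vs A": [[30, 70], [55, 45]],
    "Group C vs A": [[12, 28], [55, 45]],
    "Group D vs A": [[4, 6],   [55, 45]],
}

alpha = 0.05
bonferroni_alpha = alpha / len(comparisons)  # correct for multiple comparisons

for name, table in comparisons.items():
    # Heuristic: prefer Fisher's exact test when any cell count is small.
    if min(min(row) for row in table) < 5:
        _, p_value = fisher_exact(table)
        test = "Fisher's exact"
    else:
        _, p_value, _, _ = chi2_contingency(table)
        test = "chi-square"
    significant = p_value < bonferroni_alpha
    print(f"{name}: {test} p = {p_value:.4f}, significant = {significant}")
```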

Step 5: Root Cause Analysis

When bias is detected, the root cause typically falls into one of three categories. Data bias arises from underrepresentation of certain groups in training data, historical bias embedded in labels (where past hiring decisions reflected discriminatory practices), or measurement bias introduced through proxy variables correlated with protected characteristics.

Algorithmic bias emerges when feature selection amplifies group differences, when model architecture encodes stereotypes, or when the optimization objective fails to account for fairness constraints. Deployment bias occurs through threshold selection that favors certain groups, integration with human decision-making that introduces new biases, or feedback loops that reinforce initial disparities over time.

Step 6: Remediation and Mitigation

Remediation operates across three levels. Technical interventions include pre-processing approaches (reweighing training data and applying sampling techniques to balance representation), in-processing methods (fairness-aware learning algorithms and constrained optimization), and post-processing adjustments (threshold optimization and score recalibration by group).
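To make the pre-processing idea concrete, the sketch below applies a simple reweighing scheme in the spirit of standard fairness-aware preprocessing: each training example receives a weight so that group membership and label become statistically independent in the weighted data. The dataset and column names are hypothetical.

```python
import pandas as pd

# Hypothetical training data with a protected attribute and a binary label.
train = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "B", "A"],
    "label": [1, 1, 0, 0, 0, 1, 0, 1],
})

n = len(train)
p_group = train["group"].value_counts(normalize=True)
p_label = train["label"].value_counts(normalize=True)
observed = train.groupby(["group", "label"]).size() / n

# Reweighing: weight each (group, label) cell by expected / observed frequency
# so that group membership and label are independent in the weighted data.
train["sample_weight"] = train.apply(
    lambda row: (p_group[row["group"]] * p_label[row["label"]])
    / observed[(row["group"], row["label"])],
    axis=1,
)

# Most learners accept these weights, e.g. model.fit(X, y, sample_weight=...).
print(train)
```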

Procedural interventions add human-in-the-loop review for borderline cases, establish appeals processes for candidates to contest automated decisions, and institute periodic retraining with updated data. Organizational interventions address the broader environment through diverse AI development teams, fairness review boards for high-risk systems, and regular training on bias awareness for system developers and deployers.

Step 7: Documentation and Reporting

NYC requires that audit reports include selection rates and impact ratios for all demographic categories, a description of methodology and data sources, auditor qualifications and an independence statement, the audit date and period covered, and any limitations or caveats. This report must be published on the employer's website.

The EU AI Act requires that technical documentation be available to authorities on request. California's proposed AB 331 would add a requirement to publish impact assessment summaries. Internal documentation should go further, capturing detailed statistical analysis and raw data, root cause analysis of identified biases, a remediation plan with timelines, and ongoing monitoring procedures.

Advanced Topics

Intersectional Bias

Traditional bias audits examine protected characteristics independently, testing for racial disparities or gender disparities in isolation. This approach misses disparities that emerge only at the intersection of multiple characteristics. A system might show acceptable impact ratios for both race and sex while producing significant bias against Black women or Latino men specifically.

The solution is to test across combinations of protected characteristics, calculating fairness metrics for intersectional groups where sample sizes permit. When quantitative analysis is underpowered due to small subgroup sizes, qualitative analysis should supplement the statistical evaluation.
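One way to operationalize this, sketched below with synthetic data, is to compute selection rates and impact ratios over every race-by-sex combination and defer subgroups below a minimum sample size to qualitative review. The minimum-size cutoff shown is an illustrative choice, not a regulatory requirement.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic outcomes disaggregated by two protected characteristics.
df = pd.DataFrame({
    "race_ethnicity": rng.choice(["White", "Black", "Asian", "Hispanic"], size=400),
    "sex":            rng.choice(["Male", "Female"], size=400),
    "selected":       rng.integers(0, 2, size=400),
})

MIN_N = 30  # subgroups smaller than this are deferred to qualitative review

stats = (
    df.groupby(["race_ethnicity", "sex"])["selected"]
      .agg(n="count", selection_rate="mean")
      .reset_index()
)
reliable = stats[stats["n"] >= MIN_N].copy()
reliable["impact_ratio"] = reliable["selection_rate"] / reliable["selection_rate"].max()

print(reliable.sort_values("impact_ratio"))
print("Deferred to qualitative review:",
      stats.loc[stats["n"] < MIN_N, ["race_ethnicity", "sex"]].values.tolist())
```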

Proxy Variables

Removing protected characteristics from training data does not eliminate bias if proxy variables remain in the model. ZIP codes, colleges attended, names, and other features can be highly correlated with race, sex, or other protected traits, producing disparate impact through indirect channels.

Detection requires calculating the correlation between model features and protected characteristics, using interpretability techniques such as LIME and SHAP to identify feature importance, and testing for disparate impact even when protected characteristics are excluded from the model's inputs. Mitigation strategies include removing high-correlation proxies, applying fairness-aware feature selection methods, and constraining the influence of proxy variables on model outputs.
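As a simple starting point for detection, the sketch below computes Cramér's V between each categorical feature and a protected attribute on hypothetical data; high association flags a potential proxy, which interpretability tools such as SHAP or LIME can then examine in more depth.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Association strength between two categorical variables (0 = none, 1 = perfect)."""
    table = pd.crosstab(x, y)
    chi2, _, _, _ = chi2_contingency(table)
    n = table.to_numpy().sum()
    r, k = table.shape
    return float(np.sqrt(chi2 / (n * (min(r, k) - 1))))

# Hypothetical applicant features alongside a protected attribute.
df = pd.DataFrame({
    "zip3":   ["100", "100", "112", "112", "104", "104", "100", "112"],
    "school": ["U1", "U2", "U3", "U3", "U1", "U2", "U1", "U3"],
    "race":   ["White", "White", "Black", "Black", "Asian", "White", "White", "Black"],
})

# Features with high association are candidate proxies for the protected trait.
for feature in ["zip3", "school"]:
    print(f"{feature}: Cramér's V vs race = {cramers_v(df[feature], df['race']):.2f}")
```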

Temporal Bias

Bias audits are inherently point-in-time assessments, yet AI systems evolve continuously as data distributions shift and models are retrained. A system that passes an annual audit may develop bias within months as the population it serves changes.

Addressing temporal bias requires continuous monitoring of fairness metrics after deployment, A/B testing of the fairness impact of model updates, version control for models that includes fairness metadata, and automated alerts when fairness metrics degrade beyond acceptable thresholds.
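A minimal monitoring loop, assuming a hypothetical decision log and an illustrative 0.80 alert threshold, might recompute the impact ratio over a trailing window and flag degradation like this.

```python
import pandas as pd

ALERT_THRESHOLD = 0.80  # illustrative; align with the audit benchmark in use

# Hypothetical post-deployment decision log.
log = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-01-05", "2025-01-12", "2025-01-20", "2025-02-02",
        "2025-02-10", "2025-02-18", "2025-03-01", "2025-03-08"]),
    "group":    ["A", "B", "A", "B", "A", "B", "A", "B"],
    "selected": [1, 1, 1, 0, 1, 0, 1, 0],
})

def impact_ratio(window: pd.DataFrame) -> float:
    rates = window.groupby("group")["selected"].mean()
    return rates.min() / rates.max() if rates.max() > 0 else float("nan")

# Recompute the fairness metric over a trailing 30-day window and raise alerts.
for month_end in pd.to_datetime(["2025-01-31", "2025-02-28", "2025-03-31"]):
    window = log[(log["timestamp"] > month_end - pd.Timedelta(days=30))
                 & (log["timestamp"] <= month_end)]
    if window.empty or window["group"].nunique() < 2:
        continue
    ratio = impact_ratio(window)
    status = "ALERT" if ratio < ALERT_THRESHOLD else "ok"
    print(f"{month_end.date()}: trailing impact ratio {ratio:.2f} [{status}]")
```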

Compliance Checklist

Preparing for a bias audit begins with inventorying all AI systems used in regulated domains (employment, credit, housing), determining applicable regulatory requirements across NYC, EU, and state frameworks, collecting at least one year of historical output data disaggregated by demographic group, and compiling comprehensive technical documentation covering architecture, training data, and features.

The engagement phase involves selecting a qualified independent auditor, defining the audit scope, methodology, and deliverables, executing data sharing and confidentiality agreements, and providing the auditor with the access and documentation needed to conduct a thorough review.

Testing should calculate selection rates and impact ratios by race/ethnicity and sex, compute additional fairness metrics (equalized odds, calibration) as jurisdictional requirements demand, perform statistical significance testing with appropriate corrections, and conduct intersectional analysis wherever sample sizes support it.

When biases are identified, remediation requires identifying root causes, implementing technical and procedural mitigations, re-testing the system to verify bias reduction, and documenting every step of the remediation effort.

Disclosure obligations include publishing the audit report on the employer's public website (NYC), posting candidate notices at least 10 business days before AEDT use (NYC), maintaining internal records for regulatory review, and updating the audit annually or whenever significant system changes occur.

On an ongoing basis, organizations must monitor fairness metrics continuously after deployment, maintain processes for handling candidate requests and accommodations, train staff on bias awareness and system limitations, and report serious incidents to the relevant authorities as required under the EU framework.

Key Takeaways

Mandatory bias audits are expanding rapidly across jurisdictions. NYC Local Law 144 established the template in 2023. The EU AI Act extends the obligation to all high-risk systems. California, Illinois, Maryland, and other states are enacting or proposing parallel requirements. Organizations deploying AI in regulated domains should treat comprehensive bias auditing not as a future concern but as a present obligation.

The most resilient compliance programs combine periodic formal audits with continuous automated monitoring. NYC's annual audit requirement sets a floor; the EU's lifecycle monitoring approach reflects the direction of regulatory evolution. Continuous monitoring detects bias early and demonstrates ongoing diligence.

Independence is non-negotiable. Regulators require auditors with no financial or organizational ties to the AI system's vendor or deployer. Internal testing supplements but cannot substitute for independent review.

The 80% impact ratio threshold (the four-fifths rule from EEOC's 1978 Uniform Guidelines) remains the standard for disparate impact analysis in US employment contexts. Impact ratios below this threshold trigger legal scrutiny under both NYC law and federal guidance.

Data collection presents a genuine tension. Bias audits require demographic data, but privacy regulations (GDPR in particular) restrict its collection. Organizations should explore voluntary self-identification surveys, probabilistic inference methods, and synthetic data testing, consulting legal counsel on approaches that satisfy both fairness and privacy obligations.

Proxy variables demand careful attention. Removing protected characteristics from model inputs does not eliminate bias when proxy variables such as ZIP code, school name, or applicant name remain correlated with race and sex. Bias audits must examine outcomes, not merely inputs.

Finally, detecting bias creates a duty to remediate. Technical responses include retraining with fairness constraints, optimizing thresholds, and modifying model architecture. Procedural responses include human review of borderline cases and appeals processes for affected individuals. Every remediation effort should be thoroughly documented to demonstrate good faith compliance.

Citations

  1. NYC Department of Consumer and Worker Protection (DCWP). (2023). Automated Employment Decision Tools (Local Law 144). https://www.nyc.gov/site/dca/about/automated-employment-decision-tools.page
  2. European Commission. (2024). Regulation (EU) 2024/1689 on Artificial Intelligence (AI Act). https://artificialintelligenceact.eu/
  3. U.S. Equal Employment Opportunity Commission (EEOC). (2023). The Americans with Disabilities Act and the Use of Software, Algorithms, and Artificial Intelligence to Assess Job Applicants and Employees. https://www.eeoc.gov/laws/guidance/americans-disabilities-act-and-use-software-algorithms-and-artificial-intelligence
  4. Barocas, S., & Selbst, A. D. (2016). Big Data's Disparate Impact. California Law Review, 104(3), 671-732. https://www.californialawreview.org/print/big-datas-disparate-impact/
  5. AI Now Institute. (2023). Algorithmic Auditing: A Practical Guide. https://ainowinstitute.org/

Common Questions

Who qualifies as an independent auditor under NYC Local Law 144?

An independent auditor cannot be an employee of the employer or AEDT vendor, cannot have participated in developing or distributing the AEDT, and cannot have financial relationships (such as shared ownership or investment) that could compromise objectivity. Suitable auditors include third-party consulting firms, academic researchers, or specialized bias audit providers engaged under a professional services agreement.

Does a system need a bias audit if it never uses protected characteristics as inputs?

Yes, if the system falls within the scope of applicable regulations. Bias audits focus on outcomes, not just inputs. Even when protected attributes are excluded, correlated proxy variables can still produce disparate impact, creating legal and regulatory exposure.

How can organizations obtain demographic data for bias testing without violating privacy rules?

Options include voluntary self-identification surveys, probabilistic inference methods (such as BISG) used with caution, synthetic test datasets with known demographics, and benchmarking on external labeled datasets. Any collection or inference of sensitive attributes must comply with privacy laws such as GDPR, so legal review is essential.

How does a bias audit differ from a fairness assessment?

A bias audit is typically a compliance-oriented, retrospective statistical review of system outcomes, often performed periodically by an independent party. A fairness assessment is broader and lifecycle-focused, covering data governance, model design, testing, deployment, and ongoing monitoring. Mature programs implement both.

What should an organization do when an audit detects bias?

You should assess severity and legal exposure, consider pausing or limiting system use, perform root cause analysis, implement technical and procedural mitigations, re-test to confirm improvements, and document all steps. In some jurisdictions, you may also need to notify regulators or affected individuals.

What Gets Audited in Practice

Regulators and independent auditors focus on measurable outcomes: who is selected, rejected, promoted, or denied. Intent, model architecture, and feature choices matter, but they do not override evidence of disparate impact in real-world decisions.

Mind the Gap Between Statistical and Practical Significance

A disparity can be statistically significant yet operationally trivial, or practically harmful yet not statistically significant due to small sample sizes. Governance processes should require both quantitative analysis and qualitative judgment before deciding on remediation.

80%: the impact ratio threshold used in the four-fifths rule for disparate impact analysis. Source: EEOC Uniform Guidelines on Employee Selection Procedures (1978).

"The EU AI Act shifts organizations from one-off, point-in-time bias checks to continuous, lifecycle-based fairness assurance for high-risk AI systems."

AI Governance & Risk Management Practice

Michael Lansdowne Hauge

Managing Partner · HRDF-Certified Trainer (Malaysia), Delivered Training for Big Four, MBB, and Fortune 500 Clients, 100+ Angel Investments (Seed–Series C), Dartmouth College, Economics & Asian Studies

Advises leadership teams across Southeast Asia on AI strategy, readiness, and implementation. HRDF-certified trainer with engagements for a Big Four accounting firm, a leading global management consulting firm, and the world's largest ERP software company.

AI Strategy · AI Governance · Executive AI Training · Digital Transformation · ASEAN Markets · AI Implementation · AI Readiness Assessments · Responsible AI · Prompt Engineering · AI Literacy Programs
