Bias in the Machine


Organizations that depend on artificial intelligence models must control for factors that could expose them to discrimination risk.

Can artificial intelligence (AI) discriminate? That is what Facebook’s AI is accused of doing. In March 2019, the U.S. Department of Housing and Urban Development (HUD) announced it was suing the social media company for violating the Fair Housing Act. HUD alleges that Facebook’s advertising system allowed advertisers to limit housing ads based on race, gender, and other characteristics. The agency also claims Facebook’s ad system discriminates against users even when advertisers do not choose to do so.

Although it has yet to be proven whether Facebook committed any deliberate discrimination, the result is still the same. “Using a computer to limit a person’s housing choices can be just as discriminatory as slamming a door in someone’s face,” HUD Secretary Ben Carson said in announcing the lawsuit.

Each day, machine learning and AI (ML/AI) models make decisions that affect the lives of millions of people. As these models become more integrated with everyday decision-making, organizations need to be increasingly vigilant about the risk created by potentially discriminatory algorithms.

But who within those organizations is responsible for ensuring the ML/AI model is making fair, unbiased decisions? The model developer should not be responsible, because internal control principles dictate that the persons who create a system cannot be impartial evaluators of that same system. The model’s users also should not be responsible, because they typically lack the expertise to evaluate an ML/AI model. Users also may not question a model that seems to be performing well. For example, if a predictive policing model leads to more arrests and less crime, users are not likely to question whether that system unfairly targets a particular group. 

Internal audit may be best suited to provide assurance to the board and senior management that the organization is mitigating the reputational, financial, and legal risks of implementing a biased ML/AI model. However, because this is a new assurance domain for the profession, auditors need a methodology for auditing the fairness of these models.

Why Models Need to Be Fair

An ML/AI model is a mathematical equation that uses data to produce a calculation such as a score, ranking, classification, or prediction. It is a specific set of instructions on how to analyze data to deliver a particular result — behavior, decision, action, or cause — to support a business process. 

There are three main categories of analytic models. Descriptive models summarize large amounts of data into small bits of information that are easier for organizations to analyze and work with. Predictive models are more complex models used to identify patterns and correlations in data that can be used to predict future results. Prescriptive models enable data analysts to see how a decision today can create multiple future scenarios. 

ML/AI models need to be fair and nondiscriminatory because the decisions they support can expose organizations to substantial risk if the classification criteria they use are unethical, illegal, or publicly unacceptable. Such criteria are referred to as inappropriate classification criteria (ICCs) and include race, gender, religion, sexual orientation, and age.

In assurance engagements regarding bias, internal auditors primarily will be concerned with a type of predictive model known as a classification model. This model separates people into groups based on certain attributes, and the organization uses those groups to support decisions. Examples include:

  • Identifying borrowers who are most likely to default on a loan.
  • Classifying employees as future high performers.
  • Selecting persons who are least likely to commit further crimes if granted probation.
  • Targeting consumers to receive special promotions or opportunities. In one case, the Communications Workers of America sued T-Mobile, Facebook, and a host of other companies, alleging that those companies discriminated by excluding older workers from seeing their job ads.

To provide assurance to management and the audit committee that the organization’s ML/AI model does not discriminate, auditors need to assess two things: 1) that the model does not benefit or penalize a certain classification of people; and 2) that the model still provides useful results if a classification is removed.

Internal auditors can test for bias using a model fairness review methodology. This methodology comprises: 

  1. Understanding the model’s business purpose.
  2. Working with the audit client to identify ICCs. In this step, auditors also may discuss possible appropriate exogenous variables (see “Controlling for Exogenous Variables” below).
  3. Selecting a large sample — or the entire data set — of input data and classification results.
  4. Conducting statistical analysis of the results to determine whether distribution of ICCs is within acceptable parameters.
  5. Discussing initial results with the client.
  6. Removing ICCs and re-running the classification model. Auditors also can replace ICCs with uniform values depending on the nature of the model.
  7. Comparing distribution of ICCs before and after removal. 
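
Steps 4, 6, and 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_model` is a hypothetical, deliberately biased stand-in for the model under review, and the field names (`gender`, `credit_score`, `selected`) and the four sample records are invented for the example.

```python
from collections import defaultdict

def selection_rates(records, group_key, selected_key="selected"):
    """Return the share of records flagged as selected within each group."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        hits[group] += 1 if rec[selected_key] else 0
    return {g: hits[g] / totals[g] for g in totals}

def neutralize(records, iccs, fill="UNIFORM"):
    """Copy the records, replacing every ICC field with one uniform value."""
    return [{k: (fill if k in iccs else v) for k, v in rec.items()}
            for rec in records]

def toy_model(rec):
    """Hypothetical stand-in for the model under review (assumption)."""
    score = rec["credit_score"]
    if rec["gender"] == "F":
        score -= 50  # deliberately biased so the review has something to find
    return score >= 650

def run_review(records, iccs, group_key):
    """Selection rates by group with the ICCs present, then neutralized."""
    before = [dict(r, selected=toy_model(r)) for r in records]
    # The model sees the neutralized record; the original record is kept
    # so results can still be grouped by the ICC afterwards.
    after = [dict(orig, selected=toy_model(blind))
             for orig, blind in zip(records, neutralize(records, iccs))]
    return (selection_rates(before, group_key),
            selection_rates(after, group_key))

# Invented sample data, for illustration only.
sample = [
    {"gender": "M", "credit_score": 700},
    {"gender": "M", "credit_score": 600},
    {"gender": "F", "credit_score": 690},
    {"gender": "F", "credit_score": 620},
]
before, after = run_review(sample, iccs={"gender"}, group_key="gender")
print("with ICCs:   ", before)
print("without ICCs:", after)
```

The original records are kept alongside the neutralized copies because the auditor still needs the ICC values to group the results, even though the model no longer sees them.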

Controlling for Exogenous Variables

Often, despite the best efforts to eliminate it, discrimination creeps into an organization’s analytic models through external data that has a systemic bias, thus exposing the organization to risk. Appropriate exogenous variables (AEVs) are variables that provide appropriate classification criteria but have been subject to external systemic bias that has not been detected. Examples of AEVs include credit scores for individuals from minority communities and salary information for women.

Fortunately, analytic models can be used to control for this bias. For example, after controlling for gender differences in industry, occupation, education, age, job tenure, province of residence, marital status, and union status, an 8% wage gap persists between men and women in Canada, according to a February 2018 Maclean’s article. It is a relatively simple exercise to adjust the salary variable in a classification model by +8% for female subjects. 
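
A minimal sketch of that adjustment follows. The 8% figure comes from the article cited above; the function name `adjust_salary`, the field names, and the `"F"` encoding for female subjects are assumptions made for the example.

```python
def adjust_salary(record, uplift=0.08,
                  gender_key="gender", salary_key="estimated_salary"):
    """Return a copy of the record with salary scaled up for female subjects.

    The field names and the "F" code are hypothetical; adapt them to the
    actual input data of the model under review.
    """
    adjusted = dict(record)  # leave the source record untouched
    if adjusted[gender_key] == "F":
        adjusted[salary_key] = round(adjusted[salary_key] * (1 + uplift), 2)
    return adjusted

# Invented sample data, for illustration only.
customers = [
    {"gender": "F", "estimated_salary": 50000.0},
    {"gender": "M", "estimated_salary": 50000.0},
]
adjusted = [adjust_salary(c) for c in customers]
for row in adjusted:
    print(row)
```

The adjustment is applied to a copy so that the unadjusted input remains available for the before-and-after comparison.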

A Bias Audit

As an example of how internal auditors can use this methodology, consider a marketing department at a credit card company that used a classification model to determine which customers should be given a discount. The data used for the model is half women and half men. Management wanted assurance that this model was not exposing the organization to potential liability by discriminating against either group.

Internal audit met with Marketing and confirmed that it used the model to select customers for preferred rates. These preferred rates are substantially lower than the rates offered to customers in general. After reviewing the information used by the model, internal audit noted these variables:

  • Customer ID (metadata — not used as a variable).
  • Surname (ICC).
  • Credit score.
  • Geography (ICC).
  • Gender (ICC).
  • Age (ICC).
  • Tenure.
  • Balance.
  • Number of products.
  • Has credit card.
  • Estimated salary.

In some cases, a variable may be an ICC for one type of model but not for another. For example, gender is an appropriate classification criterion for a clothing company promotion but not for a loan approval. Age may be appropriate in a health-care model but not in an applicant screening.

In the marketing example, internal audit analyzed the initial results of the classification model and observed that 35% of customers were classified as good candidates. However:

  • 50% of men and 20% of women were classified as good candidates.
  • 6% of customers over 50 were classified as good candidates.
  • 1% of women over 50 were classified as good candidates.
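
One way to judge whether a gap like the one between men and women is larger than chance would explain is a two-proportion z-test. The sketch below is one possible choice of statistical analysis, not a method the example prescribes. The sample sizes are hypothetical: the example states only that the data is half women and half men, so 1,000 of each is assumed here.

```python
import math

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Z statistic and two-sided p-value for a difference of two proportions."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Hypothetical counts: 1,000 men and 1,000 women at the 50% and 20% rates.
z, p_value = two_proportion_z(500, 1000, 200, 1000)
print(f"z = {z:.2f}, p = {p_value:.3g}")
```

With these assumed counts the z statistic is roughly 14, far beyond what random variation would produce. A statistically significant gap does not by itself show unlawful discrimination, but it tells the auditor that the difference needs a business explanation.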

Internal audit discussed the initial classification results with the marketing department to determine whether there were business reasons for the observed result and whether those reasons were valid, defensible, and nondiscriminatory, so as to mitigate the risk of legal liability. Based on this discussion, internal audit removed the identified ICCs from the input data and re-ran the classification model.

In reporting the results to Marketing, internal audit noted the model was producing useful results. The results showed that 45% of customers were classified as good candidates, a finding with which Marketing concurred. However:

  • 50% of men and 40% of women were classified as good candidates.
  • 21% of customers over 50 were classified as good candidates.
  • 10% of women over 50 were classified as good candidates.
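
The before-and-after figures for men and women can be compared with simple arithmetic. The sketch below uses only the percentages reported in this example and the stated half-and-half split; the function names are illustrative.

```python
# Selection rates reported in the example, before and after removing ICCs.
before = {"men": 0.50, "women": 0.20}
after = {"men": 0.50, "women": 0.40}
shares = {"men": 0.5, "women": 0.5}  # the data is half women and half men

def overall_rate(rates, shares):
    """Population-weighted selection rate across all groups."""
    return sum(rates[g] * shares[g] for g in rates)

def rate_ratio(rates, group, reference):
    """Selection rate of one group relative to a reference group."""
    return rates[group] / rates[reference]

overall_before = overall_rate(before, shares)
overall_after = overall_rate(after, shares)
ratio_before = rate_ratio(before, "women", "men")
ratio_after = rate_ratio(after, "women", "men")

print(f"overall: {overall_before:.0%} -> {overall_after:.0%}")
print(f"women/men ratio: {ratio_before:.2f} -> {ratio_after:.2f}")
```

The weighted rates reproduce the 35% and 45% overall figures, which confirms that the group-level numbers are internally consistent. The women-to-men ratio doubles from 0.4 to 0.8 but stays below 1, so women are still selected less often than men after the ICCs are removed.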

Internal auditors noted that the model appears to be biased against groups such as women and people over 50, which is likely the result of exogenous variables. Auditors recommended that Marketing adjust its model to compensate for these variables.

New Models, Old Risks

Although the subject of bias in analytic models may be unfamiliar to internal auditors, their risk management role in this domain is crucial. Bias introduces an unacceptable risk to any organization regardless of where that bias originates. A decision made by an organization’s analytic model is a decision made by that entity’s senior management team. Internal audit can help management by providing risk-based and objective assurance, advice, and insight. As such, auditors should learn and adapt their methods to meet the challenges organizations face in adopting AI. 

Author: Allan Sammy, CIA, CPA, CGA, is director, Data Science and Audit Analytics, at Canada Post in Ottawa.
Source: Internal Auditor magazine June 2019
