Artificial intelligence capabilities are expanding rapidly, with AI now being used in industries from advertising to medical research. Its spread into more sensitive areas such as facial recognition software, hiring algorithms, and healthcare provision has precipitated debate about bias and fairness.
Bias is a well-researched facet of human psychology. Research regularly exposes our unconscious preferences and prejudices, and we now see AI systems reflecting some of those same biases in their algorithms.
So, how does artificial intelligence become biased? And why does this matter?
How Does AI Become Biased?
For the sake of simplicity, in this article, we'll refer to machine learning and deep learning algorithms as AI algorithms or systems.
Researchers and developers can introduce bias into AI systems in two ways.
Firstly, the cognitive biases of researchers can be embedded into machine learning algorithms accidentally. Cognitive biases are unconscious human perceptions that can affect how people make decisions. This becomes a significant issue when the biases concern particular people or groups and can cause them harm.
These biases can be introduced directly but accidentally, or researchers might train the AI on datasets that were themselves affected by bias. For instance, a facial recognition AI could be trained using a dataset that only includes light-skinned faces. In this case, the AI will perform better with light-skinned faces than with dark-skinned ones. This form of AI bias is known as a negative legacy.
Secondly, biases can arise when the AI is trained on incomplete datasets. For instance, if an AI is trained on a dataset that only includes computer scientists, it will not represent the entire population, and its predictions for everyone else will be unreliable.
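To make the dataset problem concrete, here's a minimal sketch in Python of the kind of representation check a team might run before training. The field names and threshold here are hypothetical, not taken from any real system:

```python
from collections import Counter

def audit_representation(samples, group_key, threshold=0.10):
    """Flag any group that makes up less than `threshold` of a dataset."""
    counts = Counter(sample[group_key] for sample in samples)
    total = sum(counts.values())
    for group, count in counts.items():
        share = count / total
        status = "UNDERREPRESENTED" if share < threshold else "ok"
        print(f"{group}: {count} samples ({share:.1%}) {status}")

# Hypothetical facial recognition training data with a skin-tone label.
training_data = [{"skin_tone": "light"}] * 95 + [{"skin_tone": "dark"}] * 5
audit_representation(training_data, "skin_tone")
# light: 95 samples (95.0%) ok
# dark: 5 samples (5.0%) UNDERREPRESENTED
```

A check like this won't catch every bias, but it catches the most obvious failure mode: a group so underrepresented that the model never learns to handle it well.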
Examples of Real-World AI Bias
There have been multiple recent, well-reported examples of AI bias that illustrate the danger of allowing these biases to creep in.
US-Based Healthcare Prioritization
In 2019, a machine learning algorithm was designed to help hospitals and insurance companies determine which patients would benefit most from certain healthcare programs. Trained on a database of around 200 million people, the algorithm favored white patients over black patients.
It was determined that this was because of a faulty assumption in the algorithm: it used past healthcare costs as a stand-in for healthcare needs, and because less money had historically been spent on black patients with the same conditions, it underestimated how much care they needed. Once identified, the bias was eventually reduced by 80%.
COMPAS
The Correctional Offender Management Profiling for Alternative Sanctions, or COMPAS, was an AI algorithm designed to predict whether individual offenders would re-offend. The algorithm produced double the false positives for black offenders compared with white offenders. In this case, both the dataset and the model were flawed, introducing heavy bias.
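The disparity reported here is, at its core, a gap in false positive rates between groups. As a rough sketch (with made-up numbers, not the actual COMPAS data), the metric looks like this:

```python
def false_positive_rate(predicted_reoffend, actually_reoffended):
    """Share of people who did NOT re-offend but were predicted to."""
    false_positives = sum(
        1 for p, a in zip(predicted_reoffend, actually_reoffended) if p and not a
    )
    actual_negatives = sum(1 for a in actually_reoffended if not a)
    return false_positives / actual_negatives if actual_negatives else 0.0

# Illustrative predictions (1 = flagged as likely to re-offend) and outcomes.
print(false_positive_rate([1, 1, 0, 1], [0, 1, 0, 0]))  # 2/3 for one group
print(false_positive_rate([0, 1, 0, 1], [0, 1, 0, 1]))  # 0/2 for another
```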
Amazon
In 2015, the hiring algorithm Amazon used to assess the suitability of applicants was found to heavily favor men over women. This was because the training dataset consisted almost exclusively of men's resumes, since most Amazon employees were male.
How to Stop AI Bias
AI is already revolutionizing the way we work across every industry. Having biased systems controlling sensitive decision-making processes is less than desirable. At best, it reduces the quality of AI-based research. At worst, it actively damages minority groups.
There are examples of AI algorithms already being used to aid human decision-making by reducing the impact of human cognitive biases. Because of how machine learning algorithms are trained, they can be more accurate and less biased than humans in the same position, resulting in fairer decision-making.
But, as we’ve shown, the opposite is also true. The risks of allowing human biases to be cooked into and amplified by AI may outweigh some of the possible benefits.
At the end of the day, AI is only as good as the data that it’s trained with. Developing unbiased algorithms requires extensive and thorough pre-analysis of datasets, ensuring that data is free from implicit biases. This is harder than it sounds because so many of our biases are unconscious and often hard to identify.
Challenges in Preventing AI Bias
In developing AI systems, every step must be assessed for its potential to embed bias into the algorithm. One of the major factors in preventing bias is ensuring that fairness, rather than bias, gets “cooked into” the algorithm.
Defining Fairness
Fairness is a concept that's notoriously difficult to define. In fact, the debate over what counts as fair has never reached a consensus. To make things even more difficult, when developing AI systems, fairness has to be defined mathematically.
For instance, in terms of the Amazon hiring algorithm, would fairness look like a perfect 50/50 split of male to female workers? Or a different proportion?
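One common way to pin fairness down mathematically is demographic parity: every group should receive positive decisions at the same rate. A minimal sketch, using hypothetical hiring decisions:

```python
def selection_rate(decisions):
    """Fraction of applicants receiving a positive decision (1 = hired)."""
    return sum(decisions) / len(decisions)

def demographic_parity_gap(decisions_by_group):
    """Largest difference in selection rates between any two groups.
    A gap of 0 means every group is selected at the same rate."""
    rates = [selection_rate(d) for d in decisions_by_group.values()]
    return max(rates) - min(rates)

# Hypothetical hiring decisions for two applicant groups.
decisions = {"men": [1, 1, 0, 1], "women": [1, 0, 0, 0]}
print(demographic_parity_gap(decisions))  # 0.75 - 0.25 = 0.5
```

Demographic parity is only one of several competing definitions; others, such as equalized odds, compare error rates across groups instead, and in most realistic situations it's mathematically impossible to satisfy all of them at once. That's part of why the fairness debate hasn't been settled.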
Determining the Function
The first step in AI development is to determine exactly what the system is going to achieve. In the COMPAS example, the algorithm's goal is to predict the likelihood of criminals reoffending. Then, clear data inputs need to be determined for the algorithm to work. This may require defining important variables, such as the number of previous offenses or the type of offenses committed.
Defining these variables properly is a difficult but important step in ensuring the fairness of the algorithm.
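As a sketch of what "defining the variables" can look like in practice, here is a hypothetical input schema for a reoffending-risk model (these field names are illustrative, not COMPAS's actual inputs):

```python
from dataclasses import dataclass

@dataclass
class OffenderRecord:
    age: int
    prior_offense_count: int
    most_serious_offense: str     # e.g., "theft", "assault"
    years_since_last_offense: float

def model_inputs(record: OffenderRecord) -> list[float]:
    """Convert a record into the numeric features the model actually sees.
    Note that most_serious_offense would need a categorical encoding, and
    how those categories are drawn is itself a fairness-sensitive choice."""
    return [
        float(record.age),
        float(record.prior_offense_count),
        record.years_since_last_offense,
    ]
```

Every choice here (which fields exist, how they're encoded, which are left out) shapes what the algorithm can and can't be biased about.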
Making the Dataset
As we’ve covered, a major cause of AI bias is incomplete, non-representative, or biased data. As in the case of the facial recognition AI, the input data needs to be thoroughly checked for bias, appropriateness, and completeness before the machine learning process begins.
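Such a check can be partly automated. As a rough sketch (hypothetical field names again), a pre-training audit might compare group sizes, outcome rates, and missing data across groups:

```python
def audit_dataset(records, group_key, label_key):
    """Print per-group sample counts, positive-label rates, and missing
    labels. Skewed outcome rates or missing data concentrated in one
    group are warning signs worth investigating before training."""
    groups = {}
    for record in records:
        groups.setdefault(record.get(group_key, "MISSING"), []).append(record)
    for group, rows in groups.items():
        labels = [r[label_key] for r in rows if label_key in r]
        rate = sum(labels) / len(labels) if labels else float("nan")
        print(f"{group}: n={len(rows)}, positive rate={rate:.2f}, "
              f"missing labels={len(rows) - len(labels)}")

# Hypothetical records with a group field and a binary outcome label.
audit_dataset(
    [{"group": "A", "label": 1}, {"group": "A", "label": 0},
     {"group": "B", "label": 0}, {"group": "B"}],
    group_key="group", label_key="label",
)
```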
Choosing Attributes
AI developers must also choose which attributes the algorithm will consider. Attributes can include gender, race, or education level, or anything else that may be relevant to the algorithm's task. Depending on which attributes are chosen, the predictive accuracy and the bias of the algorithm can be severely impacted. The problem is that it's very difficult to measure how biased an algorithm is.
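One widely used measure is the disparate impact ratio, which compares selection rates between the least- and most-favored groups. Here's a minimal sketch comparing the same metric under two attribute choices (the decisions are made up for illustration):

```python
def disparate_impact_ratio(decisions_by_group):
    """Ratio of the lowest group selection rate to the highest. A common
    rule of thumb (the US EEOC "four-fifths rule") treats a ratio below
    0.8 as evidence of adverse impact."""
    rates = [sum(d) / len(d) for d in decisions_by_group.values()]
    return min(rates) / max(rates)

# The same hypothetical model scored with and without a protected
# attribute included as an input feature.
with_attribute = {"group_a": [1, 1, 1, 0], "group_b": [1, 0, 0, 0]}
without_attribute = {"group_a": [1, 1, 0, 0], "group_b": [0, 1, 1, 0]}
print(disparate_impact_ratio(with_attribute))     # 0.25 / 0.75 -> 0.33
print(disparate_impact_ratio(without_attribute))  # 0.50 / 0.50 -> 1.00
```

Note that simply dropping a protected attribute rarely eliminates bias in practice, because other features (such as zip code) can act as proxies for it.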
AI Bias Isn't Here to Stay
AI bias occurs when biased or incomplete data is reflected or amplified during the development and training of an algorithm, leading it to make skewed or inaccurate predictions.
The good news is that with funding for AI research multiplying, we’re likely to see new methods of reducing and even eliminating AI bias.