Algorithmic Bias

When algorithms emulate and amplify stereotypes

Erima Goyal
Towards Data Science

--

Who survived the Titanic disaster?

Recall the scene from the Titanic movie where the men near the lifeboats were shouting, "WOMEN AND CHILDREN FIRST!" When budding data scientists explore the famous Titanic dataset (also available on Kaggle) to predict who survived, the first few steps of exploratory data analysis lead to the same conclusion: women and children had a higher probability of surviving. The numbers also suggest that passengers seated in the higher classes had a better chance of surviving than those in the lower classes, hinting at bribery or a status bias.
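
For the curious, here is a minimal EDA sketch, assuming the Kaggle Titanic training file (train.csv, with columns Survived, Sex, and Pclass) is available locally; the group means below are exactly the survival rates discussed above.

```python
# A minimal EDA sketch, assuming the Kaggle Titanic training file
# ("train.csv" with columns Survived, Sex, Pclass) is available locally.
import pandas as pd

df = pd.read_csv("train.csv")

# Survival rate split by gender: women survived at a far higher rate than men.
print(df.groupby("Sex")["Survived"].mean())

# Survival rate split by passenger class: 1st class fared better than 3rd.
print(df.groupby("Pclass")["Survived"].mean())
```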

Now, if someone randomly asks me, "Did ABC survive the Titanic disaster?", my response would be, "Was ABC female, and was she or he rich?" I have just magnified the financial-status bias and gender bias that my cognitive system built up from watching the movie and exploring the dataset.

If, hypothetically, the Titanic dataset were used by a travel agency to price cruise insurance, a model trained on it would predict a higher survival rate for women than for men and would therefore charge women lower premiums.

Bias in mainstream applications

Bias and the credit limits
In 2019, Apple launched a credit card in conjunction with Goldman Sachs. Things took a wrong turn when tech entrepreneur David Heinemeier Hansson tweeted that he was approved for a credit limit 20 times higher than the one his wife was approved for, even though they filed taxes jointly. Many others, including Apple co-founder Steve Wozniak, said the same thing had happened to them. If algorithms were the ones setting the credit limits here, then with everything else held constant (taxes, assets, and liabilities), the limit predicted for a woman was lower than the limit predicted for a man.

Bias and job portals
A few years back, Amazon created an internal recruiting tool that ranked resumes based on their fit with the job description and historical role definitions. It was later discovered that the tool was not generating gender-neutral rankings: women received lower ranks than men when assessed by it.

Bias and facial recognition
Many facial recognition technologies have entered the mainstream in the form of biometric payments, airport immigration checks, built-in camera applications, and various social media networks. However, there have been increasing incidents of people of color being misidentified by these applications. As a Reuters article, "U.S. government study finds racial bias in facial recognition tools," reported, the tools identified the gender of white faces with much higher accuracy than that of non-white faces.

How does it happen?

Bias in training data
In the case of the Apple credit card, the credit limits for new customers were most likely predicted by algorithms trained on historical data. Historically, before algorithms took over, human underwriters made the credit-limit decisions, and there may well have been unintentional human bias in those decisions. For several reasons, such as higher job security among men and women being more likely to take longer career breaks for childbirth and childcare, underwriters could have given higher credit limits to men and lower limits to women.

In the case of the recruiting tool, the models were trained on historical resumes that mostly came from men, tech being a male-dominated industry. Hence, the tool penalized phrases like "women's chess club" on resumes, resulting in lower rankings for women.

When algorithms train on historical data, they echo and compound the stereotypes and mistakes that already exist in society and industry, thereby deprioritizing certain population segments.
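
A toy sketch, on entirely synthetic data with hypothetical column names, of how this echoing works: a regression trained on historically biased credit limits simply reproduces the gender gap for otherwise identical applicants.

```python
# A toy sketch (synthetic data, hypothetical column names) showing how a model
# trained on historically biased decisions reproduces that bias.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 5_000
income = rng.normal(80_000, 20_000, n)
is_female = rng.integers(0, 2, n)

# Historical "human" decision: same income, but women were granted ~20% less.
historical_limit = 0.3 * income * np.where(is_female == 1, 0.8, 1.0)

X = np.column_stack([income, is_female])
model = LinearRegression().fit(X, historical_limit)

# Identical applicants except for gender: the model echoes the historical gap.
print(model.predict([[80_000, 0], [80_000, 1]]))
```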

Limited variability in training data
In the case of facial recognition, one of the key reasons racial bias gets injected into the algorithms is that the training data is heavily skewed towards the white population. If the training data had appropriate representation from all segments of the population, the bias probably would not creep in.

Similarly, if we were to train a breast cancer algorithm on a female-only population, it would produce wild results if men appeared in the prediction data. Don't get me wrong: breast cancer can happen in men too!
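
A quick representation check along these lines might look like the sketch below; the DataFrame and its sex column are hypothetical stand-ins for real training data.

```python
# A quick representation check, assuming a hypothetical training DataFrame
# with a "sex" column: grossly skewed counts are an early warning sign.
import pandas as pd

train = pd.DataFrame({"sex": ["F"] * 980 + ["M"] * 20})  # stand-in for real data

counts = train["sex"].value_counts(normalize=True)
print(counts)

# Flag any group that makes up less than, say, 10% of the training data.
under_represented = counts[counts < 0.10]
print("Under-represented groups:", list(under_represented.index))
```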

Direct bias features
As data scientists, when we develop algorithms, our tendency is to chase the highest possible prediction accuracy. We embark on a feature extraction journey and throw everything possible (gender, income, address) into the model to boost accuracy. We rarely stop to ask: is it ethical to make this prediction based on gender? If humans historically considered gender, consciously or subconsciously, when making credit decisions, should my algorithm do so too? Including gender as a feature is not always a bad thing; on some occasions, such as predicting the probability of breast cancer, including a gender feature is entirely appropriate.
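
A minimal sketch of the simplest remedy for direct bias features, dropping protected attributes before modeling; the column names here are hypothetical. Note that this alone does nothing about the indirect proxies discussed next.

```python
# A minimal sketch of keeping protected attributes out of the feature set.
# Column names are hypothetical, not from any specific dataset.
import pandas as pd

applications = pd.DataFrame({
    "income": [55_000, 72_000],
    "debt": [5_000, 12_000],
    "gender": ["F", "M"],
})

PROTECTED = ["gender"]  # features we deliberately exclude from the credit model

X = applications.drop(columns=PROTECTED)
print(X.columns.tolist())  # ['income', 'debt']
```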

Indirect bias features
While variables such as gender and income introduce bias directly, other variables, such as club memberships and sorority or fraternity affiliations, can introduce bias indirectly. A fraternity affiliation implies the customer is male and a sorority affiliation implies the customer is female, while membership payments to an Indian club in transaction data might indirectly profile the customer as Asian. If not de-biased, all of these variables act as proxies for one bias or another.
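
One hedged way to hunt for such proxies is to check how strongly each candidate feature predicts the protected attribute; the toy data and column names below are hypothetical.

```python
# A sketch for spotting proxy features: check how strongly each candidate
# feature reveals the protected attribute (toy data, hypothetical columns).
import pandas as pd

df = pd.DataFrame({
    "gender":     ["F", "F", "M", "M", "F", "M"],
    "sorority":   [1, 1, 0, 0, 1, 0],
    "fraternity": [0, 0, 1, 1, 0, 1],
})

# A contingency table makes the leakage obvious: sorority membership
# perfectly reveals gender in this toy data.
print(pd.crosstab(df["gender"], df["sorority"], normalize="index"))
```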

What does the future entail?

For data scientists and developers
As we chase the goal of maximizing accuracy, we should also look at other outputs of the process. For example, most tree-based algorithms (decision trees, random forests, XGBoost, etc.) produce feature importance metrics which, if properly analyzed, can pre-empt bias creeping into the algorithms.
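
A minimal sketch of that kind of check, on synthetic data with hypothetical column names: fit a tree-based model and see whether a protected attribute (or a suspected proxy) ranks near the top of the importance list.

```python
# Inspect feature importances from a tree-based model and flag protected
# attributes (or suspected proxies) that rank near the top.
# Data and column names are synthetic/hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "income": rng.normal(60_000, 15_000, 1_000),
    "debt": rng.normal(10_000, 4_000, 1_000),
    "is_female": rng.integers(0, 2, 1_000),
})
# Biased synthetic target: approval depends partly on gender.
y = ((X["income"] - 0.5 * X["debt"] - 8_000 * X["is_female"]) > 40_000).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)

importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
# If "is_female" or a known proxy shows high importance, investigate before shipping.
```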

For academia
Traditionally, oversight of regulations around bias in society, the workplace, and elsewhere has sat with legal departments. Maybe academia needs to create specialized law programs that train lawyers to understand bias in algorithms and perform a "law audit" of machine learning products. With appropriate knowledge of algorithms, legal should be part of AI teams.

For a wider AI network
Just as we have a series of algorithms, ranging from logistic regression to CNNs, to automate myriad backend decisions and predictions, we should create a series of algorithms, each detecting a different kind of bias: for example, an XGBoost model for racial bias in the model, a deep learning model for income bias, and so on.
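
As a hedged illustration of what one such bias detector could look like, the sketch below computes a simple demographic-parity gap, the difference in positive-prediction rates across groups, on hypothetical predictions.

```python
# A simple "bias detector": compare a model's positive prediction rate
# across groups (demographic parity gap). Inputs are hypothetical.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rate between groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

# Toy predictions: group "A" is approved far more often than group "B".
preds  = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
print(demographic_parity_gap(preds, groups))  # 0.6 gap -> flag for review
```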

--

Erima has 10+ years of experience in the data science space. She currently leads the data science team at Parkland Corporation.