
Biased Machine-Learning Models (Part 2):
Impact on the Hiring Process

By Sriram Parthasarathy

This is a follow-up to our previous blog on Biased Machine-Learning Models, in which we explored how machine-learning algorithms can be susceptible to bias (and what you can do to avoid that bias). Now, we’ll examine the impact biased predictive models can have on specific processes such as hiring.


Just recently, Amazon had to scrap a predictive recruiting tool because it was biased against women. How does something like this happen? Algorithms learn rules by looking at past data, and if the historical data is biased, the model is going to be biased as well. An even larger problem is that, once deployed, a machine-learning model automates that bias and keeps applying it to every new decision.

When Does Bias Become an Issue?

A company itself may not be biased in its hiring process, but the makeup of the current team will dictate how the algorithm scores applicants. The algorithm builds bias into its scores because of the imbalance that already exists in the input data, and this distorts how candidates are filtered in the future.

Let’s look at hiring scenarios where bias might become an issue. Say your sales team is composed mostly of 25-year-old white males. The algorithm will then interpret this as the ideal profile for a salesperson (age 25, white, male). If a woman or someone older than 25 applies, the algorithm will not give them a good score. Similarly, say your accounting team is composed mostly of 35-year-old women. Any men or younger women who apply will also score low.
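
The mechanism in these scenarios can be sketched in a few lines of Python. This is a toy illustration, not how any real recruiting model works: the team data, the age bands, and the `learn_profile` and `score` helpers are all assumptions made up for the example. The "model" simply learns how common each attribute value is on the current team and scores applicants by how closely they match.

```python
from collections import Counter

# Hypothetical historical data: a sales team that is mostly young and male.
team = (
    [{"age_band": "20-29", "gender": "male"}] * 8
    + [{"age_band": "20-29", "gender": "female"}]
    + [{"age_band": "30-39", "gender": "male"}]
)

def learn_profile(rows):
    """For each attribute, compute the share of the team holding each value."""
    profile = {}
    for key in rows[0]:
        counts = Counter(row[key] for row in rows)
        profile[key] = {value: n / len(rows) for value, n in counts.items()}
    return profile

def score(applicant, profile):
    """Average, over attributes, of the team share that matches the applicant."""
    shares = [profile[key].get(applicant[key], 0.0) for key in profile]
    return sum(shares) / len(shares)

profile = learn_profile(team)
typical = {"age_band": "20-29", "gender": "male"}     # resembles the team
atypical = {"age_band": "30-39", "gender": "female"}  # does not

print("typical applicant: ", score(typical, profile))
print("atypical applicant:", score(atypical, profile))
```

Nothing in the code states a preference for any group. The gap between the two scores comes entirely from the imbalance in the input data.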

Figure: Biased Candidate Model

Outside of the hiring process, the same logic can be applied to retail establishments renting out houses or even financial institutions approving loans. There may not be bias in the manual business workflow, but the historical data can create a bias in the automated predictions. For example, if an applicant lives in a neighborhood with a high concentration of young, educated Asians, the algorithm may penalize anyone who does not fit this demographic.

What Can We Do About It?

There are a variety of ways to deal with biased machine-learning models. First, look for and acknowledge any biased data. Next, apply resampling techniques such as undersampling, oversampling, SMOTE, or ROSE. You can also add class weights to counter the imbalance, especially to increase diversity. Or, to keep it simple, remove age, gender, and race as inputs to the model.
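
Three of these techniques can be sketched with the Python standard library alone. The data, the column names, and the helper names (`drop_columns`, `class_weights`, `random_oversample`) are hypothetical. SMOTE and ROSE, which generate synthetic rows rather than duplicating existing ones, are not shown. The weights use the common inverse-frequency heuristic, which is one reasonable choice rather than the only one, and balancing on a demographic column (instead of the target label) is an assumption made here to match the diversity goal.

```python
import random
from collections import Counter

def drop_columns(rows, protected):
    """Remove protected attributes so they cannot be used as model inputs."""
    return [{k: v for k, v in row.items() if k not in protected} for row in rows]

def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}

def random_oversample(rows, key, seed=0):
    """Duplicate rows from smaller groups until each group matches the largest."""
    rng = random.Random(seed)
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

# Hypothetical, imbalanced historical data (8 men, 2 women).
rows = (
    [{"gender": "male", "age": 25, "years_experience": 3}] * 8
    + [{"gender": "female", "age": 35, "years_experience": 6}] * 2
)

balanced = random_oversample(rows, key="gender")
weights = class_weights([row["gender"] for row in rows])
features = drop_columns(balanced, protected={"age", "gender"})

print(Counter(row["gender"] for row in balanced))
print(weights)
print(features[0])
```

Oversampling and class weights are alternatives for the same imbalance, so in practice you would typically pick one of them rather than apply both.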

Figure: Best Candidate Model

To learn more techniques for handling biased data, see our previous blog on biased machine-learning models.

Originally published October 19, 2018; updated on July 31st, 2020

About the Author

Sriram Parthasarathy is the Senior Director of Predictive Analytics at Logi Analytics. Prior to working at Logi, Sriram was a practicing data scientist, implementing predictive analytics for companies in healthcare and financial services and advising them on its use. Prior to that, Sriram was with MicroStrategy for over a decade, where he led and launched several product modules/offerings to the market.