This is a follow-up to our previous blog on Biased Machine-Learning Models, in which we explored how machine-learning algorithms can be susceptible to bias (and what you can do to avoid that bias). Now, we’ll examine the impact biased predictive models can have on specific processes such as hiring.
Just recently, Amazon had to scrap a predictive recruiting tool because it was biased against women. How does something like this happen? Algorithms learn rules by looking at past data, and if the historical data is biased, the model is going to be biased as well. An even larger problem is that, once deployed, a machine-learning model keeps applying that bias automatically to every new decision.
When Does Bias Become an Issue?
A company itself may not be biased in the hiring process, but the current makeup of the team will dictate how the algorithm scores applicants. The algorithm builds bias into its scores because of the imbalance that already exists in the input data, and this becomes a problem when those scores are used to filter candidates in the future.
Let’s look at hiring scenarios where bias might become an issue. Say your sales team is made up mostly of 25-year-old white men. The algorithm will then interpret this as the ideal profile for a salesperson (age 25, white, male). If a woman or someone older than 25 applies, the algorithm is unlikely to give them a good score. Similarly, say your accounting team is made up mostly of 35-year-old women. Any men or younger women who apply will also tend to score low.
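To make the scenario above concrete, here is a minimal sketch in Python. It is not a real recruiting model: `fit_profile` and `score` are hypothetical helpers that form a deliberately naive frequency-based scorer, and the ten past hires are made-up toy data. It only illustrates how a score learned from an imbalanced team ends up favoring the majority profile.

```python
from collections import Counter

def fit_profile(past_hires, features):
    """Record, per feature, the share of past hires that had each value."""
    n = len(past_hires)
    profile = {}
    for feature in features:
        counts = Counter(hire[feature] for hire in past_hires)
        profile[feature] = {value: count / n for value, count in counts.items()}
    return profile

def score(profile, applicant):
    """Multiply the shares of the applicant's values; unseen values score 0."""
    result = 1.0
    for feature, shares in profile.items():
        result *= shares.get(applicant[feature], 0.0)
    return result

# Hypothetical history: nine 25-year-old men and one 41-year-old woman.
past_hires = [{"age": 25, "gender": "male"} for _ in range(9)]
past_hires.append({"age": 41, "gender": "female"})

profile = fit_profile(past_hires, ["age", "gender"])
typical = {"age": 25, "gender": "male"}
atypical = {"age": 41, "gender": "female"}

# The applicant who resembles the existing team gets the higher score.
print(score(profile, typical), score(profile, atypical))
```

Nothing in this toy scorer looks at job performance; the gap between the two scores comes entirely from who was hired before.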
Outside of the hiring process, the same logic can be applied to businesses renting out houses or even financial institutions approving loans. There may not be bias in the manual business workflow, but the historical data can create a bias in the automated predictions. For example, if the historical data for a neighborhood is dominated by young, educated Asian residents, the algorithm may penalize any applicant who does not fit this demographic.
What Can We Do About It?
There are a variety of ways to deal with biased machine-learning models. First, look for and acknowledge any biased data. Next, apply sampling techniques such as undersampling, oversampling, SMOTE, or ROSE to rebalance the data. You can also add class weights to address the imbalance, especially to increase diversity. Or, to keep it simple, remove age, gender, and race as inputs to the model.
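Three of these options can be sketched in plain Python. This is a rough illustration under stated assumptions, not a production recipe: the helper names (`balanced_class_weights`, `random_oversample`, `drop_sensitive`) and the toy rows are hypothetical, the weights use the common "balanced" formula n / (k × class count), and the oversampler simply duplicates minority rows at random rather than synthesizing new ones the way SMOTE or ROSE would.

```python
import random
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}

def random_oversample(rows, labels, seed=0):
    """Duplicate random minority rows until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, label in zip(rows, labels) if label == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

def drop_sensitive(rows, fields=("age", "gender", "race")):
    """Return copies of the rows without the listed sensitive inputs."""
    return [{k: v for k, v in row.items() if k not in fields} for row in rows]

# Hypothetical, heavily imbalanced toy data: 8 men and 2 women.
rows = [{"gender": "male", "age": 25, "years_experience": 3} for _ in range(8)]
rows += [{"gender": "female", "age": 31, "years_experience": 5} for _ in range(2)]
labels = [row["gender"] for row in rows]

weights = balanced_class_weights(labels)            # rarer class gets the larger weight
resampled_rows, resampled_labels = random_oversample(rows, labels)
cleaned = drop_sensitive(resampled_rows)            # model inputs without age/gender/race

print(weights)
print(Counter(resampled_labels))
```

In practice you would usually pick one rebalancing approach rather than stacking them, and reach for a maintained library implementation instead of hand-rolled helpers like these.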
To learn more techniques for handling biased data, see our previous blog on biased machine-learning models.