The real problem of AI bias

Recently, Amazon tried to build a machine learning system to automate the screening of resumes for hiring.

Amazon’s workforce is currently heavily skewed towards men, so the record of successful past hires is also skewed towards “male” data. Naturally, Amazon ended up with a system with the same bias.

Amazon realized this, and the system was never deployed to production.

The important point here is that the system was still biased, even though obvious information that could identify someone as a man or a woman had been removed from the resume data beforehand.

The system looked at the sample data of “successful employees” and found other patterns.

For example, women might describe their accomplishments differently than men, or might have played different sports in school. The system doesn’t know what ice hockey is, what a “person” is, or what “success” is; it simply performs a statistical analysis of the text data.
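To make “statistical analysis of the text” concrete, here is a minimal sketch using scikit-learn and a few invented resume snippets (none of this is Amazon’s actual data or model). The classifier only ever sees word counts; whatever words happen to correlate with the “hired” label get a large weight, whether or not they have anything to do with ability.

```python
# A minimal sketch: the model sees only word counts, never "people" or "success".
# The snippets and the hired/not-hired labels are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

resumes = [
    "captain of the ice hockey team, led the team to the championship",
    "executed an aggressive sales strategy and crushed every target",
    "coordinated a volunteer tutoring program for local students",
    "organized the chess club and mentored new members",
]
hired = [1, 1, 0, 0]  # hypothetical "successful hire" labels

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(resumes)           # each resume becomes a bag of word counts
model = LogisticRegression().fit(X, hired)

# Which words the model ended up associating with being hired or rejected.
weights = sorted(zip(model.coef_[0], vectorizer.get_feature_names_out()), reverse=True)
print("pushes toward hire:  ", weights[:3])
print("pushes toward reject:", weights[-3:])
```

If the words that happen to correlate with past hires are the words men tend to use, the model rewards those words, even though nobody ever told it anything about gender.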

However, the patterns it finds are not always the kind of thing a human would notice. We already know that men and women tend to use different words when talking about success, but even so, such biases are hard to spot in the data.

So far we have been talking about “people” and the nature of people, and much of the public discussion of AI bias centers on this.

But bias about “people” is only part of the problem. We use machine learning for far more than judging humans, and bias can creep into all of it. Moreover, even when you are building a system that is about people, the biases that end up in the data are not necessarily about people.

Let’s consider the following three cases to understand this:

1. Your data does not contain the different kinds of people in equal numbers. For example, the proportions of different skin tones in your photos are out of balance, and a system built on such data can make incorrect predictions depending on skin pigmentation (a toy version of this is sketched right after this list).
2. Your data contains features that are not actually meaningful but are reasonably prominent, and they are distributed unevenly. This is the case where the system learns from the ruler in a photo of skin cancer, or the meadow behind a flock of sheep.
3. Your data contains features that humans could not find even if they tried.
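Here is a minimal synthetic sketch of the first case, with made-up numbers: a classifier is trained almost entirely on one group (standing in for the over-represented skin tone), so it learns that group’s pattern, works well for that group, and is close to chance on the group it rarely saw.

```python
# A toy illustration of case 1: every number here is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, offset):
    """One-feature toy data; the healthy/unhealthy threshold shifts per group."""
    x = rng.normal(loc=offset, scale=1.0, size=(n, 1))
    y = (x[:, 0] > offset).astype(int)
    return x, y

# 950 training photos from the majority group, only 50 from the minority group.
X_maj, y_maj = make_group(950, offset=0.0)
X_min, y_min = make_group(50, offset=3.0)
model = LogisticRegression().fit(np.vstack([X_maj, X_min]),
                                 np.concatenate([y_maj, y_min]))

# Accuracy on fresh samples from each group.
print("majority group:", model.score(*make_group(1000, offset=0.0)))
print("minority group:", model.score(*make_group(1000, offset=3.0)))
```

The model is not “malicious”; it simply never saw enough of the minority group to learn its pattern.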
Now, back to the last case: what does “even if they tried” mean?

We humans have prior knowledge: things we already know we should pay attention to, for example, “there are differences between men and women.”

When we look at a photo with a ruler in it, we see the ruler, and then we ignore it, because we don’t consider it worth our attention. But the system has no such prior knowledge, and that is something we humans tend to forget.
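Here is a minimal synthetic version of that ruler problem (every feature and number below is invented): one column stands for a weak signal that is genuinely about the lesion, the other for “a ruler appears in the photo,” and the ruler flag is set up to correlate strongly with the label. The model ends up putting almost all of its weight on the ruler.

```python
# A toy version of the ruler problem; the features and numbers are made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
malignant = rng.integers(0, 2, size=n)            # ground-truth labels

# A weak, noisy signal that really is about the lesion itself.
lesion_signal = malignant + rng.normal(scale=2.0, size=n)

# In this toy setup, a ruler appears in 95% of malignant photos
# and only 5% of benign ones.
ruler = ((malignant == 1) & (rng.random(n) < 0.95)) | \
        ((malignant == 0) & (rng.random(n) < 0.05))

X = np.column_stack([lesion_signal, ruler.astype(float)])
model = LogisticRegression().fit(X, malignant)

print("weight on lesion signal:", round(model.coef_[0][0], 2))
print("weight on ruler flag:   ", round(model.coef_[0][1], 2))
```

The ruler is the easier pattern to exploit, so the ruler is what gets learned.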

What if all the photos of unhealthy skin had been taken in an office under incandescent light, and all the photos of healthy skin under fluorescent light?

Or maybe your phone’s OS was upgraded between the time the photos of healthy skin were taken and the time the photos of unhealthy skin were taken, and in that upgrade Apple (or Google) made some small change to the camera’s noise reduction.

This may be a difference that, no matter how hard we try, remains invisible to the human eye. But it is a difference that a machine learning system can easily pick up, and that difference will get used.
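As a last sketch, here is a synthetic version of that situation (the image size, the 20% difference in noise level, and everything else are invented): two batches of “photos” share exactly the same scene and differ only in sensor noise that no human would notice, yet a simple classifier can tell them apart almost perfectly from a single noise-related statistic.

```python
# A toy version of an invisible difference: identical scenes, slightly different noise.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
scene = np.tile(np.linspace(0.3, 0.7, 32), (32, 1))   # the same 32x32 "scene" in every photo

def take_photos(n, noise_std):
    return scene + rng.normal(scale=noise_std, size=(n, 32, 32))

old_pipeline = take_photos(500, noise_std=0.020)      # before the hypothetical OS update
new_pipeline = take_photos(500, noise_std=0.024)      # after: imperceptibly noisier

def noise_statistic(photos):
    """Spread of adjacent-pixel differences: a crude stand-in for the kind of
    statistic a neural network could learn to extract on its own."""
    return np.diff(photos, axis=2).std(axis=(1, 2)).reshape(-1, 1)

X = np.vstack([noise_statistic(old_pipeline), noise_statistic(new_pipeline)])
y = np.array([0] * 500 + [1] * 500)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)
print("accuracy at telling the two pipelines apart:", model.score(X_test, y_test))
```

If the label you actually care about (healthy versus unhealthy skin) happens to line up with which pipeline took the photo, the model can quietly learn the pipeline instead of the skin.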