How to overcome the potential for unintended bias in data algorithms
- by 7wData
Anyone who has been online recently may have heard that scary, biased algorithms are running wild, unchecked. They sentence criminals and decide who gets fired, who gets hired, and who gets loans.
If you read many of the latest articles and books, it is natural to have a visceral negative response. Of course the prospect of racist, sexist robots making important decisions that affect people is terrifying.
While some of the media frenzy is warranted, these issues are not always so clear-cut. Like people, algorithms should not be stereotyped.
Algorithms have the potential to help us overcome rampant human bias. They also have the potential to magnify and propagate that bias. I firmly believe this is a real issue, and it is the duty of data scientists to audit their algorithms for bias.
However, even for the most careful practitioner, there is no clear-cut definition of what makes an algorithm “fair.” In fact, there are many competing notions of fairness, with unavoidable trade-offs among them when dealing with real-world data.
Let’s talk about three types of algorithms:
The less obvious cases described in #3 can get very interesting and controversial. Not all of these algorithms are running wild unchecked, and some have issues that are not the fault of the algorithm, but simply a reflection of what the world is like. How much are the things that matter in making the decision tied to demographic class?
Let’s say I run a bank and I don’t want to give a home loan (which we will assume is several hundred thousand dollars) to anyone who makes under $15,000 per year. This is a very simple algorithm. Most of us can agree that income is an important factor in the loan decision, but this rule will lead to varied treatment of different classes, since income levels are distributed differently across ethnicity, gender, and age. If the outcome of my decision is that a smaller percentage of one group gets loans compared with another, many people would argue the simple algorithm is unfair.
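The one-rule bank example can be sketched in a few lines. The $15,000 threshold comes from the text; the applicant records below are invented purely to show how a group-blind rule still produces different approval rates when incomes are distributed differently across groups.

```python
INCOME_THRESHOLD = 15_000  # the cutoff from the example above

def approve_loan(annual_income: float) -> bool:
    """Approve anyone whose annual income meets the threshold."""
    return annual_income >= INCOME_THRESHOLD

# Toy applicants (invented). The rule never looks at "group",
# yet approval rates per group still differ.
applicants = [
    {"group": "A", "income": 40_000},
    {"group": "A", "income": 12_000},
    {"group": "B", "income": 9_000},
    {"group": "B", "income": 14_000},
]

for group in ("A", "B"):
    members = [a for a in applicants if a["group"] == group]
    rate = sum(approve_loan(a["income"]) for a in members) / len(members)
    print(f"Group {group} approval rate: {rate:.0%}")
```

With these toy incomes, group A is approved half the time and group B not at all, even though the rule itself contains no demographic information.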
What makes an algorithm “fair?” Let’s say I have a lot more data besides income - things like credit score, job history, etc. I have a large dataset of past outcomes to train an algorithm for future use.
Aiming for accuracy alone will almost certainly result in different treatment of people along age, race, and gender lines. To be fair, should I aim to approve the same percentage of people from each class, even if that means taking some risks?
Alternatively, I could train my algorithm to equalize the percentage of people from each class that get approved who actually paid back their loan (the true-positive rate which we can estimate from historical data).
There is a catch: if I do either of these things, I have to hold the different groups to different standards. Specifically, I would issue a loan to someone of one class but deny someone of a different class with the exact same credentials, leading to yet another unfair scenario.
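The catch above can be made concrete with a small sketch. The credit scores and thresholds below are invented for illustration: equalizing approval rates across two groups with different score distributions forces a different cutoff for each group.

```python
def approval_rate(scores, threshold):
    """Fraction of a group whose score meets the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

# Toy credit scores for two groups (invented).
scores = {
    "A": [620, 640, 680, 700, 720, 750],
    "B": [580, 600, 610, 650, 670, 690],
}

# One shared standard: unequal approval rates (4/6 vs 2/6).
for group, s in scores.items():
    print(group, approval_rate(s, 660))

# Group-specific thresholds chosen so both groups approve 4/6 --
# but now a B applicant scoring 650 is approved while an A applicant
# with the same 650 would be rejected.
group_thresholds = {"A": 660, "B": 605}
for group, s in scores.items():
    print(group, approval_rate(s, group_thresholds[group]))
```

Either way something gives: one shared standard yields unequal approval rates, while equal approval rates require holding identical applicants to different standards.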
To see how this works with some data, I highly recommend playing with this interactive site created by Google with some artificial credit score data. When determining who gets a loan given that there are two subgroups with different credit score distributions in the data, there is no way to win.
Specifically, there is no situation where you can hold everyone to the same standard (credit score threshold), while also achieving the same approval percentage in each group and the same percentage of true positives (people who should get loans who actually get one).
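A minimal sketch of that no-win situation, using invented scores and repayment labels for two groups: at a single shared threshold, both the approval rate and the true-positive rate come out different for each group.

```python
# Toy data (invented): (credit_score, repaid_in_the_past) per applicant.
applicants = {
    "A": [(700, True), (680, True), (640, False), (620, True)],
    "B": [(660, True), (630, True), (600, False), (590, True)],
}

def metrics(group, threshold):
    """Approval rate and true-positive rate at a given score threshold."""
    approval_rate = sum(score >= threshold for score, _ in group) / len(group)
    repayers = [score for score, repaid in group if repaid]
    # True-positive rate: fraction of actual repayers who get approved.
    tpr = sum(score >= threshold for score in repayers) / len(repayers)
    return approval_rate, tpr

for name, group in applicants.items():
    ar, tpr = metrics(group, 650)
    print(f"Group {name}: approval {ar:.0%}, true-positive rate {tpr:.0%}")
```

Here group A gets 50% approval with a 67% true-positive rate while group B gets 25% and 33%: one standard, two very different outcomes, matching the bind the interactive site demonstrates.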
Data can be biased because it is not diverse or representative, or it can be “biased” simply because that is what the world is like - what we call unequal base rates in the data.
Algorithms trained to associate words by their meaning and context (like word2vec) do not strongly associate “woman” and “physicist.”
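Word associations in models like word2vec are typically measured with cosine similarity between word vectors. A real model is trained on a large corpus; the three-dimensional vectors below are invented purely to show the mechanics of the measurement.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 = same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy embeddings (invented), mimicking a weak learned association
# between "woman" and "physicist" relative to "woman" and "man".
vectors = {
    "woman":     [0.9, 0.1, 0.2],
    "man":       [0.8, 0.3, 0.1],
    "physicist": [0.1, 0.9, 0.3],
}

print(cosine_similarity(vectors["woman"], vectors["physicist"]))
print(cosine_similarity(vectors["woman"], vectors["man"]))
```

In a trained model these similarities reflect the training corpus, which is exactly how unequal base rates in text end up encoded as weak or strong word associations.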