Your Data Is Biased, Here’s Why
- by 7wData
Biased data can lead to bad decisions. Most business leaders aren't aware of the problem just yet, but they need to be because they're ultimately responsible.
Bias is everywhere, including in your data. A little skew here and there may be fine if the ramifications are minimal, but bias can negatively affect your company and its customers if left unchecked, so you should make an effort to understand how, where and why it happens.
"Many [business leaders] trust the technical experts but I would argue that they're ultimately responsible if one of these models has unexpected results or causes harm to people's lives in some way," said Steve Mills, a principal and director of machine intelligence at technology and management consulting firm Booz Allen Hamilton.
In the financial industry, for example, biased data may produce results that violate the Equal Credit Opportunity Act (fair lending). That law, enacted in 1974, prohibits credit discrimination based on race, color, religion, national origin, sex, marital status, age or receipt of public assistance income. While lenders take steps to exclude such data from loan decisions, a seemingly neutral field can stand in for it: in some cases race can be inferred from a zip code, for example.
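One rough way to see whether a field such as zip code acts as a stand-in for a protected attribute is to ask how much better you can guess the attribute once you know the field. The sketch below is illustrative only: the function name proxy_strength, the group labels and the toy records are all hypothetical, and a real fair-lending review would rely on far more than this single number.

```python
from collections import Counter, defaultdict


def proxy_strength(records):
    """Compare two ways of guessing a protected group.

    records: iterable of (feature_value, protected_group) pairs.
    Returns (baseline, by_feature):
      baseline   = accuracy of always guessing the most common group overall
      by_feature = accuracy of guessing the most common group within each
                   feature value (e.g. within each zip code)
    A large gap suggests the feature carries information about the group.
    """
    records = list(records)
    if not records:
        raise ValueError("records must not be empty")

    overall = Counter(group for _, group in records)
    baseline = overall.most_common(1)[0][1] / len(records)

    per_value = defaultdict(Counter)
    for value, group in records:
        per_value[value][group] += 1
    correct = sum(c.most_common(1)[0][1] for c in per_value.values())
    return baseline, correct / len(records)


# Hypothetical toy data: (zip_code, group). Not real figures.
sample = [
    ("90001", "A"), ("90001", "A"), ("90001", "A"), ("90001", "B"),
    ("90210", "B"), ("90210", "B"), ("90210", "B"), ("90210", "A"),
]
baseline, by_zip = proxy_strength(sample)
print(f"baseline={baseline:.2f} by_zip={by_zip:.2f}")
# prints: baseline=0.50 by_zip=0.75
```

In this toy data, knowing the zip code lifts the guess from 50% to 75% accuracy, which is the kind of gap that would prompt a closer look at the field even though the protected attribute itself was never used.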
"The best example of [bias in data] is the 2008 crash in which the models were trained on a dataset," said Shervin Khodabandeh, a partner and managing director of Boston Computing Group (BCG) Los Angeles, a management consulting company. "Everything looked good, but the datasets changed and the models were not able to pick that up, [so] the model collapsed and the financial system collapsed." Â
What Causes Bias in Data
A considerable amount of data has been generated by humans, whether it's the diagnosis of a patient's condition or the facts associated with an automobile accident. Quite often, individual biases are evident in the data, so when such data is used to train a machine learning model, the machine intelligence reflects that bias. A prime example was Microsoft's infamous AI bot, Tay, which in less than 24 hours adopted the biases of certain Twitter members. The result was a string of shocking, offensive and racist posts.
"There's a famous case in Broward County, Florida, that showed racial bias," said Mills. "What appears to have happened is there was historically racial bias in sentencing so when you base a model on that data, bias flows into the model. At times, bias can be extremely hard to detect and it may take as much work as building the original model to tease out whether that bias exists or not."
What Needs to Happen
Business leaders need to be aware of bias and the unintended consequences biased data may cause. In the longer term, data-related bias is a governance issue that needs to be addressed with appropriate checks and balances, including awareness, mitigation and a game plan should matters go awry.
"You need a formal process in place, especially when you're impacting people's lives," said Booz Allen Hamilton's Mills.