It might appear that data — the information you find in a scientific article, a history book, or a set of survey results — is just a collection of objective facts. The numbers, if sourced well, are supposed to be the hard truth untarnished by opinions, perspectives, or biases. In reality, this is rarely the case. Here is how AI can be just as biased as the humans creating it.
Data is only as good as the means used to acquire it, the methods employed to interpret and analyze it, and how it’s communicated.
Missteps in these areas filter down into the conclusions we reach based on our data — and often result in the kinds of subjectivity and bias we want our “facts” to avoid.
Even with something simple like a survey, many factors — from the wording of the questions to the way we define our terms and the sample set we evaluate to form a judgment — potentially bring bias into our data. Bias impacts the conclusions we reach.
“I know there’s a proverb which says, ‘To err is human,’ but a human error is nothing to what a computer can do if it tries.” — Agatha Christie
Automated systems and artificial intelligence, which are devoid of human feelings, ideologies, and preferences, promise to help us avoid the subjectivity and bias inherent to human reasoning. However, humans must design and build these systems, which opens them up to the very same pitfalls.
Bias in AI has become such a widespread issue that Salesforce recently added an AI bias course to its educational platform, Trailhead, to teach entrepreneurs and other professionals about the dangers of homogeneous data.
The Consequences of AI Bias
Biases in automated systems have already had adverse effects on certain groups. For instance, Wisconsin attempted to utilize AI in its court system to determine the likelihood a criminal would re-offend in order to help a judge determine sentencing or parole.
Unfortunately, the program was biased against people of color and incorrectly flagged black individuals as more likely to commit another offense than their white counterparts.
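One way to surface this kind of skew is to compare how often each group is wrongly flagged as high risk. The Python sketch below is only an illustration under stated assumptions: the records are made up, the group names are placeholders, and the function name is hypothetical. It does not reproduce how any real court system or audit worked.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return, per group, the share of people who did NOT re-offend
    but were still flagged as high risk.

    records: iterable of (group, predicted_high_risk, reoffended) tuples.
    """
    negatives = defaultdict(int)  # people who did not re-offend
    flagged = defaultdict(int)    # of those, how many were flagged anyway
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            negatives[group] += 1
            if predicted_high_risk:
                flagged[group] += 1
    return {group: flagged[group] / count for group, count in negatives.items()}


# Made-up toy data for illustration only (not real case records).
toy_records = [
    ("group_a", True, False),
    ("group_a", True, False),
    ("group_a", False, False),
    ("group_a", True, True),
    ("group_b", True, False),
    ("group_b", False, False),
    ("group_b", False, False),
    ("group_b", False, False),
    ("group_b", True, True),
]

rates = false_positive_rate_by_group(toy_records)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
```

A large gap between groups in this rate is a signal worth investigating, though it is not proof of bias by itself, and it is only one of several fairness measures a team might track.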
What about peer-reviewed studies?
In another case, a peer-reviewed study determined that Amazon’s facial recognition technology had a harder time identifying women and individuals with darker skin. How did this happen? The training set (the example data fed into the system to teach the AI) lacked diversity.
Summary of court cases?
My company’s data extraction software is used for many purposes. One such goal is summarizing court cases for legal professionals. For the program to offer accurate summaries, the training set must include a broad range of case types. If the system were trained on only tort cases, for example, it would be far less effective in summarizing criminal cases.
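A simple check before training is to count how each category is represented in the training set. The sketch below assumes each training example carries a case-type label. The labels, the 10% threshold, and the function names are all hypothetical choices for illustration, not a description of any particular product.

```python
from collections import Counter


def underrepresented_categories(labels, min_share=0.10):
    """Return categories whose share of the training set is below min_share."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sorted(cat for cat, n in counts.items() if n / total < min_share)


def missing_categories(labels, expected):
    """Return expected categories that never appear in the training set."""
    return sorted(set(expected) - set(labels))


# Made-up label list for illustration: heavily skewed toward tort cases.
training_labels = ["tort"] * 18 + ["contract"] + ["criminal"]
expected_types = {"tort", "contract", "criminal", "family"}

thin = underrepresented_categories(training_labels)
absent = missing_categories(training_labels, expected_types)
print("Underrepresented:", thin)
print("Missing entirely:", absent)
```

The right threshold depends on the task. The point is only that gaps in coverage can be measured before a model is trained, rather than discovered after it performs poorly on the neglected cases.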
Machine Learning AI — and other AI learning that’s hard to spot.
These examples show just how damaging AI systems based on machine learning can be; when the data used to set them up fails to adequately represent what they’re studying or is less diverse than the cases they attempt to evaluate, the results can be unfortunate. But AI bias isn’t always so easy to spot.
In business, for example, using a biased dataset could result in leadership greenlighting a project that’s doomed to fail. When the project inevitably flops, the execution team might take the blame when it should be placed on the faulty assumptions the plan relied upon in the first place.
AI and automation can be hugely beneficial for businesses, and especially companies that rely on technology to perform work that used to require teams of people.
However, if the technology isn’t constructed in a way that keeps out bad data (and avoids distorting useful data), it might do more harm than good.
One proposed answer to this problem has been to encourage diversity across teams of data scientists — those who build the AI systems we rely on. The challenge is that we don’t always identify and promote the right kind of diversity to solve our bias problem.
Diversity Is More Than Skin Deep
When it comes to hiring data scientists, achieving what we usually think of as diversity (variance of ethnicity and gender) is not sufficient.
The value of diversity.
Of course, there’s value in having a workforce made up of men and women belonging to different ethnicities, religions, and cultures. However, diversity in these areas can still result in biased AI.
For example, a team of men and women from all around the world might build a system with gaps; they could all be trained using the same educational framework, for instance, or share the same personal views related to the items or issues in question.
People might share very similar approaches to solving a data problem if they’re products of the same culture, socioeconomic background, and political party. Homogeneity in these areas will inevitably bleed into the AI system the development team creates.
To avoid AI bias, we need to start thinking of diversity in terms of our ideas and points of view rather than just our skin tones and gender expressions. People who look different might very well think differently, too — but this isn’t always the case.
It’s impossible for humans to altogether avoid subjective preferences, mistaken beliefs, and various holes in our thought processes, so it is probably just as impossible to eliminate bias completely. But there are still steps we can take to mitigate the impact our preferences have on the systems we create.
Diversity of thought is an excellent place to start. To put together data science teams with the right kind of diversity, hiring managers must focus on finding people with unique perspectives and unconventional ideas.
These diverse attitudes and viewpoints often result from novel experiences and unorthodox training.
The challenge here, of course, is that a manager cannot ask candidates about their socioeconomic backgrounds, ages, cultural exposure, and religious and political beliefs during interviews.
How to diversify your team and all employees.
What they can do is ask more pointed questions related to the work potential hires would take on. For example, present candidates with an example of a data-modeling project. Ask them how they would go about framing the issue, gathering the data, and analyzing it.
If you already have team members who would approach the case using methods X and Y, consider it favorable if a candidate suggests a well-thought-out method Z.
An individual’s assertiveness is another key factor.
It does your team little good to add someone with a different way of thinking if she isn’t willing to speak up when she sees a problem. In order to avoid bias in analytical systems and AI tools, you need candidates who not only hold unique viewpoints but also know how to express them.
Lastly, it’s important to cultivate a culture of openness and collaboration within your tech development and data science teams. When team members can easily express their viewpoints, they can discover and correct errors.
Remember: When a team shares an overarching goal of being as objective as possible, bias is minimized in the process.
AI and automation are on the upswing, and these systems promise to help minimize the impact of human subjectivity and bias while increasing our efficiency and productivity. To accomplish this goal, however, they need to be constructed with great care.
To move forward with care, we must take diversity very seriously. This means more than a simple focus on appearances when hiring data scientists and system architects. Leaders need to dig deeper, work to bring in different points of view and people who can express them, and support an inclusive development culture.
Teams with diverse ideas and a collaborative structure are far more likely to build AI systems that will truly transcend the biases that we humans struggle to eliminate in ourselves.
Image credit: Unsplash; dhaval-parmar