Technology has undoubtedly disrupted the way we live our lives. Among all the good it has vowed to bring is the promise of ridding us of pernicious institutions such as racism and sexism. Artificial Intelligence, or AI, is touted as the modern man's saviour, a Messiah that will liberate us from the messy and clunky process of decision-making. But is that really the case?
To put it simply, Artificial Intelligence is nothing but code that keeps refining itself by learning from historical data, logic, and decisions. That means the more data you feed it, the more likely it is to make the right choice or produce the right outcome, right? But what if the data you have been feeding it is skewed? What if the training dataset has already been manipulated? Would you still expect the best outcome? No, you wouldn't. And that is exactly what is happening now. Most of the data we feed to AI systems is already biased in some way, which makes the outcomes biased as well.
This is what a team of researchers from MIT found while working on an income prediction model. The system was twice as likely to miscategorise the income of women as 'low-income' and that of male employees as 'high-income'. To correct the bias, the researchers had to increase the size of the dataset by a factor of 10, which decreased the bias by 40 percent.
This is just one example. The AI-based services of the Big Four that we use constantly carry subtle racism and sexism too. If you use Google Translate to translate from a gender-neutral language into a gendered one, it will most likely convert most of the pronouns to male ones.
COMPAS, an AI system used by US courts, has faced a tremendous backlash after researchers revealed that it gives skewed predictions, based on the racial profiles of convicts, of how likely a defendant is to end up in jail again. And this is just the tip of the iceberg. The actual problem is much more deep-rooted. Attention must be paid when training a system, particularly with large datasets. Because datasets are riddled with the imbalances and biases of our social infrastructure, the data needs to be checked.
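As a rough illustration of what checking the data could look like, the sketch below measures how well each group is represented in a dataset and how often a model misclassifies each group. The records, group names, and function names are all hypothetical toy values made up for this example; they are not taken from the MIT study or from COMPAS.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (group, true_label, predicted_label).
records = [
    ("female", "high", "low"),
    ("female", "high", "high"),
    ("female", "low", "low"),
    ("male", "high", "high"),
    ("male", "high", "high"),
    ("male", "low", "high"),
    ("male", "low", "low"),
    ("male", "high", "high"),
]


def group_shares(rows):
    """Return each group's share of the dataset."""
    counts = Counter(group for group, _, _ in rows)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def error_rates(rows):
    """Return the fraction of misclassified records per group."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in rows:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


shares = group_shares(records)
rates = error_rates(records)
for group in shares:
    print(f"{group}: share={shares[group]:.2f}, error rate={rates[group]:.2f}")
```

A large gap between groups in either number is a signal worth investigating before the system is trusted, though these two figures alone cannot prove that a dataset or model is fair.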
The system should also include information on how the data was collected and annotated. Datasets accompanied by such metadata can help to weed out bias as well.
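One way such metadata might travel with a dataset is as a small structured record. The sketch below is only an assumption about what that record could contain: the `DatasetCard` class, its fields, and the sample values are hypothetical and do not follow any particular standard.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class DatasetCard:
    """Hypothetical metadata record stored alongside a dataset."""
    name: str
    collection_method: str
    annotation_process: str
    known_gaps: list = field(default_factory=list)

    def missing_fields(self):
        """List the descriptive text fields that were left blank."""
        return [key for key, value in asdict(self).items()
                if isinstance(value, str) and not value.strip()]


# Illustrative values only; the annotation process is deliberately left blank.
card = DatasetCard(
    name="income-survey-sample",
    collection_method="Voluntary online survey",
    annotation_process="",
    known_gaps=["Few respondents over 65"],
)

print(json.dumps(asdict(card), indent=2))
print("Undocumented fields:", card.missing_fields())
```

Flagging blank fields makes it harder for a dataset to be reused without anyone knowing how it was gathered or labelled.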
If we don't train these systems on unbiased data, the prospect of a future without bias will only grow bleaker.