Deborah Raji – a prominent Nigerian-Canadian computer scientist – recently took to Twitter to push back against increasingly testy discussions around algorithmic bias amongst computer scientists and the AI community. In this thread, she laid out a remarkably clear explanation of what is meant by algorithmic bias, and why this cannot be simply ‘fixed’ by better data practices and technical rigour.
Here is a summary of her tweets. As she started off by confessing … “I’m wary of wading into this conversation again, but it’s important to acknowledge the research”.
While growing numbers of people are willing to acknowledge that data is not neutral per se, there is still work to do here. One popular response from some technologists is that algorithms and AI models are not biased in and of themselves. Instead, algorithms and AI models simply amplify bias that might have crept into the data-sets that they are trained on, and/or the data that they are fed. As such, it might appear that any data-driven bias is ultimately fixable with better data.
As Deb Raji describes, this is not the case. Of course, it is right to acknowledge that the initial generation of data can reflect historical bias, and that the data-sets used to develop algorithmic models will often contain representation and measurement bias. However, every aspect of an algorithmic system is a result of programming and design decisions, and can therefore contain additional biases. These include decisions about how tasks are conceived and codified, as well as how choices are modelled. In particular, algorithmic models are also subject to what are termed aggregation biases and evaluation biases. All told, any outcome of an algorithmic model is shaped by a lot of subjective human judgements, interpretations and discretionary decisions along the way.
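To make one of these terms concrete, here is a minimal sketch of how evaluation bias can show up. It assumes a benchmark dominated by one group; the records, group names and helper functions are all hypothetical and are not taken from Raji's thread. The point is only that a single headline score can hide a large gap between groups.

```python
from collections import defaultdict

# Hypothetical evaluation records: (group, true_label, predicted_label).
# Group "A" makes up most of the benchmark; group "B" is barely represented.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0),
]


def accuracy(rows):
    """Share of rows where the prediction matches the true label."""
    return sum(true == pred for _, true, pred in rows) / len(rows)


def accuracy_by_group(rows):
    """Accuracy computed separately for each group in the records."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row)
    return {group: accuracy(group_rows) for group, group_rows in grouped.items()}


print(accuracy(records))           # headline score over the whole benchmark
print(accuracy_by_group(records))  # the same predictions, split by group
```

In this toy data the headline accuracy looks strong, while the model is right only half the time for group B. Which groups to report on, and which benchmark counts as representative, are themselves human design decisions.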
But this is not the end of the story! We also have to consider how algorithmic outputs and predictions are then used. Alex Albright points to the ways in which system recommendations are perceived differently by the humans tasked with acting upon them – for example, Kentucky judges more frequently over-riding predictive recommendations to grant bail to black defendants as compared to white defendants. As Albright puts it, rather than focus on the biases of predictive tools in isolation, we also need to consider how they are used in different contexts – not least social settings and institutional systems that are “chock-full” of human judgments, human discretions and human biases.
So, we can have biased data-sets, biased models, and biased contexts within which any algorithmic system is used. This means that algorithmic ‘bias’ is not simply a technical data problem, but a sociotechnical problem of humans and data … and therefore not something that can ever be ‘fixed’. Humans will always act in subjective ways, and our societies will always be unequal and discriminatory. As such, our data-driven tools will inevitably be at least as flawed as the worldviews of the people who make them and use them. Moreover, our data-driven tools are most likely to amplify existing differences and unfairness, unless they are deliberately designed to be biased towards more inclusive outcomes and ‘positive discrimination’. Either way, there cannot be a completely objective, neutral and value-free system – our societies and our technologies do not work like that!
Raji reminds us that computer scientists, technologists and data scientists often still take offense when confronted with talk of ‘algorithmic bias’ because of differences in terminology. In technical data terms, people are often trained to think about ‘bias’ in very precise terms of statistical bias – one of the data issues that any AI scientist will strive to minimise when developing accurate algorithms and machine learning models. As Kate Crawford puts it, “we are speaking different languages when we talk about bias”.
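For readers less familiar with the statistical sense of the word, the following sketch shows what ‘bias’ means there: the gap between an estimator's expected value and the true quantity it is estimating. The two-value population and the function names are hypothetical, chosen only so that the expectation can be computed exactly by listing every possible sample rather than by simulation.

```python
from itertools import product
from statistics import fmean

# Hypothetical population with a known variance, so bias can be computed exactly.
population = [0.0, 2.0]
population_mean = fmean(population)
true_variance = fmean([(x - population_mean) ** 2 for x in population])


def variance_estimate(sample, ddof):
    """Sample variance, dividing by (n - ddof)."""
    sample_mean = fmean(sample)
    squared_deviations = sum((x - sample_mean) ** 2 for x in sample)
    return squared_deviations / (len(sample) - ddof)


def statistical_bias(ddof, sample_size=2):
    """Expected estimate minus true value, over every equally likely sample."""
    all_samples = list(product(population, repeat=sample_size))
    expected_estimate = fmean([variance_estimate(s, ddof) for s in all_samples])
    return expected_estimate - true_variance


print(statistical_bias(ddof=0))  # dividing by n underestimates the variance
print(statistical_bias(ddof=1))  # dividing by n - 1 removes that bias
```

Bias in this sense is a property of an estimator that can be driven to zero by a technical correction. That is a very different thing from the social sense of bias under discussion here, which is why the two groups can end up talking past each other.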
To avoid such confusion, talk of social ‘bias’ is perhaps better framed as fairness, harm and discrimination. In computer science terms, these conversations are perhaps best understood as a problem with classification, and the classifications that technology developers ascribe to people and their social contexts. Addressing issues of statistical bias, under-representation and variance in data-sets is not the main problem being discussed here!