Wednesday, July 31, 2024

What if discussions of bias in AI were mostly biased themselves?


Bias in AI is too often a bee in someone’s bonnet: bias turns out to be whatever someone expects it to be. What if discussions of bias in AI were mostly biased themselves?

When AI is mentioned, it is only a matter of time before the word ‘bias’ is heard, especially in debates around AI in education. And when I ask for a clear example, I don’t get examples from learning. It is not that bias is not an issue – it is. But it is complex and not nearly as bad or as consequential as most people think.

The OECD asked me to contribute to a project on AI in learning this week. They are nice people but their obsession with the word ‘bias’ bordered on the bizarre. Their whole world view was framed by this one word. I had to politely say ‘no thanks’, as I’m not keen on projects where the focus is not on objectivity but on shallow moralising. My heart also sinks when, all too commonly, I get the ‘Isn’t AI biased?’ type of question after keynotes at conferences. It’s the go-to question for the disaffected. No matter how patiently you answer, you can see they’re not listening.

To be fair, AI is for most people an invisible force, the part of the iceberg that lies below the surface. AI is many things, can be technically opaque, and its causality can be difficult to trace. That’s why the issue needs some careful unpacking.

Argument from anecdote/meme

You also hear the same very old examples being brought up time and time again: the black face/gorilla incident, recruitment software and reoffender software. Most of these examples have their origin in Cathy O’Neil’s Weapons of Math Destruction or internet memes. This simple contrarianism lies behind much of the debate, fuelled by one book…

Weapons of Math Destruction was really just a sexed-up dossier on AI. It is an unfortunate title, as O’Neil’s supposed WMDs are as illusory as the now mythical WMDs of the original ‘sexed-up dossier’ – the evidence similarly weak, exaggerated and cherry-picked. This is the go-to book for those who want to stick it to AI by reading a pot-boiler. But rather than taking an honest look at the subject, O’Neil takes the ‘Weapons of Math Destruction’ line too literally, unwittingly re-using a phrase that has come to mean exaggeration and untruth. She tries too hard to be a clickbait contrarian.

The first example concerns a teacher who is supposedly sacked because an algorithm said she should be sacked. Yet the true cause, as revealed by O’Neil herself, was other teachers who had cheated on behalf of their students in tests. Interestingly, they were caught through statistical checking, as too many erasures were found on the test sheets. The second example is worse. Nobody really thinks that US College Rankings are algorithmic in any serious sense. The ranking models are quite simply statistically wrong. It is a straw man, as they use subjective surveys and proxies, and everybody knows they are gamed. Malcolm Gladwell did a much better job of exposing them as self-fulfilling exercises in marketing. In fact, most of the problems uncovered in the book, if one does a deeper analysis, are human.

The chapter headings are also a dead giveaway: Bomb Parts, Shell Shocked, Arms Race, Civilian Casualties, Ineligible to Serve, Sweating Bullets, Collateral Damage, No Safe Zone, The Targeted Civilian and Propaganda Machine. This is not 9/11, and the language of WMDs is hyperbolic.

At times O’Neil makes good points on data – small data sets, subjective survey data and proxies – but this is nothing new and features in any 101 statistics course. The mistake is to pin the bad-data problem on algorithms and AI – that is often a misattribution. Time and time again we get straw men in online advertising, personality tests, credit scoring, recruitment, insurance and social media. Sure, problems exist, but posing marginal errors as a global threat is a tactic that may sell books but is hardly objective. In this sense, O’Neil plays the very game she professes to despise – bias and exaggeration.

The final chapter is where it all goes badly wrong, with the laughable Hippocratic Oath. The first line of her imagined oath – “I will remember that I didn’t make the world, and it doesn’t satisfy my equations” – is a flimsy one. There is, however, one interesting idea: that AI be used to police itself. A number of people are working on this, and it is a good example of seeing technology realistically, as a force for both good and bad, where the good will triumph if we use it for human good. This book relentlessly lays the blame for all kinds of injustices at the door of AI, but mostly it exaggerates or fails to identify the real root causes. It provides exaggerated analyses and rarely the right answers.

Argument using ‘downsides’ only

The main bias in debates on bias is a form of one-sidedness in the analysis. Most technology has a downside. We drive cars despite the fact that 1.2 million people die gruesome and painful deaths every year in car accidents – world-war-level figures, not counting those with life-changing injuries. Rather than tease out the complexity, or even compare upsides with downsides, we are given the downsides only. The proposition that ALL algorithms and data are biased is as foolish as the idea that all algorithms and data are free from bias. This is a complex area that needs careful thought, and the truth lies, as usual, somewhere in between. Technology often has this cost-benefit feature. To focus on just one side is quite simply a mathematical distortion.

Anthropomorphic arguments

We have already mentioned anthropomorphic bias: reading ‘bias’ into software is often the result of over-anthropomorphising. Availability bias arises when we frame thoughts around what is easily called to mind, rather than pure reason. So crude images of robots, or fixed databases from which data is retrieved, come to characterise AI, as opposed to complex LLMs, vector databases, software or mathematics, which are not, for most, easy to visualise. This skews our view of what AI is and of its dangers, often producing dystopian ‘Hollywood’ perspectives rather than objective judgement. Then there’s negativity bias, where the negative has more impact than the positive, so the Rise of the Robots and other dystopian visions come to mind more readily than positive examples such as fraud detection or cancer diagnosis. Most of all we have confirmation bias, which leaps into action whenever we hear of something that seems like a threat and we want to confirm our view of it as ethically wrong. Indeed, the accusation that all algorithms are biased is often (not always) a combination of ignorance about what algorithms are and these four human biases – anthropomorphism, availability, negativity and confirmation. It is often a sign of bias in the objector, who wants to confirm their own deficit-based weltanschauung and apply a universal, dystopian interpretation to AI, with a healthy dose of neophobia (fear of the new).

Argument ignoring human biases

Too many discussions around bias in AI ignore the baseline, the existing state of affairs. In learning, that baseline is human delivery. Here we must recognise that human teachers and learners are packed with biases. Daniel Kahneman won a Nobel Prize for work identifying many of them, and it was already a long list.

First, it is true that ALL humans are biased, as shown by Nobel Prize-winning psychologist Daniel Kahneman and his colleague Amos Tversky, who exposed a whole pantheon of biases that we are largely born with and that are difficult to shift, even through education and training. Teaching is soaked in bias. There is socio-economic bias in policy, as it is often made by those who favour a certain type of education. Education can be bought privately, introducing inequalities. Gender, race and socio-economic bias are often found in the act of teaching itself. We know that gender bias is present in subtly directing girls away from STEM subjects, and we know that children from lower socio-economic groups are treated differently. Even so-called objective assessment is biased, often influenced by all sorts of cognitive factors – content bias, context bias, marking bias and so on.

One layer of human bias heavily influences the bias debate itself. Rather than looking at the issue dispassionately, we get the well-identified neophobic biases: confirmation bias, negativity bias, availability bias and so on. There’s also ‘status quo’ bias, a cognitive bias where people prefer the current state of affairs to change, even when change might lead to a better outcome. Familiarity and comfort with the current situation play a significant role in maintaining the status quo, as does the significant cognitive effort needed to reflect, examine the benefits and accept the change. This is common in financial decision-making and, especially, with new technology.

Remember that AI is ‘competence without comprehension’ – competences that can be changed – whereas all humans have cognitive biases that are difficult to change, ‘uneducable’ to use Kahneman’s term from Thinking, Fast and Slow. AI is just maths, software and data. Any bias here is mathematical bias, for which there are precise definitions. It is easy to anthropomorphise the problem by seeing one form of bias as the same as the other. That aside, mathematical bias can be built into algorithms and data sets. What the science of statistics, and therefore AI, does is quantify and try to eliminate such biases. This is therefore, essentially, a design problem.

Arguments assuming ALL AI is biased

You are likely, in your first lesson on algorithms, to be taught some sorting mechanisms (there are many). It is difficult to see how sorting a set of random numbers into ascending order can be either sexist or racist. The point is that most algorithms are benign, doing a mechanical job free from bias. They can improve strength, precision and consistency over time (robots in factories), compress and decompress communications, encrypt data, compute strategies in games (chess, Go, poker and so on), support diagnosis, investigation and treatment in healthcare, and reduce fraud in finance. Most algorithms, embedded in most contexts, are benign and free from bias, or the probability of bias is so small it is irrelevant.
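
To make this concrete, here is a minimal sketch (in Python, my own illustration rather than anything from a particular curriculum) of the kind of sorting routine taught in that first lesson. It compares numbers and nothing else; there is simply no input through which social bias could enter.

```python
def insertion_sort(values):
    """Return the numbers sorted into ascending order.

    The routine compares numbers and nothing else: it has no access
    to, or concept of, who produced them.
    """
    result = list(values)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result

print(insertion_sort([42, 7, 19, 3]))  # [3, 7, 19, 42]
```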

Note that I said ‘most’, not ‘all’. It is not true to say that all algorithms and/or data sets are biased, unless one resorts to the idea that everything is socially constructed and therefore subject to bias. As Popper showed, such an all-embracing theory admits no possible objection, as even the objections are interpreted as being part of the problem. This is, in effect, a sociological dead-end.

Arguments conflating human and statistical bias

AI is not conscious or aware of its purpose. It is just software and, as such, is not ‘biased’ in the way we attribute that word to humans. The biases in humans have evolved over millions of years, with additional cultural input. AI is maths, and we must be careful about anthropomorphising the problem. There is a definition of ‘bias’ in statistics, which is not a pejorative term but is precisely defined: the difference between an estimator’s expected value and the true value of the parameter being estimated. If that difference is zero, the estimator is called unbiased. This is not so much bias as a precise recognition of differentials.
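
A minimal sketch of that statistical sense of ‘bias’, using a textbook example (the code and numbers are my own illustration): the sample variance calculated by dividing by n systematically underestimates the true variance, while Bessel’s correction (dividing by n - 1) removes the bias.

```python
import random

# Statistical bias = E[estimator] - true parameter value.
random.seed(0)
TRUE_VARIANCE = 1.0              # samples drawn from N(0, 1)
n, trials = 5, 200_000

biased_total = unbiased_total = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    biased_total += ss / n          # divide by n: biased estimator
    unbiased_total += ss / (n - 1)  # Bessel's correction: unbiased

print("E[biased estimator]   =", round(biased_total / trials, 3))    # about 0.8
print("E[unbiased estimator] =", round(unbiased_total / trials, 3))  # about 1.0
print("bias of the first     =", round(biased_total / trials - TRUE_VARIANCE, 3))
```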

However, human bias can be translated into statistical or mathematical bias. One must distinguish here between algorithms and data. In algorithms, bias is most likely to be introduced through the weightings and techniques used. Data is where most of the problems arise. One example is poor sampling: too small a sample, under-representation or over-representation of groups. Data collection can also introduce bias through faults in the gathering instruments themselves. Selection bias occurs when data is gathered selectively rather than randomly.
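
A short sketch of the selection-bias point (toy figures of my own, purely illustrative): estimate a population average from a sample gathered selectively rather than randomly, and the estimate drifts away from the truth.

```python
import random
from statistics import mean

random.seed(1)
# A population with a known true mean of about 50.
population = [random.gauss(50, 10) for _ in range(100_000)]

# Random sampling: every member equally likely to be included.
random_sample = random.sample(population, 1000)

# Selective gathering: only those above a threshold get recorded,
# e.g. data collected from volunteers or high performers.
selective_sample = [x for x in population if x > 55][:1000]

print("true mean        =", round(mean(population), 1))        # about 50
print("random sample    =", round(mean(random_sample), 1))     # about 50
print("selective sample =", round(mean(selective_sample), 1))  # about 61: biased
```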

The statistical approach at least recognises these biases and adopts scientific and mathematical methods to try to eliminate them. This is a key point: human bias often goes unchecked, while statistical and mathematical bias is subjected to rigorous checks. That is not to say the checks are flawless, but error rates and techniques for quantifying statistical and mathematical bias have been developed over a long time, precisely to counter human bias. That is the essence of the scientific method.

Guardrails

Guardrails are human interventions, as is post-model training by humans, and both can introduce human bias. Political and ideological bias led to Google withdrawing its image generation tool, and there are many examples of favoured politicians and political views surfacing.

Accusations about race

The most valuable companies in the world are AI companies, in the sense that their core strategic technology is AI. As to the common charge that AI is largely written by white coders, I can only respond by saying that the total number of white AI coders is massively outnumbered by Chinese, Indian and other Asian coders. The CEOs of Microsoft and Alphabet (Google) were both born and educated in India, and the CEOs of the three top Chinese tech companies are Chinese. Having spent some time in Silicon Valley, I found it one of the most diverse working environments I’ve seen in terms of race. We can always do better, but this should, in my view, not be seen as a critical ethical issue.

AI is a global phenomenon, not confined to the Western world. Even in Silicon Valley, the presence of Chinese, Indian and other Asian programmers is so prevalent that they feature in every sitcom on the subject. In addition, the idea that these coders are unconsciously, or worse, consciously creating racist and sexist algorithms is an exaggeration. One has to work quite hard to do this, and to suggest that ALL algorithms are written in this way is another exaggeration. Some may be, but most are not.

Accusations of sexism

Gender is an altogether different issue and a much more intractable problem. There does seem to be bias in the educational system, among parents, teachers and others, that steers girls away from STEM subjects and computer studies. But the idea that all algorithms are gender-biased is naïve. If such bias does arise in software, one can work to eliminate it. Eliminating human gender bias is much more difficult.

True, there is a gender differential, and this will continue, as there are gender differences when it comes to the focused, attention-to-detail coding found in the higher echelons of AI programming. We know that autism has genetic causes, is a constellation (not a spectrum) of cognitive traits, and is weighted towards males (and no, it is not merely a function of underdiagnosis in girls). For this reason alone, there is likely to be a gender difference in high-performance coding teams and among AI innovators for the foreseeable future.

AI and transparency

It is true that some AI is not wholly transparent, especially deep learning using neural networks. However, we shouldn’t throw out the baby with the bathwater… and the bath. We all use Google, and academics use Google Scholar, because they are reliably useful. They are not transparent. The problems arise when AI is used to, say, select or assess students. Here we must ensure that we use systems that are fair. A lot of work is going into technology that interprets other AI software and reveals its inner workings.

A common observation about contemporary AI is that its inner workings are opaque, especially machine learning using neural networks. But compare this to another social good – medicine. We know it works, but we often don’t know how. As Jon Clardy, a professor of biological chemistry and molecular pharmacology at Harvard Medical School, says, the idea that drugs are the result of a clean, logical search for molecules that work is a ‘fairytale’. Many drugs work but we have no idea why. Medicine tends to throw possible solutions at problems, then observe whether they work. Most AI is not like this, but some is. We need to be careful about bias, but in many cases, especially in education, we are more interested in outputs and attainment, which can be measured in relation to social equality and equality of opportunity. We have a far greater chance of tackling these problems using AI than by sticking to good old-fashioned bias in human teaching.

Confusing early releases with final product 

Nass and Reeves, through 35 studies in The Media Equation, showed that the temptation to anthropomorphise technology is always there. We must resist the temptation to think this is anything but bias. When an algorithm, for example, correlates a black face with a gorilla, it is not biased in the human sense of being racist, namely a racist agent. The AI knows nothing of itself; it is just software. Indeed, this is merely an attempt to execute code, and this sort of error is often how machine learning actually learns. Repeated attempts at statistical optimisation lie at the very heart of what AI is. Failure is what makes it tick. The good news is that repeated failure results in improvement in machine learning, reinforcement learning, adversarial techniques and so on. It is often absolutely necessary to learn from mistakes to make progress. We need to applaud failure, not jump on the bias bandwagon.
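
As a minimal sketch of that ‘fail, adjust, repeat’ loop (toy data of my own, not from any particular system): gradient descent measures the error at each step and nudges a weight to reduce it, which is the statistical optimisation at the heart of machine learning.

```python
# Learning from failure: fit y ≈ w * x by repeatedly measuring the
# error and nudging the weight w to reduce it.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]  # roughly y = 2x

w, lr = 0.0, 0.05  # initial weight and learning rate
for step in range(200):
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # move against the gradient: fail, adjust, repeat

loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
print(f"learned w = {w:.2f}, remaining error = {loss:.4f}")  # w close to 2.0
```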

When Google was found in 2015 to be sticking the label ‘gorilla’ on black faces, there was no doubt that it was racist in the sense of causing offence. But rather than someone at Google being racist, or a piece of maths being racist in any intentional sense, this was a systems failure. The problem was spotted and Google responded within the hour. We need to recognise that technology is rarely foolproof; neither are humans. Machines do not have the cognitive checks and balances that humans have on such cultural issues, but they can be changed and improved to avoid them. We need to see this as a process and not block progress on the back of outliers. We need to accept that these are mistakes and learn from them. If mistakes are made, call them out, eliminate the errors and move on.

Conclusion

This is an important issue being clouded by often exaggerated positions. AI is unique, in my view, in having a large number of well-funded entities set up to research and advise on the ethical issues around it. They are doing a good job of surfacing issues and suggesting solutions, and they will influence regulation and policy. But hyperbolic statements based on a few flawed, meme-like cases do not solve the problems that will inevitably arise. Technology is almost always a balance of trade-offs, upsides and downsides; let’s not throw away the opportunities in education on the basis of bad arguments around bias.

This is especially true in the learning game, where data sets tend to be quite small, for example in adaptive learning. It becomes a greater problem when using GenAI models for learning, where the data set is massive. Nevertheless, I think the ability of AI to be blind to gender, race, sexuality and social class is more likely, in learning, to make it less biased than humans. We need to be careful when it comes to making the decisions that humans often make, but at the level of learner engagement, support and other applications, there is lots of low-hanging fruit that need be of little ethical concern.

