AI bias — a good or bad thing?
Bias is neither good nor bad without reference to ethics, values and culture
Fears of social problems accompany any new technology. Nor are the feared problems always new: technology can reinstate old problems, or perpetuate existing ones.
This is certainly the case for concerns about computer-based decision making – popularly and loosely termed “algorithms”. Computers are playing an increasing role in our lives, fuelled in part by recent advances in AI. Clearly, there can be tragic consequences when computer decision-making goes awry. The recent crashes of Boeing 737 MAX aircraft have been attributed to such problems, and to complex interactions between computer and human decision making. So, it makes sense that the quality of computer decision-making receives wider attention.
Another concern is that the conscious or unconscious biases of algorithm creators may be baked into computer decision-making, to the detriment of identifiable groups of people. This post explores that concern. But first, let me cover some background.
What is an algorithm?
“Algorithm” means a specified process or set of rules to be followed to achieve an outcome or solve a class of problems. While the word dates back to the 17th century, its use only picked up with the rise of electronic computers from the 1950s.
Indeed, Algol, short for “algorithmic language”, was the name of an early computer language. I used it briefly in the late 70s and early 80s, but can’t say that I remember it fondly.
There is nothing about an algorithm that requires a computer. Knitting patterns and checklists are both algorithms. As are laws and regulations — the “computer” that interprets them is the judiciary. Indeed, laws, rules and processes are the lifeblood of government and a civilised society, and we should have no reason to fear them in principle.
Computer decision making: algorithms, test data & human checking
Traditional computer programming — still by far the most common variety — relies on a human to pre-specify what the computer is to do with every combination of data it might encounter. That pre-specification is the algorithm, which in many cases equates to a computer “program”.
But how does one know if a program is doing what its author intended it to do? The answer is testing. This involves creating or obtaining test data, feeding it into the program and checking that it produces the desired output. The selection of test data and the quality of checking clearly matter! And both selection and checking rely on humans.
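To make this concrete, here is a minimal, hypothetical sketch in Python. The rule being implemented, the test cases and the expected outputs are all invented for illustration; the point is that a human chose both the rule and the data used to check it.

```python
# A hypothetical pre-specified algorithm: approve an overdraft request
# if the amount is within 20% of the customer's monthly income.
def approve_overdraft(amount: float, monthly_income: float) -> bool:
    return amount <= 0.2 * monthly_income

# Human-chosen test data, with the outputs the author expects.
# Whether these cases cover the inputs seen in production is itself
# a human judgement, and a potential source of failure.
test_cases = [
    (500, 5000, True),    # well within the limit
    (1200, 5000, False),  # just above the limit
    (0, 0, True),         # boundary case: zero income, zero request
]

for amount, income, expected in test_cases:
    result = approve_overdraft(amount, income)
    assert result == expected, f"Unexpected decision for {amount=}, {income=}"
print("All test cases passed")
```

Note that a clean test run like this says nothing about inputs outside the tested range; that gap is exactly where the next kind of failure arises.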
A program can fail — in the sense of not achieving its desired goal — if it encounters data outside the range on which it was tested. Such a testing failure was behind the 1996 explosion of an Ariane 5 rocket, at a cost of US$370m.
AI & the importance of training data
While AI has a long history, recent advances have made the term largely synonymous with machine learning (ML). For ML, data matters more than pre-specified instructions. To create an ML decision-making system, one needs training data, test data and general-purpose programs. Essentially, the training data is drip-fed into general-purpose programs, creating a new dataset that has “learnt” from the training data. The “learnt” dataset, combined with further general-purpose programs, constitutes an “algorithm” that can then make decisions about previously unseen data. Because ML systems evolve to fit the data used to train them, they tend to perform poorly if they encounter data outside the range of the data they were trained on.
ML algorithms are something of a black box, making test data selection and human checking all the more important.
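A minimal sketch of that train-then-test workflow, assuming Python and the widely used scikit-learn library; the features, synthetic data and choice of model are invented for illustration rather than taken from the post.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical training data: two features per applicant
# (say, income and years in current job) and a past yes/no decision.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # invented historical rule

# Hold some data back for testing the learnt model.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The general-purpose program is the learning algorithm;
# fit() produces the "learnt" dataset -- the model's parameters.
model = LogisticRegression().fit(X_train, y_train)

# The learnt parameters plus predict() form the decision-making algorithm.
print("accuracy on held-back test data:", model.score(X_test, y_test))
print("decision for an unseen applicant:", model.predict([[0.4, -1.0]]))
```

The model will do well on applicants who resemble the training data, and less well on those who do not, which is the earlier point about data outside the training range.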
Computer vs. human decision makers
Computers are designed to follow instructions faithfully and reliably. This means that an algorithm presented with the same data will always produce the same decision.1 A computer decision will be no more – and no less – biased than the algorithm it follows.
Human decisions, on the other hand, are less consistent. Humans can and do take additional data into account. This is both an advantage and a disadvantage. Humans are better at dealing with unusual or unexpected circumstances, but they have subtle and not-so-subtle biases that might adversely affect their decisions.
Computer decisions are potentially more transparent. It is possible (at least in theory) to audit a computer algorithm to see what data did and did not affect a decision. But there’s no way to get inside the head of a human decision maker and fully understand their biases. For this reason, courts in some places are experimenting with computer tools to inform judges’ decisions, or to identify decisions that might be tainted by unconscious bias.
Computer decision bias can arise from many sources
Computer decision bias can arise for many reasons: deliberate, accidental or subtle. Here’s a selection of reasons, by no means comprehensive.
Deliberate bias. New Zealand law, for example, distinguishes between people based on age, sex, marital status, ethnicity and citizenship. An algorithm that followed the law and discriminated between people on those characteristics would be legally correct — whether or not that bias was ethically desirable.
Accidental bias. People make genuine mistakes in the specification, implementation and testing of algorithms. Transparency and improved testing can reduce but not eliminate such mistakes. Genuine mistakes are just as likely to favour as disfavour members of specific groups; however, those disfavouring the interests of the implementers and testers are perhaps more likely to be noticed and repaired earlier.
Inadequate or non-representative training and testing data. A face-recognition algorithm trained solely on New Zealand faces, for example, might perform poorly in China – and vice versa. The world’s AI research and commercialisation is increasingly centred in a small number of cities in China and the US. Algorithms developed in those places may be over-trained on locally representative data and under-trained for other parts of the world.
Historical bias. Let’s say that a bank uses data on who received a housing loan in the past to train a new algorithm approving loan applications. If the previous criteria — explicit or hidden — were biased against, for example, the self-employed, then the new algorithm will perpetuate that bias. Perpetuation is, of course, no worse than the current situation. But it is a forgone opportunity for improvement.
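As an illustration of how that perpetuation happens, the following hypothetical sketch (continuing the Python/scikit-learn style used above) trains a model on invented past decisions in which self-employed applicants needed a much higher income to be approved. The learnt model reproduces that penalty for new applicants with identical incomes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented historical loan decisions: columns are [income in $000s, self_employed].
# The hidden past rule: self-employed applicants needed twice the income.
rng = np.random.default_rng(1)
income = rng.uniform(30, 150, size=500)
self_employed = rng.integers(0, 2, size=500)
approved = (income > np.where(self_employed == 1, 120, 60)).astype(int)

X = np.column_stack([income, self_employed])
model = LogisticRegression(max_iter=1000).fit(X, approved)

# Two new applicants with the same income, differing only in employment type.
salaried, self_emp = [[90, 0]], [[90, 1]]
print("salaried applicant approved?     ", model.predict(salaried)[0] == 1)
print("self-employed applicant approved?", model.predict(self_emp)[0] == 1)
```

Nothing in the training step knows whether the old rule was fair; the model simply learns to reproduce whatever pattern the historical data contains.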
Unexpected bias. Bias can arise even in systems actively designed to exclude it. For example, economist Catherine Tucker described an algorithm-driven online job advertising campaign designed to be gender neutral. Unexpectedly, it ended up showing more ads to men than to women. The underlying cause was that advertisers are willing to pay more for women’s eyeballs than men’s, and the algorithm was trying to make its budget go further.
For advertisers, women are often more profitable — they control much household expenditure. The price of advertising to a woman is set in the consumer goods market, not the jobs market. Women are less likely to see an intended gender-neutral job ad because consumer product ads crowd them out.2
Context-dependent bias. Try an image search for “CEO NZ”. Most of the resulting images are of middle-aged and older males.3 This might be an unbiased depiction of reality. But arguably it is socially biased, as it might discourage young people and women from aspiring to become CEOs. There is a tension here between reality and a desirable future. Should search results be biased away from reality? Who gets to decide?
Cultural bias. The promise, as yet unrealised, of driverless vehicles has created a new class of ethical conundrums. One is this: what should a driverless car do if it encounters pedestrians and has only two feasible options, one of which would harm an old person and the other a child? An AI designer, anticipating this conundrum, can bias the decision one way or the other. But what bias should they choose? Or should making that choice be someone else’s responsibility?
Some argue that the ethical answer varies by culture, as different cultures place different relative weights on the value of older and younger people to society. Should culture and country coincide, the answer is straightforward. But what is the best decision in a country with multiple cultures? Should the decision be the responsibility of the owner, the manufacturer, safety regulators, or Parliament? There are no easy answers to these questions.
Human cognitive biases. Stereotyping (ie, expecting a member of a group to have certain characteristics without having actual information about that individual) and ingroup bias (ie, the tendency for people to give preferential treatment to others they perceive to be members of their own groups) receive much attention. But there are plenty of others to worry about! See Wikipedia for an exhaustive list of human biases.
Quick & slow fixes
Many are concerned that the creation of algorithms, and AI in particular, is concentrated in a few countries and may be overly influenced by the dominant cultures of those countries. Others note the relative under-representation of women and minorities in algorithm creation. Addressing these factors might deal with some of the causes of bias discussed above. However, doing so will take a long time and will, in my view, be insufficient to address the problem. For example, finding and addressing the biases listed above requires diversity of skills and the deep application of those skills, more than it does diversity of culture.
As computer decisions become more important to our individual and collective lives, so will transparency, audit and appeal rights. I think it is the public and private institutions providing these, and the incentives for transparent testing and reporting, that will prove most effective in separating “good” bias from “bad”, and in discouraging the latter.
The 2020 Algorithm Charter for Aotearoa New Zealand attempts to deal with these issues in the context of algorithms used by government agencies. The 2021 review of the charter echoes many of the themes in this post, including:
confusion as to what should be considered an algorithm;
fragmented and incomplete public reporting of algorithm use;
a shortage of expertise, including in the measurement of bias; and
limited capacity and capability within and outside government.
Bias is all but inevitable
Algorithms, whether or not they involve computers and AI, have the potential to be biased. Indeed, bias is all but inevitable. However:
Algorithmic bias can arise by purpose or by accident, with ethically sound or dubious intent, and have positive or negative consequences.
Algorithmic bias is neither good nor bad without reference to ethics, values and culture.
Perpetuation of a historical bias by a computer is no worse than the current situation. But it is a forgone opportunity for improvement.
As computer decisions become more important to our individual and collective lives, so will transparency, audit and appeal rights.
By Dave Heatley
For a bit of fun, check out Can you make AI fairer than a judge? Play our courtroom algorithm game. It’s a neat demonstration of the complexities of algorithmic bias.
Adapted from Biased algorithms – a good or bad thing? FutureWorkNZ blog, New Zealand Productivity Commission. 2 October 2019.
Chatbots such as ChatGPT introduce randomness into their responses, to make them appear more human-like. Such randomness — technically referred to as “temperature” — can be turned off to make outputs consistent for testing purposes.
Tucker, C. E. (2018). Privacy, Algorithms and Artificial Intelligence. In The Economics of Artificial Intelligence: An Agenda.
Then again, try a Google image search for “prime ministers of nz”. The results are overwhelmingly female.
Thanks Dave - interesting read! This paper, 'Algorithmic Fairness and Economics', also co-authored by Tucker, is interesting: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3361280 ... Section 5.1 'Benchmarking vs Human Judgment' suggests algorithms are generally less biased than human decision makers, although, as the paper says, that's probably not a good benchmark.