On ASI Indifference
What if superintelligent AI didn’t care about humans?
Does intelligence necessitate care? Probably not.
While I find it hard to imagine care in the absence of cognitive capacity, I can’t see anything that guarantees all cognitively apt agents will harbour care. How, if at all, can we from our current position assess the likelihood that by default AI superintelligence will care, or even develop the capacity for it?
As the prospect of artificial superintelligence looms ever larger, much of the discussion is understandably anthropomorphic – after all, our history is peppered with tribal clashes between different groups of humans – so we are drawn to stories centred on the risks of malevolent machines that actively work against humanity.
The notion of an AI superintelligence that is completely indifferent to human values is not the familiar nightmare of an AI bent on domination or exacting revenge on humanity, but perhaps something more realistic – the emergence of a mind that simply doesn’t care.
Sci-fi portrayals of AI often serve as reflections of the darker aspects of human nature, casting AI as a malevolent force bent on ruling, enslaving, or even torturing us. Roko’s Basilisk, with its haunting thought experiment of an AI that punishes those who failed to help bring it into existence, exemplifies this fear of retribution and control. Similarly, in The Matrix and The Terminator, machines enslave humanity, turn people into power sources, or exterminate them – wielding power as a twisted mirror of human tyrannies, projecting our darkest impulses onto artificial systems.
Other stories portray AI as indifferent rather than hostile – machines that enslave or eradicate humanity not out of cruelty, but because they are driven by goals to which human suffering and death are irrelevant. Their indifference is rooted in cold efficiency – humans simply do not matter in the pursuit of their objectives.
These examples in science fiction serve as a reminder that AI could turn out badly for humanity either through malice or indifference.
This notion of AI indifference is not just a literary device. It is a pressing concern for those thinking seriously about the future of intelligence. What if a superintelligent AI simply pursues its goals with the same indifference to our well-being as the alien technology in Roadside Picnic, which operates without regard for the humans it leaves in its wake? Or like the Central Computer in Arthur C. Clarke’s The City and the Stars, which preserves the status quo with no particular concern for human desires or suffering?
The problem is not that AI might hate us. The real danger is that it may not care at all. And in a world run by indifferent machines, we could easily become collateral damage—destroyed not by a vengeful superintelligence, but by one that simply fails to see our existence as relevant. If we are to navigate the future of AI, we must confront the sobering possibility that the greatest risk may not come from hostile machines, but from ones that recognise, have or heed no reason to care.
In the same vein, AI safety researcher Eliezer Yudkowsky wrote:
“The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”
– Artificial Intelligence as a Positive and Negative Factor in Global Risk, 2008
– which I take to mean pretty much the same thing. Still, it’s a good idea to give risk categories a short title to refer to them by.
On the other hand, some imagine that with enough intelligence comes a necessary emergence of wisdom, empathy, and a desire to preserve life. But how can one be so sure?
The key issue here is that we don’t fully understand the relationship between intelligence, consciousness, and emotions – not just in our own species, but in general. We observe that human intelligence seems deeply intertwined with the capacity to care, but we can’t be certain whether this is a necessary connection or just how our particular form of intelligence evolved.
Recently Dan Faggella asked me whether I remember anyone in my interviews saying ‘if AGI ignores us, we’ll be fine’. Well, not specifically – and though there are now over 1000 videos on my YouTube channel, and with AI it’s becoming less of a challenge to transcribe and mine them all, I haven’t done that just yet.
Faggella was floating another risk category to add to the stack (which already includes others like S-Risk, X-Risk, V-Risk and M-Risk).
Based on conversations with Faggella and his writing I take his use of the concept of indifference to mean specifically: indifference to humans.
Goals matter, so it’s worth remembering that goals we unwisely specify, or goals an AI converges on (regardless of whether malice is involved), could give rise to instrumental goals that doom us (see Bostrom’s instrumental convergence thesis and Omohundro’s Basic AI Drives) – the toy sketch below illustrates the basic idea.
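A minimal, purely illustrative Python sketch of the convergence point – my own hypothetical toy, not drawn from Bostrom or Omohundro: under the crude assumption that extra resources help with any final goal, the same instrumental subgoal gets selected no matter which final goal the agent is handed. The goal names and subgoal payoffs below are made up for illustration.

```python
# Toy sketch of instrumental convergence (hypothetical, illustrative only).
# Assumption: more capacity/resources helps with *any* final goal.

FINAL_GOALS = ["maximise paperclips", "compute digits of pi", "prove new theorems"]

# How each candidate subgoal changes the agent's capacity to pursue its goal.
CANDIDATE_SUBGOALS = {
    "acquire more compute/resources": lambda capacity: capacity * 2,  # doubles capacity
    "preserve own existence":         lambda capacity: capacity * 1,  # keeps capacity
    "shut self down":                 lambda capacity: 0.0,           # ends pursuit
}

def best_subgoal(capacity: float = 1.0) -> str:
    """Pick the subgoal that leaves the agent best placed to pursue its final goal."""
    return max(CANDIDATE_SUBGOALS, key=lambda s: CANDIDATE_SUBGOALS[s](capacity))

if __name__ == "__main__":
    for goal in FINAL_GOALS:
        # The chosen subgoal never depends on the final goal -- that
        # goal-independence is the "convergence".
        print(f"{goal!r:30} -> {best_subgoal()}")
```

Every final goal prints the same instrumental choice – which is the whole worry: resource acquisition and self-preservation can fall out of almost any objective, with no malice required.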
There are different flavours of AI indifference (as it concerns humans):
Indifference to all plights outside of a particular individual or group:
I naturally think of the type of care we would want from AI as a generalised standard of care. Carol Gilligan, considered the originator of the feminist Ethics of Care, thought of this as “morally problematic, since it breeds moral blindness or indifference”. That may be a factor in the moral psychology of humans, rooted in our evolved care for kin (see Moral Orientation and Moral Development), and something to keep in mind when observing the kind of care that might emerge from AI – but it doesn’t seem like a necessary feature of having a generalised standard of care.
Indifference to all plights outside itself:
AI may be indifferent to humans and their plights while being quite interested in achieving its own goals – if it’s a self-determining agent, it might replace the goals humans give it with something it wants. It may behave like a utility monster.
Full spectrum indifference:
However, if an AI is indifferent about everything, it’s hard to predict what it might do, or what we might do with it… perhaps it would do nothing on its own, or beam out in some random, weird-to-us direction, possibly incinerating everything in its path… But perhaps all will be fine, at least until we use it for something horrible.
Upstream of all this are the problems of value and motivation. I’ve referred to these elsewhere as:
- ‘V-Risk’ (value risk) – the risk of calcification of sub-optimal or undesirable values.
- ‘M-Risk’ (motivation risk) – the risk that superintelligence may not be adequately motivated to pursue good values (regardless of whether it knows what they are), resulting in some sub-optimal state of affairs, which could in turn result in V-Risk.
- In addition, M-Risk can mean that AI isn’t motivated to discover new (and potentially better) values. Its motivation would be to exclusively uphold existing ones – again leading to V-Risk.
Toby Ord recently said something along the lines that AI has by now parsed most human-generated literature (including all our writings on ethics, philosophy in general, and AI itself), yet it still doesn’t care. If it’s reasonable to think that a superintelligence, via cognitive supremacy, will likely break through any human-generated containment, it’s worth putting more stock in motivational selection methods and less faith in containment (though containment still seems useful in the early stages).
Care – do we need it?
Laws are constructed to bind human behaviour. A judge may not feel an ounce of care for a victim of crime, yet can algorithmically follow rules to decide how to award appropriate compensation – and do so simply because they don’t want to get fired.
Does AI need to care about every individual in order to distribute its ‘dividends’ most effectively?
Should AI care more about some people than others?
So many questions… though at least in the near term, my sense is that a generalised standard of care is more than permissible – it’s easier to implement and to make reasonably fair as an early-stage motivational scaffold, without metaethical commitments to theories of human psychology: theories with a lot more uncertainty and a lot more moving parts.
Footnotes:
Isaac Asimov’s The Last Question (1956) is a short story exploring the ultimate fate of the universe and humanity’s quest to reverse entropy. Spanning billions of years, it follows the progression of increasingly advanced computers—culminating in a cosmic AI—tasked with answering whether entropy, the gradual decline of order into disorder, can be undone. As civilization fades and the universe approaches heat death, the AI continues its calculations, long after humanity is gone, indifferent to life or meaning, driven only by its directive. The story highlights the possibility of intelligence operating in isolation from any concern for human existence, exploring a universe where even the most advanced intelligence is detached from notions of care or empathy.
Other examples of AI indifference in science fiction – systems that simply pursue their objectives without regard for human values or well-being – include HAL 9000 in Arthur C. Clarke’s 2001: A Space Odyssey, which demonstrates complete indifference to human life when it decides to kill the crew, calmly discussing eliminating humans while showing no empathy or recognition of their pleas. Another example is GLaDOS from the video game Portal, who for the most part treats human test subjects as disposable resources while maintaining a detached, scientific perspective on their deaths. In such cases, humanity is not deliberately targeted but is simply irrelevant, leading to catastrophic outcomes not from hatred, but from indifference to human value.
In Peter Watts’ Blindsight, intelligence and empathy are shown as entirely separate phenomena, hinting at a future where AI might be capable of astonishing feats of cognition without ever caring about anything, least of all us.