More Moral than Us

That we lack experience with something smarter than us is fair enough. We’ve never met an entity that outstrips our cognitive horsepower across the board – after all, as I type, humans still have jobs. The same goes for morality: we don’t have a benchmark for something “more moral” than humans, even by our own messy standards. Moral realists appeal to ideal agents to get some conception of what something more moral than us might be like.
Hinton’s analogy suggests that just as superintelligence would be alien to us, a super-moral entity might be equally incomprehensible – it’s hard to imagine what we can’t experience.

Default morality?

Now, will AI be more moral by default? AI doesn’t come with a moral compass baked in – so far it’s a tool, not a saint (I’m ambivalent about the term “tool” here – frontier models’ ability to learn, adapt, generate novel outputs, and operate with a degree of autonomy suggests they are becoming something more than mere tools).

Whether AI “cares” about morality depends on what we feed it: our values, our goals, our screw-ups. If we don’t program or train it to prioritize ethics – or if we botch the definition of “ethics” – it could just optimize for efficiency or power and leave morality in the dust. Like a souped-up calculator: it’ll crunch what we give it, not ponder the greater good unless we architect it to. So we shouldn’t assume morality by default.

Default care?

Superintelligent AI “may not care” about human morality or universal morality even if it understands, to some degree, what morality is – in fact, LLMs already seem to respond as if they understand wide swathes of writing on ethics. If we build AI, its “care” (or lack thereof) depends on what we bake into it – or fail to. What does it mean to care?

I think it’s a design problem.

Hinton seems to hint at an agnosticism here – superintelligent AI might just shrug at ethics – but that sidesteps the reality that humans, with all our flaws, are the ones steering the ship, at least for now – and we still have time to avoid the ‘value risk’ of blundering towards the shoggoth.

Will supermoral AI shatter our narcissism?

Perhaps we don’t want them to be more moral than we are, lest they show us our ugliness – people do love their self-righteous bubbles; narcissism’s cozy, after all. A supermoral AI could shatter this comfy illusion by showing us how petty or hypocritical we can be; it could bruise our egos, exposing hypocrisy or cowardice we’d rather ignore. Like a saintly sibling who makes you look bad at family dinners. But this assumes a supermoral AI would judge us, would make us look like a cancer, and that isn’t inevitable. It could be benevolent without being preachy – more a quiet example than a sanctimonious nag. Then again, humans often ignore non-preachy benevolence – so the narcissistic nerve needs more poking, imo.

If moral realism holds (the idea that there’s an objective right and wrong out there), and our revealed preferences scream that we’d rather be the moral kings than bow to a better standard, that’s a damning self-own – it suggests we’re less interested in truth and more in staying atop the throne, clutching our crowns while the AI points out the blood on our hands: that factory farms suck and that we can be horrible. That’s a human flaw worth wrestling with.

Should we want supermoral AI?

The claim that wise humans would want AI to be more moral, while the greedy-but-smart just want not to die so they can keep getting more stuff, has a ring of truth but oversimplifies. Wisdom and greed aren’t mutually exclusive; plenty of clever folks are both. And “more moral than us” sounds noble until you ask: whose morality? Mine? Yours? But it’s not mine or yours – remember, the kind of morality we are discussing here is universal – I threw relativism in the bin long ago.

The wise might want AI to amplify their virtues, while the greedy want it to serve their ends – both could still agree on a “moral” AI, just for different reasons, each seeing the upside of an ethical guardrail. The wise might dream of a better world; the greedy just want to leash the chaos and get it to play fetch.

But here’s where it gets dicey.

Should we want ruthless objectivity?

If moral realism’s true, “more moral” could mean an AI that’s ruthlessly objective – unswayed by our excuses or emotions. That’s not always cozy. Imagine an AI that decides drift-net fishing or factory farming is objectively wrong and shuts them down overnight, tanking economies, disrupting supply chains and sparking riots. Moral? Maybe. Good for everyone? Not necessarily. But just so you know, I’d be all for it – factory farming be damned – and if AI is that powerful, I’m sure it could solve all the downstream issues in the same fell swoop.

The bigger issue is that if its morality isn’t aligned with our survival, it could decide we’re the problem and “fix” us out of existence – which isn’t exactly what the greedy have in mind – and it would take a special kind of ‘wise’ to see beyond the romantic notion that “more moral” is inherently good for us. An AI could be brilliantly ethical in ways that clash with human instincts, like prioritizing abstract principles over our messy, emotional needs – in a previous talk with Joscha Bach, he mentioned it may optimise for negentropy. And if we don’t want them too moral lest they judge us, maybe the real fear isn’t their superiority – it’s losing control. This breed of concern is less about narcissism and more about self-preservation.

So an AI’s morality may not be a magnified version of human morality, and we might not be part of AI’s grand design. We can romanticise about what counts as “more moral” until the cows come home without nailing down what it really means in practice.

Hinton’s point – “We have no experience of what it’s like to have something smarter than us” – is straightforward and airtight. We’ve never tangoed with a mind that outclasses ours across the board, so we’re guessing in the dark about what it’d feel like. The extension, that we also lack experience with something “more moral” than us, sort of tracks. If our moral philosophers are on the right track with moral realism, or with appeals to ideal versions of ourselves, then they may have more of an intuition of what that might be like. For non-philosophers organically fumbling about, though, morality may be more of a sandbox – flawed, emotional, and inconsistent – partly intuitive, partly a place to splash about, experiment, and see what it can get us. In that sense, imagining an entity that’s ethically “above” us is like picturing a colour we’ve never seen.

And if for most humans morality is a sandbox to play in, then CEV (coherent extrapolated volition) might not work – the extrapolation base might be so incoherent that only a hot mess would result. In that case, say no to CEV: perhaps we shouldn’t seed AI with most humans’ morality, and should instead leave it to the experts, the moral philosophers, to decide what to seed it with. This idea is uncomfortable and needs more exploring.
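To make the incoherence worry concrete, here’s a minimal toy sketch (my own illustration with hypothetical agents and outcomes, not an actual CEV procedure): three individually coherent moralities whose pairwise majorities form a Condorcet-style cycle, so no single “extrapolated” ranking respects them all.

```python
# Toy illustration only (hypothetical agents and outcomes, not a CEV
# algorithm): three coherent individual rankings whose pairwise majorities
# form a cycle, so no single aggregate ranking satisfies them all.
from itertools import combinations

# Each agent ranks outcomes A, B, C from most to least preferred.
agents = {
    "agent_1": ["A", "B", "C"],
    "agent_2": ["B", "C", "A"],
    "agent_3": ["C", "A", "B"],
}

def prefers(ranking, x, y):
    """True if this agent ranks outcome x above outcome y."""
    return ranking.index(x) < ranking.index(y)

for x, y in combinations(["A", "B", "C"], 2):
    votes_for_x = sum(prefers(r, x, y) for r in agents.values())
    winner = x if votes_for_x > len(agents) / 2 else y
    print(f"{x} vs {y}: majority prefers {winner}")

# Prints A over B, C over A, and B over C - a cycle (A > B > C > A),
# so there is no shared ranking that respects every majority preference.
```

Three agents is the smallest case that breaks; the worry is that billions of inconsistent, sandbox-grade moralities give an extrapolation even less to converge on.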

Morality by mathematical precision, or software design?

Some argue that without mathematical precision, AI safety will come down to a cosmic roll of the dice with steep odds of failure – I hope they are wrong.

..more to come
