Can AI Be More Moral Than Humans? DeepMind’s Co-Founder Thinks So.
DeepMind has been thinking about AI ethics, specifically about the idea that AI could become more moral than us1. I think this is a good thing. Deepmind recently (2025/12/12) released a video featuring co-founder Shane Legg discussing these things.
At timepoint 19:15 Hannah Fry asks Shane about how ethics comes into all this. Shane Legg discusses whether AI can understand ethics, take robustly safe actions based on this actions in a way that we can trust. He discusses how chain of thought (CoT) reasoning is observable.2 How instincts and reasoned analysis can diverge.
“If we can make that reasoning really really tight, and it has a really strong understanding of some ethics and morals that we want it to adhere to, I think it should in principle actually become more ethical than people, because it can more consistently apply and reason at, maybe superhuman level the choices that it is faced with and so on.”
Shane Legg – The arrival of AGI
My monkey brain feels somewhat vindicated in that the co-founder of arguably the most powerful AI company on the planet has just come out arguing that AI could become more moral than humans – and that we should steer superintelligence to become super ethical. This topic has been a main focus of this blog for some time.
Shane discusses the need to make Superintelligence super ethical AI at time 36:27 – 37:31 – the thrust of it is, as AI will surpass human capability, and as it becomes a better at reasoning – we need to focus on what he refers to as ‘system two safety’3. Assuming that because of competitive dynamics (globally)4 and other factors, we can’t stop the development of Superintelligence – then we need to think hard about how to make Superintelligence ethical – in a way that as AI scales in capability, we can harness this not to just achieve certain goals, but to have it apply to ethics as well – so that we can have AI ethical capability scales along with AIs general capability.
- Something I’ve considered a worthy topic for a long time – I’ve written about it in many blog posts. ↩︎
- CoT (Chain of Thought) reasoning is unlike human “gut instinct” (which is a black box), CoT reasoning is printed out in text. We can actually audit the AI’s moral reasoning to see if it’s valid. This is a huge safety feature. Note that we should be careful in assuming the rendered CoT reasoning text may not faithfully represent what the AI is actually thinking. ↩︎
- Shane Legg explicitly references Daniel Kahneman’s ‘Thinking, Fast and Slow‘ – System 1 is our fast, instinctive, emotional brain (often prone to bias), while System 2 is slower, deliberative, logical reasoning. It explains why AI could be better. Humans often react with System 1 (anger, bias, fear). An AI forced to use “System 2” (Chain of Thought) for ethical decisions would technically be “thinking” more carefully than a human reacting in the moment. ↩︎
- “AI doesn’t need a moustache-twirling villain to go wrong – it just needs the wrong metaethics in an unforgiving game.” – see ‘AI Ethics in the Shadow of Moloch: Why Metaethical Foundations Matter‘ ↩︎
