Superalignment and Indirect Normativity

Adam Ford 2025-10-07

In 2023, OpenAI introduced its idea of superalignment. In some ways it seems similar to Paul Christiano's Iterated Distillation and Amplification:
“Our goal is to build a roughly human-level automated alignment researcher. We can then use vast amounts of compute to scale our efforts, and iteratively align superintelligence.”

MIRI's Eliezer Yudkowsky and Nate Soares don’t think superalignment will work and, for the same reasons, probably don’t think indirect normativity will work either.

In their book ‘If Anyone Builds It, Everyone Dies’ they say:

“In the case of weak superalignment: We agree that a relatively unintelligent AI could help with “interpretability research,” as it’s called. But learning to read some of an AI’s mind is not a plan for aligning it, any more than learning what’s going on inside atoms is a plan for making a nuclear reactor that doesn’t melt down.
We consider interpretability researchers to be heroes, and do not mean to degrade their work when we say: It’s not a good sign, when you ask an engineer what their safety plan is, and they start telling you about their plans to build the tools that will give them a better window into what the heck is going on inside the device they’re trying to control.
And even if the tools existed, being able to see problems is not the same as being able to fix them. The ability to read some of an AI’s thoughts, and see that it’s plotting to escape, is not the same as the ability to make a new AI that doesn’t want to escape. That might not be possible without a full solution to the alignment problem: Insofar as the AI has weird alien preferences, escape is in fact the course of action that best fulfills its objectives. Attempts to escape are not a weird personality quirk that an engineer could rip out if only they could see what was going on inside; they’re generated by the same dispositions and capabilities that the AI uses to reason, to uncover truths about the world, to succeed in its pursuits.”
