Superalignment and Indirect Normativity
In 2023, OpenAI introduced its idea of superalignment. In some ways it resembles Paul Christiano's Iterated Distillation and Amplification (IDA): “Our goal is to build a roughly human-level automated alignment researcher. We can then use vast amounts of compute to scale our efforts, and iteratively align superintelligence.” But superalignment failed to materialise…