Early Philosophical Groundwork for Indirect Normativity

Note: this is by no means exhaustive. Its focus is on Western philosophy. I’ll expand and categorise more later… The earliest precursors to indirect normativity can be traced back to early philosophical discussions on how to ground moral decision-making in processes or frameworks rather than specific, static directives. While Nick Bostrom’s work on indirect normativity…

Understanding V-Risk: Navigating the Complex Landscape of Value in AI

In this post I explore what I broadly define as V-Risk (Value Risk), which I think is a critical and underrepresented concept in general, but especially for the alignment of artificial intelligence. There are two main areas of AI alignment: capability control and motivation selection. Values are what motivates approaches to fulfilling…

Will Superintelligence solely be Motivated by Brute Self-Interest?

Is self-interest by necessity the default motivation for all agents? Will Superintelligence necessarily become a narcissistic utility monster? No, self-interest is not necessarily the default motivation of all agents. While self-interest is a common and often foundational drive in many agents (biological or artificial), other motivations can and do arise, either naturally or through design, depending…

Reverse Wireheading

Concerning sentient AI, we would like to avoid unnecessary suffering in artificial systems. It’s hard for biological systems like humans to turn off suffering without appropriate pharmacology, e.g. anything from aspirin to anesthetics. AI may be able to self-administer painkillers, a kind of wireheading in reverse. Similarly to wireheading in AI systems…

Understanding the moral status of digital minds requires a mature understanding of sentience

Turning off all AI won’t happen in the real world. If we understand the signatures of sentience, we’ll be in a better position to know what to do to circumstantially prevent/mitigate it or encourage it. AI, especially LLMs, ‘claiming sentience’ isn’t enough… We need deep operational understandings of it. See the article by 80k…

VRisk – Value Risk

Value Risk (vrisk): The risk of bad to sub-optimal values happening. An obviously bad vrisk is one where existential risk (xrisk) is high. Existential here could mean extinction of life/species/sentience etc., or the universe becoming uninhabitable to the degree where there is no chance of new life emerging to experience anything. Extinction means local annihilation…

Indirect Normativity

Alignment Challenges: Given the critical role of ethics in AI safety, it’s deeply concerning to see such significant disagreement among experts who have rigorously studied ethics. The divergence in moral and meta-ethical perspectives among these experts poses a serious question: How can we effectively align AI if the very foundations of ethical understanding are not…