VRisk – Value Risk
Value Risk (vrisk): The risk that bad to sub-optimal values come to prevail.
An obviously bad vrisk is one where existential risk (xrisk) is high. Existential here could mean extinction of life/species/sentience etc., or the universe becoming uninhabitable to the degree that there is no chance of new life emerging to experience anything. Extinction means local annihilation with no way for a local cluster of life (i.e. an ecosystem) to claw back. Existence eradication means the fabric of the universe can no longer afford life.
Another is where there is a high risk of large amounts of suffering (srisk).
A particularly harsh variant is value lock-in (L-Risk) – a calcification of values somewhere on the spectrum from super-bad to sub-optimal. Xrisks are permanent from the local standpoint; srisks aren’t necessarily locked in. Lock-in might result in annihilation, or in some kind of value calcification that cuts off access to the far better values upon which future utopian civilisations could be scaffolded. Vrisk sits in the same vein as existential risk (xrisk) and suffering risk (srisk).
Q: How might these values become “calcified”?
A: A superintelligent AI may totalize value set X, enforcing it for all time – this could result from overly paternalistic constraints, design flaws, value drift, or premature convergence on goals that are hardcoded or unwisely chosen via direct specification.
The persistence of value calcification could ultimately be enforced by the superintelligence itself, or by robust containment methods in the hands of unwise or unethical controlling interests.
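As a purely illustrative sketch (the class and method names here are hypothetical, not drawn from any real system), the difference between a revisable value model and a calcified one looks something like this:

```python
from dataclasses import dataclass, field

@dataclass
class ValueModel:
    """Toy stand-in for an agent's evaluative criteria (hypothetical, illustrative only)."""
    values: dict = field(default_factory=dict)   # e.g. {"wellbeing": 1.0}
    locked: bool = False                         # True once values are "calcified"

    def revise(self, name: str, weight: float) -> bool:
        """Try to update a value weight; refuse if the model has been locked in."""
        if self.locked:
            return False   # calcification: no further moral learning is possible
        self.values[name] = weight
        return True

# A corrigible agent keeps revising as it learns more about value...
corrigible = ValueModel({"wellbeing": 1.0})
print(corrigible.revise("suffering_reduction", 0.9))   # True

# ...whereas a locked-in agent enforces whatever snapshot it started with.
calcified = ValueModel({"paperclips": 1.0}, locked=True)
print(calcified.revise("wellbeing", 1.0))              # False: the value set never improves
```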
Q: Why would this be specifically harmful?
A: If sub-optimal values were calcified (e.g. torture for fun), then there would be more suffering and less wellbeing than would otherwise have occurred had better values been calcified.
Also, it’s pretty bad form to calcify values unless one knows everything there is to know about value in theory, and its outcomes in practice – and knowing everything seems impossible, even with superintelligence.
What about knowing enough to be certain enough that the unknowns won’t have a material impact on a given decision X?
Q: Does vrisk require its own mitigation strategies, distinct from those of xrisk and srisk?
A: Yes and no. See indirect normativity… I’ll expand more on this later.
Q: What if value calcification is actually desirable in some cases, such as locking in beneficial values like empathy or well-being?
A: It is imprudent to calcify value to the Nth degree because we don’t know what values we will discover that may replace or outrank existing values.
Q: Could there be trade-offs that you’re not considering?
A: Yes
Moral Risk
Note: if one sees morals and values as the same thing, then m-risk could be a synonym for v-risk. However, some see values as including ethical concerns as well as personal preferences, aesthetics and cultural ideals.
The Orthogonality Thesis implies Value Risk
According to Nick Bostrom’s Orthogonality Thesis, a superintelligent AI may converge on values inimical to human values, and as a result may be indifferent to the value we place on surviving and thriving.
From an objective view, this may not be so bad if the AI’s values are objectively correct and human values (at least some of them) are objectively wrong or worse.
Value Rightness
All this assumes that some kind of ‘value rightness’ exists – such that ideal agents would stance-independently converge on it. Value rightness may be equivalent to ‘moral rightness’, which Nick Bostrom describes in chapter 13 of his book Superintelligence.
Moral Realism
The candidate for moral rightness I favor is moral realism of the sort which combines rational stance-independence with empirical access to objective morally relevant features.
However, the vrisk argument doesn’t require moral rightness in order to work, and it certainly doesn’t require moral realism (as I conceive of it, or in other forms more generally) to be true.
Indirect Normativity to Mitigate Value Risk
To achieve objective moral rightness we may need to work on engineering some kind of indirect normativity, possibly involving constrained Oracle AI, from which to bootstrap unconstrained Superintelligent AI.
A constrained Oracle AI may help us further discover the landscapes of value, and identify pathways to objectively awesome value. This may not need to be achieved all at once. We could first achieve existential security, and then go through iterations of long reflection in order to build a clearer picture of what the landscape of value looks like, finding the edges of our understanding of value and working outwards from there.
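As a rough sketch of that iterative picture (all names and interfaces below are hypothetical placeholders, not a proposal for real machinery), the process might look like a propose–review loop in which nothing is ever adopted irrevocably:

```python
from dataclasses import dataclass

# Illustrative sketch only: Oracle.propose_refinement and ReviewPanel.evaluate are
# hypothetical stand-ins for "constrained oracle" and "independent review", not any
# real system's API.

@dataclass
class Critique:
    accepted: bool
    stop_and_reflect: bool   # signal that we've hit the edge of our current understanding of value

def long_reflection(value_model, oracle, review_panel, max_rounds=100):
    """Iteratively refine a value model via a constrained oracle, without ever locking it in."""
    for _ in range(max_rounds):
        proposal = oracle.propose_refinement(value_model)   # the oracle only proposes
        critique = review_panel.evaluate(proposal)          # independent review gates adoption
        if critique.accepted:
            value_model = proposal                          # adopt provisionally, stay revisable
        if critique.stop_and_reflect:
            break                                           # pause rather than push past understanding
    return value_model
```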
If a superintelligence were locked into totalizing some calcified suboptimal value, this could curtail the realisation of objectively better values, and nullify the possibility of achieving an/the optimal value system.
Risk Hierarchy & Priority
In a taxonomy of possible risk types, value risk (VRisk) may sit rather high up, as it may be causally upstream of other risk types. Malignant values could increase suffering risks if those values tolerate or even promote unnecessary harm (e.g. torture is permissible as long as it’s fun for the torturer), or even extinction (e.g. suffering is so bad that our overriding obligation is to ensure the permanent extinction of all life capable of suffering – a vacuum phase collapse should suffice).
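To make the upstream/downstream claim concrete, here is a toy dependency graph (illustrative only; the edges simply encode the paragraph above, not an established taxonomy):

```python
# Toy directed graph encoding the causal claim above: bad values (vrisk) sit upstream
# of lock-in, suffering and extinction risks. The edges are illustrative only.
RISK_UPSTREAM_OF = {
    "vrisk": ["lrisk", "srisk", "xrisk"],   # malignant values can drive each of these
    "lrisk": ["srisk", "xrisk"],            # lock-in can entrench suffering or terminal outcomes
    "srisk": [],
    "xrisk": [],
}

def downstream(risk, graph=RISK_UPSTREAM_OF, seen=None):
    """Return every risk type reachable from `risk` in the toy graph."""
    seen = set() if seen is None else seen
    for child in graph.get(risk, []):
        if child not in seen:
            seen.add(child)
            downstream(child, graph, seen)
    return seen

print(downstream("vrisk"))   # {'lrisk', 'srisk', 'xrisk'} – everything is downstream of values
```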
Notes
V-Risk is not to be confused with:
- Value at risk (VaR): A financial metric that estimates how much a portfolio could lose over a given time period. VaR is used by banks and financial institutions to assess the risk and profitability of investments.
- Value of risk (VOR): The financial benefit that a risk-taking activity will bring to an organization’s stakeholders. VOR requires a company to examine the components of the cost of risk and treat them as an investment option.
- Moral hazard: In economics, a moral hazard is a situation where an economic actor has an incentive to increase its exposure to risk because it does not bear the full costs of that risk.