Let me know what you think! I'm still figuring out how to summarize my research program and instill confidence I’ll be a “productive” researcher.
Throughout my academic career, I’ve been worried about value capture. Here’s how C. Thi Nguyen describes it:
1. Our values are, at first, rich and subtle. 2. We encounter simplified (often quantified) versions of those values. 3. Those simplified versions take the place of our richer values in our reasoning and motivation. 4. Our lives get worse. (Games: Agency as Art, 201)
In value capture, we become so enamored with simplified representations of our values that our deeper values suffer. If I start to care too much about maintaining a target bodyweight, I will systematically prioritize that number over my own physical and mental health.
Initially, the dangers of value capture were most apparent to me in the domain of egalitarian respect for persons:
In “What’s the Appropriate Target of Allocative Justification?” Zara Anwarzai and I suggest that trying to maximize the simplified metric of “Quality-Adjusted Life Years” overlooks our deeper value of showing respect for patients.
In “How to Read a Riot,” I argue that over-reliance on the simplified metric of “concrete policy changes achieved” overlooks the deeper expressive value of political riots: the ability to demand urgent recognition of systematic oppression.
And in my dissertation, I clarify how the asshole prioritizes the simplified metric of “what’s in my self-interest” at the expense of what’s good for his well-being, a much deeper and more interconnected notion.
More recently, I’ve noticed that problems of value capture deeply infect AI ethics. Virtually the whole AI safety literature rests on a remarkably impoverished notion of what well-being consists in, and therefore, what might count as potential benefits or harms. Worse, when we work hard to maximize these simplified markers of well-being, we end up pitting shallow representations of well-being against well-being itself.
I’m working on a series of papers and a couple of books to start unpicking this problem from a variety of angles:
1. “The Objective Short List”
Objective List Theorists hold that some things are objectively good or bad for you, whether you want them or not. But how many kinds of things are on The Objective List? If we say 0, we’re simply nihilists about well-being. If we say 1, we’re monists, and our position looks suspiciously like the supposedly distinct Hedonistic Theory (according to which only happiness matters). If we say 2, we’re basically John Stuart Mill with higher and lower pleasures. But methodologically, why wouldn’t you end up with a zillion objective goods? Would there even be countably many? At stake here are deep questions about the incommensurability of our values. Nguyen frames the problem of value capture by noting that our values are better because they’re rich and inchoate, but there’s much more to say about why this is so.
I argue that proper epistemic humility about our competence in value theory should lead us to be precautionary pluralists about value. We’re very limited creatures in a very confusing world, and in the West, we’ve only just started to think about ethics from a more secular and inclusive perspective. So we don’t know our values well enough yet to be confident that they’re all, secretly, of the same kind. Instead, we should begin with the more open-minded assumption that there are many different kinds of incommensurable value.
2. “Superintelligence and Suicide”
In the Grasshopper’s Utopia, no one has to work: “the Computer in Charge or…God or whatever” attends to all our needs. (The Grasshopper: Games, Life and Utopia, 174) Now that there’s nothing you have to do, the Grasshopper argues you’d spend all your time playing games—that is, voluntarily attempting to overcome unnecessary obstacles. (41) Craving meaningful work, some disgruntled Utopians destroy the Computer in Charge so they will once again have to work in order to survive.
Utopia’s one thing, but here in the real world, we face a human alignment problem: Our values will have to change in response to AI. For example, how we value work and play will have to change as AI grows more effective. Right now our sense of purpose, our identities, and our livelihoods are tied up in our work. How will we respond as AI begins to render more and more of us structurally unemployable? Fortunately, I think resources for beginning to rethink the value of work and play are close at hand. All obstacles are, in some sense, voluntary, given the fact that we can kill ourselves at any time. Every obstacle in your life is, against this backdrop, taken on unnecessarily. (Even if suicide is morally forbidden, it’s no less practically possible.) And yet, this practical sense in which life is just a game we’re playing in no way undercuts the value of our lives. So the value of work and play should not be understood in terms of their practical necessity or lack thereof, but in terms of their contributions to flourishing.
3. “Should We Expect a Wisdom Explosion?”
It’s broadly accepted in the AI literature that we should expect an intelligence explosion, that is, a point at which AI becomes so effective at improving itself that it becomes hypereffective within a short period of time. Intelligence, in this conversation, is understood as effectiveness in achieving a wide variety of goals. Quite distinct is wisdom, which we can gloss as (roughly) effectiveness at choosing and pursuing worthwhile goals. But according to the widely accepted Orthogonality Thesis, an agent’s instrumental effectiveness and ultimate goals are unrelated, so a foolish AI might very effectively turn the whole galaxy into paperclips.
But I wonder whether we should expect a wisdom explosion, a point at which AI becomes so wise that further reflection makes it hyperwise within a short period of time. If so, should we expect to recognize hyperwisdom for what it is? Given the unfinished nature of our own values, are we really wise enough to recognize our own commitments in improved form? Or would we confuse, say, an AI’s moral progress for moral confusion?
4. How to Think about Values
This is a short popular philosophy book where I’m trying to encourage a more epistemically humble, inclusive, and pluralistic approach to value theory and put it to work tackling real-world AI ethics problems. I’ve found that technical experts, business leaders, and even AI ethicists often tacitly accept that all value conflicts are disguised optimization problems, that is, that our values are all commensurable and can be tallied up, at least in principle. On such a view, there’s always one optimal decision (or a set of equally optimal ones).
We need a public-facing resource that helps non-philosophers think about values more carefully than this. And I think we need to start really low to the ground. So I start with the fundamental distinction between instrumental and ultimate value, and show how vast, plural, and delicate our judgments about our ultimate values really are. In the end, I argue that reductive, optimizing approaches tend to instantiate value capture on the deeper values we actually hold.
5. We're All Virtue Theorists about Basketball
I was a basketball scout before coming to grad school for philosophy, and I’m always working on philosophy of games and sports. So here’s the book on basketball lurking in the back of my mind, where I bring together the normative toolboxes of our ethical and athletic practices. Even in a competitive, points-driven game like basketball, our underlying values are deeply plural and incommensurable. Winning isn’t the only thing that counts.
We value other forms of achievement, like breaking records (LeBron passing Kareem in total points), overcoming adversity (Klay Thompson’s return from injury), impacting the broader culture (Allen Iverson being Allen Iverson), and even changing the game itself (Steph’s huge effect on how the next generation plays). And we value much more than achievement in sports—for example, we care about entertainment, skill, grace, character, and equity. Beyond how difficult his dunks were, we revere Vince Carter’s effortless majesty in the air.
So our assessments of value in basketball aren’t simply a matter for skilled optimization; they require a broader wisdom in assessing which of our incommensurable values to sacrifice when they inevitably come into conflict. In developing this practical sensitivity, players, coaches, and fans of basketball are all practicing virtue theorists about basketball. They might be surprised how much they can learn from philosophers, and we might be surprised how much we can learn from them.