Twilight Imperium and Instrumental Convergence
I got to play Twilight Imperium last weekend and I was so powerful I didn’t know what to do.
I like to describe Twilight Imperium as 8 hours of space politics. We’re going to spend an entire day maneuvering and colluding and pretending we’re about to fight but then maybe not following through with it because fighting is so damn expensive.
It’s gonna be great!
Each player rules a spacefaring empire racing to score 10 victory points by achieving a variety of objectives revealed throughout the game. Maybe this time we’ll have to research multiple technologies, hold a bunch of cultural worlds, spend a lot of money on a huge monument...
Since no one knows what objectives are coming next, you have to be flexible and keep your options open so you can retool your strategy from turn to turn. A big part of that is smoothing over everyone’s concerns by making deals, even as you shift between scientist and conqueror and refined aesthete. Oops, I just scored a secret objective by destroying your flagship, here, let me pay you for it.
Twilight Imperium is one of those games that absolutely wouldn’t work virtually. It’s not just that there are over a thousand pieces. There’s something about reading people and communicating with them in person, about rolling physical dice and flipping over plastic ships as they sustain damage. Twilight Imperium, like basketball, is better in person.
I played as the Ghosts of Creuss, a hypermobile wormhole-loving race. Unfortunately, after leading for most of the game, I fumbled it away in the end.
By turn 3, I could strike from 5 systems away through my wormhole network. No one’s home system was truly safe, and I was so powerful. So I pretended I was about to invade my friend’s home system and then offered to make a deal with him. The fight has been symbolically performed; now let’s be reasonable and negotiate. Surprisingly, he decided to attack me instead.
I think this was my real downfall. I only get to play Twilight Imperium about once a year, but I never build enough ships. I’m more of a diplomat than an admiral. So I try to play super efficiently, staying adaptable with a handful of units instead of pumping out huge fleets from every space dock. But there’s a balance here, and if you’re next to an aggressive player, you’ve got to respond in kind and build up.

But that’s okay, because Twilight Imperium is a game of mistakes. I also underexpanded on the first turn, activated a key system too early in the final round, misread the Ceasefire Promissory Note, and didn’t use all my relevant technologies at every opportunity… in a game as complex as Twilight Imperium, you’re going to be overwhelmed and play imperfectly. And that’s okay, that’s part of the fun, too.

As it turns out, scoring 10 victory points before anyone else is tough work. To get there first, it helps to adopt certain instrumental subgoals:

1. Make yourself dangerous

Okay, so yeah, I get it: you’re supposed to make ships, especially if you have incredible warp powers that you can’t use anymore because all three of your cruisers just got smashed. I always make a decent number of ground forces, but you can’t just sit around at home playing defense. You’ve got to be a threat to other players.

2. Make yourself rich

Oh yeah, you should get an economy going so you can pay for stuff, from units to technologies to votes in the Galactic Senate. Ultimately, most objectives test you economically.

3. Make yourself flexible

Go ahead, load up on action cards and secret objectives. But this time, I really appreciated that while being able to wormhole anywhere is great, sometimes you should actually go there and take up space to prevent other people from getting where they want. Counterintuitively, you have to close off some possibilities (I’m committing these ships to a more vulnerable forward position) to increase your options (Next turn, I can threaten even more of the galaxy).
Instrumental subgoals are useful for achieving further goals. If you can make yourself dangerous, rich, and flexible, you’ll put yourself in a great position to achieve a wide variety of objectives, and ultimately score 10 victory points before anyone else. Make sense?

As a result, experienced Twilight Imperium players converge on these instrumental subgoals (they all tend to go for them) because whatever objective is revealed next, being dangerous, rich, and flexible will tend to make you more effective.

As the L1Z1X Mindnet crushed my navy and assimilated my worlds, maybe you can see why I was busy thinking about a concept in the AI literature called instrumental convergence. Roughly, it’s the idea that just about any sufficiently effective agent will adopt certain instrumental subgoals.

For example, let’s say you make an AI with one job: diagnosing cancer as early as possible. If it’s effective enough to consider what it’s doing, it will dedicate some of its compute cycles toward thinking about how to optimize its work. Once you’re optimizing, it helps to have a larger possibility space to search and implement. So we can predict what many of its instrumental subgoals will be:
It should make itself flexible enough to expand and explore its possibility space. That introduces pressure to make itself smarter by acquiring more compute and maybe even redesigning itself.
It should make itself rich enough in resources to implement its plans. And again, compute is especially useful, because it makes you a more effective optimizer who will be even better at gathering resources and putting them to work.
And oh my God, it should make itself dangerous enough to resist efforts to thwart its goals, and maybe even to turn it off!
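The convergence argument above can be sketched in a few lines of code. Here’s a toy simulation of my own (every action name, number, and probability is made up purely for illustration, not drawn from any formal model): sample lots of random “final objectives” as weightings over an agent’s capabilities, and compare how actions fare on average. Broadly useful actions that build general capability tend to beat narrowly targeted ones when you don’t yet know which objective will be revealed next.

```python
import random

random.seed(0)

# Each action adds to the agent's state: (danger, wealth, options, niche_a, niche_b).
# The first three dimensions are broad capabilities; the last two only
# matter for specific objectives.
ACTIONS = {
    "get_dangerous": (3, 0, 0, 0, 0),
    "get_rich":      (0, 3, 0, 0, 0),
    "get_flexible":  (0, 0, 3, 0, 0),
    "chase_niche_a": (0, 0, 0, 3, 0),  # only pays off for objective type A
    "chase_niche_b": (0, 0, 0, 0, 3),  # only pays off for objective type B
}

def random_objective():
    # Broad capabilities matter at least a little for almost every goal;
    # each niche dimension matters only 20% of the time.
    broad = [random.random() for _ in range(3)]
    niche = [2.0 if random.random() < 0.2 else 0.0 for _ in range(2)]
    return broad + niche

def value(action, objective):
    # Payoff of an action under one particular final objective.
    return sum(a * w for a, w in zip(ACTIONS[action], objective))

# Average each action's payoff over many possible future objectives.
objectives = [random_objective() for _ in range(10_000)]
avg = {name: sum(value(name, obj) for obj in objectives) / len(objectives)
       for name in ACTIONS}

for name, score in sorted(avg.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} {score:.2f}")
```

In this sketch the “dangerous, rich, flexible” actions come out ahead on average, even though a niche action wins big whenever its particular objective happens to come up. That’s the convergence in miniature: agents optimizing under uncertainty about future goals are pushed toward the same broadly useful subgoals.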
In other words, simply being effective enough to critically evaluate what you’re doing gives you predictable instrumental subgoals that will make you even more effective.

Of course, we share these instrumental subgoals, too. We recognize the usefulness of education, income, and karate in our own lives. And we also recognize the usefulness of self-reflection. When you think carefully about what really matters and how to pull it off, you’re likely to do better in the ways that really count.

Anyway, this is why I write about games all the time. They offer us artificial sandboxes of clear, simple value systems, and then challenge us to figure out how to live within them effectively. Of course our real-world values are much richer and more complex. But gaining practical experience making tradeoffs in games can still help us make wiser sacrifices when we face genuine dilemmas in our real lives.

If that’s right, should we expect AI to choose to play games? Is that another instrumental subgoal that sufficiently effective agents will tend to share? I’ve seen a relevant paper, but I wanna know what you think!