Raise Your Hand if You’re Not a Spy

The summer is approaching its halfway mark. My work routine is well established and I eat an absurdly well prepared lunch every day. Most importantly, I have a great group of friends with whom to play a nearly endless stream of games. For whatever reason, a good half of these games – Resistance, Avalon, Coup, One Night: Ultimate Werewolf, and Spyfall – are centered around the mechanic of hidden roles. At the beginning of each game, each player is given a hidden role that determines what “team” they are playing on. For the bulk of these games, there are “the good guys” (henceforth the Resistance) and “the bad guys” (henceforth the Spies). The spies are trying to screw up the course of the game enough to claim victory, whereas the resistance is trying to determine who the spies are so that they can exclude them from deliberation and thus ensure that they win the game. If the spies are too obvious about their allegiance, their identity will be blown and they will lose the game. However, if they are too subtle, they won’t affect the game enough to tilt the balance in their favor. Meanwhile, the Resistance must determine who the spies are to avoid casting blame on an innocent member and being targeted as spies themselves.

The games test your ability to lie, and more importantly, how trustworthy you are to the other players. The first few times I played each game I did the whole song and dance. When I was on the good guy side I analyzed every word, trying to divine the spies from fellow resistance. When I was a spy, I threw around just enough blame to continue being included in the action, and contributed to the find-the-spy analysis. After two to three plays of each game, however, I was repeatedly struck with the realization that each game is not totally independent. There is a crucial game element that is not refreshed between games, and it stretches the time horizon for optimal play from an hour to a few months. Do you know what it is?

Before I reveal and elaborate on this game piece, we must take a detour into game theory to show why it will become a problem. So hold the suspense for a moment and put your learning cap on. In The Prisoner’s Dilemma (sorry if you’ve already studied it three times), two men are arrested at the scene of a murder. The police know they are guilty, but lack the evidence to close the case. The only thing that could put the murderers in jail is for each of them to either confess (not likely) or rat out the other murderer. Immediately upon entering the police station the two men are placed into separate rooms and are not able to communicate with one another. Fortunately for the police, each man was holding jewelry from the scene of the crime when he was apprehended. Thus, if the police fail to convict the men for murder, they still have theft as a much smaller but non-trivial conviction. Hoping to use this smaller conviction as a counterweight, the police present both men with the following set of possibilities:

  • If you confess to your guilt and you were in a pair, you’ll go to jail for murder for 20 years
  • If you confess to your guilt and it was just you, you’ll go to jail for 30 years
  • If you’re innocent, you’ll still go to jail for theft, for 6 months to a year
  • However, if you’re truly innocent and help us out by ratting out your partner, we’ll let you go free

We define a strategy as a function that, given what your opponent does, outputs what you decide to do. For example, one strategy is to always mirror your opponent – confess if your opponent confesses, stay silent if your opponent stays silent.

Each man is asked the following question: “Did your partner commit the murder?” The convicts aren’t in for a good time – not only will rational people rat out their partner, it is a dominant strategy to rat out the other murderer. In the context of game theory, a dominant strategy is a choice of strategy that results in the best possible outcome for you regardless of what the other player does. To see that this is in fact the case, consider the following set of payout possibilities:

  • P1 – Silent, P2 – Silent : Both get 6 months
  • P1 – Silent, P2 – Rat : P1 gets 30 years, P2 walks
  • P1 – Rat, P2 – Silent : P1 walks, P2 gets 30 years
  • P1 – Rat, P2 – Rat : Both get 20 years

For P1, choosing silence means you are either getting 6 months (if P2 is silent) or 30 years (if P2 rats you out). In contrast, choosing to rat out P2 means you are either walking (if P2 is silent) or getting 20 years (if P2 rats you out). Either way, P1 is better off ratting out P2. The same exact analysis applies for P2. Thus, the only equilibrium outcome is for both players to rat the other out.
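The dominance argument above can be checked mechanically. Here is a minimal sketch in Python, assuming the jail terms from the list above (with 6 months written as 0.5 years); the names `YEARS`, `CHOICES`, and `is_dominant_for_p1` are made up for illustration.

```python
# Years in jail for (P1, P2) under each pair of choices, from the list above.
# Fewer years is better.
YEARS = {
    ("silent", "silent"): (0.5, 0.5),
    ("silent", "rat"): (30, 0),
    ("rat", "silent"): (0, 30),
    ("rat", "rat"): (20, 20),
}
CHOICES = ("silent", "rat")


def is_dominant_for_p1(choice):
    """True if `choice` gives P1 strictly fewer years than any other
    choice, no matter what P2 does."""
    return all(
        YEARS[(choice, p2)][0] < YEARS[(other, p2)][0]
        for p2 in CHOICES
        for other in CHOICES
        if other != choice
    )


print(is_dominant_for_p1("rat"))     # True
print(is_dominant_for_p1("silent"))  # False
```

Since the table is symmetric, the same check with the roles swapped gives the same answer for P2.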

What does this have to do with hidden-role games? Nothing, yet. Consider the repeated variant of the game, usually called The Iterated Prisoner’s Dilemma. In this version, the same two players will be playing with each other repeatedly, for some indefinite number of iterations. (The fact that the total number of iterations of the game is indefinite or infinite is very important for the forthcoming analysis. The rationale as to why the assumption is required is left as an exercise for the reader.) Let’s alter the payoffs so they make more sense for multiple rounds:

  • P1 – Silent, P2 – Silent : Both get $3
  • P1 – Silent, P2 – Rat : P1 gets $1, P2 gets $4
  • P1 – Rat, P2 – Silent : P1 gets $4, P2 gets $1
  • P1 – Rat, P2 – Rat : Both get $2

Thus, at the end of the game, each player’s payoff is the sum of the money they made in all of the rounds. A strategy is now defined as a function that takes the outcomes of all past rounds of play and outputs your choice of action for this round.

The big question, now, is whether or not the single-game dominant strategy of always ratting out your opponent is still dominant. The only rational response to your opponent always ratting is to always rat him out as well. This yields a payout of $2 * n to each player, where n is the number of rounds of play. Can we do better? (The answer, as with so many questions in economics, is that it depends. Showing why requires a discussion of time-based discounting – $1 today is worth more than $1 tomorrow – which is too far off topic.)

Consider a new strategy, which is called The Grim Trigger Strategy. We will cooperate and stay silent so long as our opponent cooperates as well. As soon as we are ratted out, we will rat our opponent out in every subsequent round until the game ends. If our always-rat strategy player plays against a Grim Trigger strategy, it will make $4 in the first round, then $2 in every subsequent round. But what if it cooperated in that first round instead, then switched to ratting afterwards? Then it would make $3 in round one, $4 in round two, and $2 in every subsequent round. That’s a straight increase of $1! But we could repeat this analysis – the ratting player could actually never rat, and make $3 in every round, as the Grim Trigger strategy won’t play rat until it sees rat. This yields a payout of $3 * n to each player, which is much better than before. Thus, we’ve beaten the formerly dominant strategy of always ratting out your opponent. When both players know that they will be playing many more times with their opponent, there are great incentives to be nice now, in hopes of receiving cooperation later.
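If you’d rather see the numbers than trust my arithmetic, here is a small simulation sketch. It assumes the dollar payoffs from the list above and a fixed 10 rounds purely for illustration (the argument itself needs an indefinite horizon), and ignores discounting. The function names are mine, not standard ones.

```python
# Dollar payoffs for (P1, P2) in a single round, from the list above.
PAYOFF = {
    ("silent", "silent"): (3, 3),
    ("silent", "rat"): (1, 4),
    ("rat", "silent"): (4, 1),
    ("rat", "rat"): (2, 2),
}


# A strategy takes (my past moves, opponent's past moves) and returns a move.
def always_rat(mine, theirs):
    return "rat"


def always_silent(mine, theirs):
    return "silent"


def silent_once_then_rat(mine, theirs):
    return "silent" if not mine else "rat"


def grim_trigger(mine, theirs):
    # Cooperate until the opponent has ratted even once, then rat forever.
    return "rat" if "rat" in theirs else "silent"


def play(strategy1, strategy2, rounds):
    """Play the repeated game and return the total payoff of each player."""
    moves1, moves2 = [], []
    total1 = total2 = 0
    for _ in range(rounds):
        m1 = strategy1(moves1, moves2)
        m2 = strategy2(moves2, moves1)
        p1, p2 = PAYOFF[(m1, m2)]
        total1 += p1
        total2 += p2
        moves1.append(m1)
        moves2.append(m2)
    return total1, total2


print(play(always_rat, grim_trigger, 10))            # (22, 19)
print(play(silent_once_then_rat, grim_trigger, 10))  # (23, 20)
print(play(always_silent, grim_trigger, 10))         # (30, 30)
```

Delaying the betrayal by one round gains exactly $1, and never betraying at all gets the full $3 * n.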

This brings us all the way back to Hidden-Role games. Did you figure out what game element persists through the games? Either way, here’s the answer: we do. In games based on our personalities, our ability to lie and our beliefs about others are constants that aren’t re-randomized before each game. Because I knew I would be playing the same game many times with roughly the same set of opponents, it began to pay off to play the games not simply to win, but to best set myself up for wins in future games. Unfortunately for the designers of these games, it turns out that the best way to play these games with respect to winning a series is to be a crappy liar.

My first option is to attempt to play each individual game to the best of my ability. I’m a terrible liar to begin with, so I would estimate my chance of winning each game to be 50% at best. So out of 10 games, I could expect to win 5 games. My second option is to basically forfeit the games in which I am chosen to be a spy – never deny that I’m a spy, and on occasion openly admit it. Doing this means that when I am not a spy I can simply say so and receive the full trust of everyone, greatly helping the resistance. I skew my chances of winning based on what role I am chosen to play – I can still win games as a spy if my partners are good, but it lowers my chances to about 30%. On the other hand, having a guaranteed resistance member raises my chances of winning as resistance to 65%-70%. It would seem that I’m not affecting my expected outcome, if it weren’t for the fact that there are twice as many (or more) resistance players as there are spies. Thus out of 10 games I’m a resistance member approximately 7 times and a spy 3 times, so using the 70% figure I can expect to win 70% * 7 + 30% * 3 = 4.9 + 0.9 = 5.8 games. Quite a nice upgrade.
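The expected-wins arithmetic is easy to replay with other guesses. The sketch below just encodes the estimates from the paragraph above (all of them my rough guesses, not measured data); the function name `expected_wins` is made up for illustration.

```python
def expected_wins(games, share_as_resistance, win_as_resistance, win_as_spy):
    """Expected number of wins, given how often I'm resistance and my
    win rate in each role."""
    resistance_games = games * share_as_resistance
    spy_games = games * (1 - share_as_resistance)
    return resistance_games * win_as_resistance + spy_games * win_as_spy


# Option one: play every game straight, 50% either way.
print(round(expected_wins(10, 0.7, 0.50, 0.50), 2))  # 5.0

# Option two: be an open book, using the 70% and 65% resistance estimates.
print(round(expected_wins(10, 0.7, 0.70, 0.30), 2))  # 5.8
print(round(expected_wins(10, 0.7, 0.65, 0.30), 2))  # 5.45
```

Even at the low end of my estimate the open-book approach comes out ahead, because the resistance role simply comes up more often.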

This seems great and all, but why not capitalize on that trust, and use it to win a crucial game as a spy? For that, we look back at the iterated prisoner’s dilemma. People don’t forgive taking advantage of their trust, even in games, and thus I can assume for modeling that they’ll behave at least somewhat like a grim-trigger player, if not a trigger player with a refresh of some large number of rounds. They’ll start out trusting you until you give them a reason not to, but once you cross them they are slow to forgive. So by capitalizing on the trust I’ve built up, I’m trading winning the current game for making the next long series of games a struggle. My discount factor is high enough (that is, I value future games enough) that I’d rather win many games in the long run than one game now. Like the always-ratter facing the Grim Trigger, I’ll put off ratting indefinitely because that adds up to a better payoff.
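To put a rough number on “patient enough,” here is a sketch that borrows the dollar payoffs from the iterated game as a stand-in for game wins. It assumes an endless series of rounds, a Grim Trigger opponent, and a discount factor `delta` between 0 and 1, where each round is worth `delta` times the one before it. These are modeling assumptions of mine, not properties of any actual game night.

```python
def stay_silent_value(delta):
    """Discounted total of earning $3 every round, forever:
    3 + 3*delta + 3*delta**2 + ... = 3 / (1 - delta)."""
    return 3 / (1 - delta)


def betray_now_value(delta):
    """Discounted total of $4 this round, then $2 in every later round
    once the Grim Trigger opponent starts ratting."""
    return 4 + delta * 2 / (1 - delta)


def prefers_silence(delta):
    return stay_silent_value(delta) >= betray_now_value(delta)


print(prefers_silence(0.9))   # True: a patient player keeps cooperating
print(prefers_silence(0.25))  # False: an impatient player cashes in now
```

With these particular payoffs the break-even point works out to `delta` = 0.5; anything above that and the betrayal isn’t worth it.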

The main complaint against this form of reasoning begins with “But if everyone plays like this….” If you’re thinking this, you’ve hit upon The Tragedy of the Commons, named for the overuse of shared common land in England. Unfortunately, what would happen if everyone followed suit doesn’t affect my behavior. So any argument starting with the above holds no water.

That’s all for now! Enjoy your summers, and happy 4th of July!
