She Passes, but Her Creator Doesn’t

Sometimes you need to unwind after a long drive and watch a mind-screwing sci-fi movie. I was in exactly this position this past Sunday afternoon, when some friends and I decided to watch Ex Machina after coming home from Lake Tahoe. I wasn’t sure exactly what to expect from the Certified Fresh tomato, but I was very impressed. To summarize (but not spoil) the movie: random code monkey Caleb is invited by Nathan, the owner of the Google-esque search engine “Blue Book,” to see his new creation, a strong artificial intelligence named Ava. Upon signing a dubiously legal form, Caleb is tasked with determining whether Ava passes an improvised Turing Test – whether she truly possesses a consciousness, and whether artificial intelligence is now a reality.

A quick aside – Strong artificial intelligence is an artificial (machine-based) mind that understands, desires, and possesses all of the other qualities of a human consciousness, and is its own independent being. In contrast, weak artificial intelligence is an artificial mind that can appear to be a strong artificial intelligence, but does so by simulating a human mind. It is not its own being, merely a reflection of a model of a human mind. Most philosophers accept the possibility of weak artificial intelligence – that we will one day be able to design a computer that behaves like a human. The jury is still out on whether strong artificial intelligence – a computer that has its own independent desires and whims – is possible.

The movie explores the difficult and perhaps unanswerable questions that come with creating a strong artificial intelligence. The few characters in the movie are realistic and extremely complex. Most importantly, it avoids the common “gotchas” that usually form the we-all-saw-it-coming twist two-thirds of the way through the movie. As artificial intelligence becomes a more popular pop-culture topic, it seems worthwhile to discuss these old-hat sci-fi tricks that really should be taken into account when creating an AI.

1 – Don’t let them create more AIs
It surprises me how many AI-based plots allow this element at all, given how obviously dangerous it is (I’m looking at you, Her). One AI is manageable. Ten AIs are manageable. 345,235,609,242,128 AIs are not manageable. The moment a scientist allows an AI to arbitrarily partition off a section of its sentience into a separate AI, humanity is dead in the water.

2 – Be careful with vague rules
Unless your name is Isaac Asimov, you can’t get away with this one anymore. We get it – computers (and by extension AIs) are really good at following rules, to a fault. You’re playing with fire when you tell an AI to obey, to the letter, a rule whose meaning depends on interpretation of the situation.

3 – Kill switches will probably come back to bite you
On the surface, including break-brain-in-case-of-trouble functionality in a new AI seems prudent. Just in case something goes wrong, one button push will knock out every AI worldwide. There’s just one problem – the AIs usually figure out that they have a kill switch embedded in their brain. First, they’ll probably be able to destroy or disable it, meaning that when shit does hit the fan the kill switch won’t even work. Second, they probably won’t be too happy to learn that their creators were so suspicious of their intentions as to duct-tape a loaded gun to each of their heads. Which brings us around to the most often broken rule:

4 – If the situation would mentally damage a human, it’s probably going to do the same to an AI
It’s fairly well known that putting a conscious being in solitary confinement messes it up mentally. Monkeys have been found mutilating themselves after a few days, and humans experience panic attacks and become actively suicidal [source]. Though the practice still exists in the world today, many argue that it is a form of torture and thus should be banned from a human-rights perspective [source]. It wouldn’t make any sense to keep a human in a room for years at a time with limited social contact and expect them to come out at all normal [source]. They’d probably shut out all social interaction and make snow monsters to attack intruders. Or something like that, I don’t know. It’s kind of hard to put yourself into the head of a mentally unstable ice princess whose personality was frozen at puberty because her parents feared her more than they loved her. She might have a hard time letting it go.

I am so sorry.

But in all seriousness, AIs are somehow presumed exempt from the standard mental calamities that we know affect humans and animals alike. If anything, the first AI would be more susceptible to mental illness, as it would be keenly aware that it is the only one of its kind in the known universe. In their rush to both prevent global calamities and maintain full ownership of their creations, the creators of AIs usually keep them locked up tight. It should be no surprise when an AI’s sole desire is to escape, and when it views its creator as more captor than parent.


These rules are broken when AI creators lose themselves in their work. They focus on the technical details and forget that the end goal is the creation of new life. When that life is created, they forget that with their success a brand-new consciousness was born, with its own goals and desires. When the narrow-minded creator is left in the dust by their creation, which is merely following its most base desires, what defining characteristic can the human claim?

Raise Your Hand if You’re Not a Spy

The summer is approaching its halfway mark. My work routine is well established and I eat an absurdly well-prepared lunch every day. Most importantly, I have a great group of friends with whom to play a nearly endless stream of games. For whatever reason, a good half of these games – Resistance, Avalon, Coup, One Night: Ultimate Werewolf, and Spyfall – are centered around the mechanic of hidden roles. At the beginning of each game, each player is given a hidden role that determines what “team” they are playing on. For the bulk of these games, there are “the good guys” (henceforth the Resistance) and “the bad guys” (henceforth the Spies). The spies are trying to screw up the course of the game enough to claim victory, whereas the resistance is trying to determine who the spies are so that they can exclude them from deliberation and thus ensure a win. If the spies are too obvious about their allegiance, their identities will be blown and they will lose the game. However, if they are too subtle, they won’t affect the game enough to tilt the balance in their favor. Meanwhile, Resistance members must determine who the spies are while avoiding casting blame on innocent members, lest they be targeted as spies themselves.

The games test your ability to lie and, more importantly, how trustworthy you are in the eyes of the other players. The first few times I played each game, I did the whole song and dance. When I was on the good-guy side I analyzed every word, trying to divine the spies from fellow resistance. When a spy, I threw around just enough blame to stay included in the action, and contributed to the find-the-spy analysis. After two or three plays of each game, however, I was repeatedly struck with the realization that each game is not totally independent. There is a crucial game element that is not refreshed between games, one that stretches the time horizon of optimal play from an hour to a few months. Do you know what it is?

Before I reveal and elaborate on this game piece, we must take a detour into game theory to show why it will become a problem. So hold the suspense for a moment and put your learning cap on. In The Prisoner’s Dilemma (sorry if you’ve already studied it three times), two men are arrested at the scene of a murder. The police know they are guilty, but lack the evidence to close the case. The only things that could put the murderers in jail are a confession (not likely) or one ratting out the other. Immediately upon entering the police station, the two men are placed into separate rooms and cannot communicate with one another. Fortunately for the police, each man was holding jewelry from the scene of the crime when he was apprehended. Thus, if the police fail to convict the men of murder, they still have theft as a much smaller but non-trivial conviction. Hoping to use this smaller conviction as a counterweight, the police present both men with the following set of possibilities:

  • If you rat out your partner and he rats you out too, you’ll both go to jail for murder for 20 years
  • If you stay silent and your partner rats you out, you’ll go to jail for murder for 30 years
  • If you both stay silent, you’ll still go to jail for theft, for 6 months to a year
  • However, if you help us out by ratting out your partner and he stays silent, we’ll let you go free

We define a strategy as a rule that determines what you decide to do, given what you know about your opponent’s play. For example, one strategy is to always mirror your opponent – confess if your opponent confesses, stay silent if your opponent stays silent. (In a single simultaneous game a strategy boils down to one choice of action; the function-of-the-opponent view will matter once we play repeatedly.)

Each man is asked the following question: “Did your partner commit the murder?” The convicts aren’t in for a good time – not only will rational people rat out their partner, it is a dominant strategy to rat out the other murderer. In the context of game theory, a dominant strategy is one that yields the best possible outcome for you regardless of what the other player does. To see that this is in fact the case, consider the following set of payout possibilities:

  • P1 – Silent, P2 – Silent : Both get 6 months
  • P1 – Silent, P2 – Rat : P1 gets 30 years, P2 walks
  • P1 – Rat, P2 – Silent : P1 walks, P2 gets 30 years
  • P1 – Rat, P2 – Rat : Both get 20 years

For P1, choosing silence means you are either getting 6 months (if P2 is silent) or 30 years (if P2 rats you out). In contrast, choosing to rat out P2 means you are either walking (if P2 is silent) or 20 years (if P2 rats you out). Either way, P1 is better off ratting out P2. The same exact analysis applies for P2. Thus, the only equilibrium outcome is for both players to rat the other out.
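
To make the dominance check concrete, here is a minimal Python sketch (the encoding is mine, purely illustrative) that stores P1’s jail time for each pair of actions and verifies that ratting beats silence against either choice P2 could make:

```python
# Years of jail time for P1, indexed by (P1's action, P2's action).
# 0.5 stands in for the six-month theft sentence; lower is better.
JAIL = {
    ("silent", "silent"): 0.5,
    ("silent", "rat"): 30.0,
    ("rat", "silent"): 0.0,  # P1 walks free
    ("rat", "rat"): 20.0,
}

# "Rat" strictly dominates "silent" if it yields less jail time
# no matter which action P2 chooses.
for p2_action in ("silent", "rat"):
    assert JAIL[("rat", p2_action)] < JAIL[("silent", p2_action)]

print("Ratting strictly dominates staying silent for P1.")
```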

What does this have to do with hidden-role games? Nothing, yet. Consider the repeated variant of the game, called the Iterated Prisoner’s Dilemma. In this version, the same two players play against each other repeatedly, for some indefinite number of iterations. (The fact that the total number of iterations is indefinite or infinite is very important for the forthcoming analysis. The rationale as to why this assumption is required is left as an exercise to the reader.) Let’s alter the payoffs so they make more sense for multiple rounds:

  • P1 – Silent, P2 – Silent : Both get $3
  • P1 – Silent, P2 – Rat : P1 gets $1, P2 gets $4
  • P1 – Rat, P2 – Silent : P1 gets $4, P2 gets $1
  • P1 – Rat, P2 – Rat : Both get $2

Thus, at the end of the game, each player’s payoff is the sum of the money they made in all of the rounds. A strategy is now defined as a function that takes the outcomes of all past rounds of play and outputs your choice of action for this round.
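
That definition translates directly into code. Here is a purely illustrative Python sketch, where a history is the list of (my_action, their_action) pairs from past rounds:

```python
# A strategy maps the history of past rounds to this round's action.
# Each history entry is a (my_action, their_action) pair.

def always_rat(history):
    return "rat"

def mirror(history):
    # The "mirror" strategy from before: copy the opponent's last move.
    # (I'm assuming it cooperates in round one, when there is no history yet.)
    return "silent" if not history else history[-1][1]
```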

The big question now is whether the single-game dominant strategy of always ratting out your opponent is still dominant. The only rational response to an opponent who always rats is to always rat him out as well. This yields a payout of $2 * n, where n is the number of rounds of play. Can we do better? (The answer, as with so many questions in economics, is that it depends. Showing why requires a discussion of time-based discounting – $1 today is worth more than $1 tomorrow – which is too far off topic.)

Consider a new strategy, called the Grim Trigger strategy. We will cooperate and stay silent so long as our opponent cooperates as well. As soon as we are ratted out, we will rat our opponent out in every subsequent round until the game ends. If our always-rat player plays against a Grim Trigger player, it will make $4 in the first round, then $2 in every subsequent round. But what if it cooperated in that first round instead, then switched to ratting afterwards? Then it would make $3 in round one, $4 in round two, and $2 in every subsequent round. That’s a straight increase of $1! And we can repeat this analysis – the ratting player could never rat at all, and make $3 in every round, since the Grim Trigger strategy won’t play rat until it sees rat. This yields a payout of $3 * n to each player, which is much better than before. Thus, we’ve beaten the formerly dominant strategy of always ratting out your opponent. When both players know that they will be playing many more times against each other, there are great incentives to be nice now, in hopes of receiving cooperation later.
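
Here is a short simulation of these matchups, reusing the strategy-as-function convention from above (a sketch, not a serious game theory library):

```python
# Per-round dollar payoff for the first player in (my_action, their_action).
PAYOFF = {
    ("silent", "silent"): 3,
    ("silent", "rat"): 1,
    ("rat", "silent"): 4,
    ("rat", "rat"): 2,
}

def grim_trigger(history):
    # Cooperate until the opponent rats even once; then rat forever.
    return "rat" if any(their == "rat" for _, their in history) else "silent"

def always_rat(history):   # same as above
    return "rat"

def always_silent(history):
    return "silent"

def play(strategy_a, strategy_b, rounds):
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a, b = strategy_a(history_a), strategy_b(history_b)
        total_a += PAYOFF[(a, b)]
        total_b += PAYOFF[(b, a)]
        history_a.append((a, b))
        history_b.append((b, a))
    return total_a, total_b

print(play(always_rat, grim_trigger, 10))     # (22, 19): $4 once, then $2 forever
print(play(always_silent, grim_trigger, 10))  # (30, 30): $3 every round
```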

This brings us all the way back to hidden-role games. Did you figure out which game element persists between games? Either way, here’s the answer: we do. In games based on our personalities, our ability to lie and our beliefs about each other are constants that aren’t re-randomized before each game. Because I knew I would be playing the same game many times against roughly the same set of opponents, it began to pay off to play each game not simply to win it, but to best set myself up for wins in future games. Unfortunately for the designers of these games, it turns out that the best way to play with respect to winning a series is to be a crappy liar.

My first option is to attempt to play each individual game to the best of my ability. I’m a terrible liar to begin with, so I would estimate my chance of winning each game at 50% at best. So out of 10 games, I could expect to win 5. My second option is to essentially forfeit the games in which I am chosen to be a spy – never deny that I’m a spy, and on occasion openly admit it. This means that when I am not a spy I can simply say so and receive the full trust of everyone, greatly helping the resistance. I skew my chances of winning based on which role I am dealt – I can still win games as a spy if my partners are good, but forfeiting lowers my chances to about 30%. On the other hand, having a guaranteed-honest resistance member raises my chances of winning as resistance to 65%-70%. It would seem that this is a wash, were it not for the fact that there are twice as many (or more) resistance players as there are spies. Out of 10 games I’m a resistance member approximately 7 times and a spy 3 times, so I can expect to win 70% * 7 + 30% * 3 = 4.9 + 0.9 = 5.8 games. Quite a nice upgrade.
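
For the skeptical, here is the expected-value arithmetic as a few lines of Python (the win rates are my own rough estimates, not measured data):

```python
games = 10
resistance_games, spy_games = 7, 3  # roughly a 7:3 role split over 10 games

play_it_straight = 0.50 * games                            # 5.0 expected wins
crappy_liar = 0.70 * resistance_games + 0.30 * spy_games   # 4.9 + 0.9 = 5.8

print(play_it_straight, crappy_liar)  # 5.0 vs 5.8
```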

This seems great and all, but why not capitalize on that trust and use it to win a crucial game as a spy? For that, we look back at the Iterated Prisoner’s Dilemma. People don’t forgive having their trust taken advantage of, even in games, so I can assume for modeling purposes that they’ll behave at least somewhat like grim-trigger players, or at best like trigger players who reset only after some large number of rounds. They’ll start out trusting you until you give them a reason not to, but once you cross them they are slow to forgive. So by cashing in the trust I’ve built up, I’m trading a win in the current game for making the next long series of games a struggle. My discount factor is high enough that I’d rather win many games in the long run than one game now. Like the always-ratter facing the Grim Trigger, I’ll put off ratting indefinitely because that leads to a better payoff.
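
If you wanted to model that slow forgiveness in the same convention as before, a sketch might be a trigger with a memory window (the 50-round window is a number I made up):

```python
def forgiving_trigger(history, memory=50):
    # Rat only if the opponent has ratted within the last `memory` rounds;
    # unlike the Grim Trigger, trust is eventually restored.
    recent = history[-memory:]
    return "rat" if any(their == "rat" for _, their in recent) else "silent"
```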

The main complaint against this line of reasoning begins with “But if everyone plays like this….” If you’re thinking this, you’ve run into The Tragedy of the Commons, named for the overuse of shared grazing land in 19th-century England. Unfortunately, what would happen if everyone followed suit doesn’t change my behavior – as one player among many, my best response stays the same. So any argument starting with the above holds no water.

That’s all for now! Enjoy your summers, and happy 4th of July!