The Prisoner’s Dilemma is a very useful tool in the arsenal of game theorists, precisely because it is so simple to formulate and so generic to apply in modeling diverse social “problems” or interactions. The dilemma should really be no dilemma if we were all honest players (in the game theory sense of the word). Simply put, it’s a non-zero-sum game where blindly co-operating choices are jointly optimal for both parties. Two “prisoners” charged with the same crime are taken into two separate rooms and given the choice to confess. The “payouts” are such that if both refuse to confess, they get a minimal punishment based on whatever evidence is already there. If one confesses while the other stays silent, the confessor is let off while the other gets the maximum punishment. If both confess, both get a medium punishment.
In terms of total punishment (summed over both players), both confessing is the worst case, while both staying silent (thus covertly “co-operating”) is the best case. But humans are a funny lot. And if one cannot trust the other, staying silent is a suicidal strategy, as you’re not really trying to minimize the total punishment, but rather just your own possible punishment (which should also be at its minimum for co-operating players, but trust is not a coin that is high in circulation). The dilemma (or rather the paradox) here is that logically the two should co-operate to minimize the punishment, but they both know that their worst case is when they co-operate and the other rats them out. And hence, the same logic may make them both rat out each other, and end up with a very sub-optimal payoff. In the parlance of game theory, a player either “co-operates” (with the other player, not the authorities) or “defects” (rats out the other player).
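To make the payouts concrete, here is a minimal sketch in Python. The specific numbers of years (1, 0, 3, 2) are my own illustrative assumptions, not anything fixed by the game; any numbers with the same ordering would do. It shows that confessing is the best reply whatever the other player does, even though both staying silent gives the lowest total.

```python
# Years of punishment (lower is better). The numbers are illustrative assumptions.
# Key: (my_move, other_move) -> (my_years, other_years)
SILENT, CONFESS = "silent", "confess"

YEARS = {
    (SILENT, SILENT): (1, 1),    # both co-operate: minimal punishment
    (SILENT, CONFESS): (3, 0),   # I stay silent, the other rats me out
    (CONFESS, SILENT): (0, 3),   # I rat out the other, who stays silent
    (CONFESS, CONFESS): (2, 2),  # both defect: medium punishment
}

def best_reply(other_move):
    """Return the move that minimizes my own years, given the other's move."""
    return min((SILENT, CONFESS), key=lambda mine: YEARS[(mine, other_move)][0])

def total_years(move_a, move_b):
    """Return the punishment summed over both players."""
    return sum(YEARS[(move_a, move_b)])

for other in (SILENT, CONFESS):
    print(f"Other plays {other!r}: my best reply is {best_reply(other)!r}")
print("Total years if both stay silent:", total_years(SILENT, SILENT))
print("Total years if both confess:", total_years(CONFESS, CONFESS))
```

With these assumed numbers, a player who only looks after their own sentence always confesses, and the pair ends up with the worst total.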
The more interesting version of the Prisoner’s Dilemma is an “iterated” Prisoner’s Dilemma, where the same game is played a number of times[1] between the two players. Although they cannot communicate with each other about the decision, they can use the knowledge of the previous decision history (of both players). It gets interesting because now you can actually either build (mutual) trust or make the other player pay for their defection by defecting yourself, and indeed use the knowledge of all previous games to know or guess the other’s strategy, and try to exploit it.
One would think that in such an iterated version of the game, the best strategy might be some fancy algorithm that takes into account tons of things. In competitions where programs played against each other, the boringly obvious winner[2] is the one using the tit-for-tat strategy (co-operate first, but retaliate if the other player defects). Just rat out someone who ratted you out last time, and co-operate if they co-operated last time. In terms of minimizing average punishment (or maximizing the payoff), this simple strategy with a one-game memory works surprisingly[3] well: as good as any, and certainly better than the naively good-natured ones (so-called “generous” strategies), or the obviously sociopathic ones.
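For the curious, the iterated game and tit-for-tat fit in a few lines of Python. This is only a sketch under my own assumptions: the points per round (3, 5, 0, 1) are the values commonly used in such tournaments, the function names are made up for illustration, and the round count is fixed at ten just to keep the output short.

```python
C, D = "C", "D"  # co-operate, defect

# Points per round (higher is better): (my_points, other_points).
# These particular values are an assumption for illustration.
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def tit_for_tat(mine, theirs):
    """Co-operate first, then copy the other player's previous move."""
    return C if not theirs else theirs[-1]

def always_defect(mine, theirs):
    return D

def always_cooperate(mine, theirs):
    return C

def play(strategy_a, strategy_b, rounds=10):
    """Play the iterated game and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy sees its own history first, then the other's.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print("tit-for-tat vs tit-for-tat:", play(tit_for_tat, tit_for_tat))
print("tit-for-tat vs always-defect:", play(tit_for_tat, always_defect))
print("always-co-operate vs always-defect:", play(always_cooperate, always_defect))
```

Against a defector, tit-for-tat loses only the first round and then matches the defection, while the unconditional co-operator is exploited in every round.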
To be fair, I wasn’t planning to write about the Prisoner’s Dilemma. It is very well documented and discussed (including in the context of genetics by Richard Dawkins in The Selfish Gene). What I wanted to write about was the expectation of symmetry that we, or at least many (if not most) of us, seem to carry almost instinctively in our minds with respect to our relationships. And this seemed like a very good way to convey the apparent soundness of such an expectation. The symmetry, even if it is a delayed symmetry, seems to be (from experimental data, even if not formally) essential to an elegant solution, especially when the game (and again I use the word in the game theory sense, not to divorce it from serious interactions, rather the opposite) is not a zero-sum game (where one only wins by beating the other). Even if not “essential”, it definitely seems “sufficient”, because it takes care of exploitative counter-plays by simply replaying them.
But, and there is always a but, when you have a blind strategy that’s not benign or naive, then although it protects each individual from the worst case, symmetry basically degenerates into endless retribution. Tit-for-tat, also known as an eye for an eye, has its limitations: as Gandhi rightly put it, it makes the whole world go blind. Think about it: a symmetric strategy against blind hate is blind hate. And that is where you need a measured asymmetry as a group strategy (or rather: ethos) to have some sort of stability in the system, to protect against such runaway destruction of all value.
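The retribution spiral is easy to see in a toy run. The sketch below is my own assumed scenario, not a result from any tournament: two tit-for-tat players, one of whom defects exactly once by accident (the `slip_round` parameter is hypothetical). After that single slip, the two keep punishing each other in alternation and never get back to mutual co-operation.

```python
C, D = "C", "D"  # co-operate, defect

def tit_for_tat(theirs):
    """Co-operate first, then copy the other player's previous move."""
    return C if not theirs else theirs[-1]

def play_with_slip(rounds=8, slip_round=2):
    """Two tit-for-tat players; player A defects once, by accident,
    at slip_round (0-based). Returns both move histories as strings."""
    hist_a, hist_b = [], []
    for r in range(rounds):
        move_a = tit_for_tat(hist_b)
        move_b = tit_for_tat(hist_a)
        if r == slip_round:
            move_a = D  # the one-off accidental defection (an assumption)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)

moves_a, moves_b = play_with_slip()
print("A:", moves_a)
print("B:", moves_b)
```

From the slip onwards, every round has exactly one of the two defecting in reply to the other's last defection, which is the eye-for-an-eye echo in miniature.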
There is, of course, a more intimate version of many of these games being played in close relationships: friendships, romantic relationships, close family/blood relationships, work relationships, and so on. I wouldn’t venture to model any of those, but I think with all the above, one can intuitively see mappings onto the Prisoner’s Dilemma. What’s clear is that any long-term relationship is a non-zero-sum game. It better be. And it’s an iterated game, not an “all the stakes in one go” game. And assuming that you want to keep on playing the game (a dimension missing in the previous discussions), which assumes you’ve found the “right” opponent (aka partner) to play it with, symmetry might be a sub-optimal strategy. Yes, it can cut the losses (or distribute them more equally), but by co-operating one really has a shot at maximum profits. And dogmatic (or calculated) dove[4] strategies, as Dawkins calls them (co-operate more than defect), may well be optimal (or you should really be quitting the game).
All along, this is what I wanted to write about, I guess: that “asymmetry” that I’ve already discussed in the past (internet protocols are asymmetric, liberal in accepting data but conservative while sending it, and I’ve used a now archaic RFC in the context of relationships before, without really bringing in any game theory). I have believed, and continue to believe, that it’s this very conscious “asymmetry” that is necessary for stable relationships, as it is for a stable society. But what about the defectors?
In the context of relationships, I’d broadly categorize the defectors into two types: conscious ones and unconscious ones. If you’re in a relationship (romantic or otherwise) with a conscious defector, you should really consider whether playing the game is worth it. Because what conscious defectors do is reduce a non-zero-sum game into a zero-sum game, especially if the other keeps on co-operating. The “symmetric” strategy is obviously one way to stop being the loser in the zero-sum game. But that is just a value-destroying option. Playing tit-for-tat with a conscious defector is just turning into a conscious defector yourself. Consciously destroying what’s of value to you is a toxic strategy for your long-term well-being. So, if you ask me, the choice is very simple here, or as they say, black and white.
The most interesting “grey” area remains the unconscious defectors. Especially in romantic/spousal relationships, which involve choice as opposed to “blood ties”. Most long-term relationships are based on mutual trust, tolerance, sensitivity, and many such bedrocks. But much of this is rarely symmetric. Everyone has different natural levels of trust, sensitivity, tolerance, etc. Yes, they do evolve over time, but in the steady state, there are these levels that come into play subconsciously. What’s more, something like sensitivity is rarely symmetric in the first place, even within an individual. Someone who is highly sensitive to criticism doesn’t necessarily refrain from criticising others. One who doesn’t like to be shouted at doesn’t necessarily control their vocal cords when they’re the aggrieved party.
Auden, in his infinite wisdom, had said: “If equal affection cannot be, Let the more loving one be me”. Extending that same asymmetry, then, if you’ve found someone who you believe is for keeps, shouldn’t you aspire to be the more tolerant, more trusting, more sensitive (extrinsically), more understanding, or the more ‘noble’, for lack of a better word? But what does one do when that person is an unconscious defector? The more understanding one will have to forgive a lot more. The more tolerant one will stoically ignore offenses while paying for her own lesser offenses. And so on.
Is virtue its own reward? Or is it its own punishment? Is it just a matter of thresholds? Is a dogmatic asymmetric strategy a healthy one for the virtuous co-operator, against a good-natured unconscious defector? Is the price of virtue really the price of playing the game, even when the game is rigged the moment you walk into it with a dogmatic co-operative strategy? The dilemma will live on.
- [1] The number of iterations is unknown to the players, because if it is known, the game ends up folding into the standard Prisoner’s Dilemma (provable by backward induction).
- [2] Ref: https://www.forbes.com/sites/rogerkay/2011/12/19/generous-tit-for-tat-a-winning-strategy/#3492bb1e66eb (While this is true in spirit, there is a cheeky meta-strategy where a strategy of sacrificing for “your own kind” seems to win when the spirit of the rules is ignored: https://www.wired.com/2004/10/new-tack-wins-prisoners-dilemma/. Then again, in the real world there is no “spirit of the game”.)
- [3] I say surprisingly, but it’s not really surprising if you consider that many simple strategies do work wonderfully well in practice (like the golden rule of Jesus of Nazareth).
- [4] Dawkins, in his book “The Selfish Gene”, discusses “Evolutionarily Stable Strategies”, and the literature on this has a primary dove vs hawk (peaceful/non-retaliatory vs aggressive/retaliatory) classification. Dove strategies survive only with doves, but even a minority of hawks will overtake the population.