Under that strategy, a player's first move is always to cooperate with other players. Afterward, the player echoes whatever the other players do. The strategy is similar to the one nuclear powers adopted during the Cold War, each promising not to use its weaponry so long as the other side refrained from doing so as well. The 20th-anniversary competition was the brainchild of Graham Kendall, a lecturer in the University of Nottingham's School of Computer Science and Information Technology and a researcher in game theory, and was based on the original competition run by a University of Michigan political scientist, Robert Axelrod.
The Iterated Prisoner's Dilemma is a version of the game in which the choice is repeated over and over again and in which the players can remember their previous moves, allowing them to evolve a cooperative strategy. The competition had entries, with each player playing all the other players in a round-robin setup. Because Axelrod's original competition was run twice, Kendall will run a second competition in April, for which he hopes to attract even more entries.
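The Tit for Tat rule and the round-robin format described above can be illustrated with a minimal Python sketch. The payoff values (5, 3, 1, 0), the function names and the 10-round match length are assumptions chosen for illustration; they are the conventional textbook values, not figures taken from the competition.

```python
from itertools import combinations

# Assumed payoffs (conventional textbook values, not from the competition rules):
# mutual cooperation 3 each, mutual defection 1 each, lone defector 5, lone cooperator 0.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(my_history, their_history):
    """Cooperate on the first move, then echo the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def always_cooperate(my_history, their_history):
    return "C"

def play_match(strategy_a, strategy_b, rounds=10):
    """Play one iterated match; both players see the full move histories."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def round_robin(players, rounds=10):
    """Every player meets every other player once; returns total scores by name."""
    totals = {name: 0 for name in players}
    for (name_a, strat_a), (name_b, strat_b) in combinations(players.items(), 2):
        score_a, score_b = play_match(strat_a, strat_b, rounds)
        totals[name_a] += score_a
        totals[name_b] += score_b
    return totals
```

Under these assumed payoffs, Tit for Tat loses only the first round to a pure defector and then matches it move for move, while two Tit for Tat players cooperate throughout.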
Teams could submit multiple strategies, or players, and the Southampton team submitted 60 programs. These, Jennings explained, were all slight variations on a theme and were designed to execute a known series of five to 10 moves by which they could recognize each other.
Once two Southampton players recognized each other, they were designed to immediately assume "master and slave" roles — one would sacrifice itself so the other could win repeatedly. If the program recognized that another player was not a Southampton entry, it would immediately defect to act as a spoiler for the non-Southampton player. The result is that Southampton had the top three performers — but also a load of utter failures at the bottom of the table who sacrificed themselves for the good of the team.
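The recognition-and-sacrifice scheme described above can be sketched roughly as follows. This is an illustrative reconstruction, not the Southampton code: the five-move `HANDSHAKE` sequence, the way roles are assigned up front, and the payoff values are all hypothetical.

```python
# Assumed payoffs (conventional textbook values, not from the competition rules).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# Hypothetical five-move recognition code; the real sequences are not given in the text.
HANDSHAKE = ["C", "D", "D", "C", "D"]

def colluder(role):
    """Build a colluding strategy. `role` is 'master' or 'slave' (fixed up front here)."""
    def strategy(my_history, their_history):
        turn = len(my_history)
        if turn < len(HANDSHAKE):
            # Still signalling: defect as soon as the opponent departs from the code.
            if their_history != HANDSHAKE[:turn]:
                return "D"
            return HANDSHAKE[turn]
        if their_history[:len(HANDSHAKE)] == HANDSHAKE:
            # Teammate recognized: the slave cooperates so the master can exploit it.
            return "D" if role == "master" else "C"
        # Outsider: always defect to act as a spoiler.
        return "D"
    return strategy

def tit_for_tat(my_history, their_history):
    """A non-colluding opponent used for comparison."""
    return "C" if not their_history else their_history[-1]

def play_match(strategy_a, strategy_b, rounds):
    """Play one iterated match and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

In a 15-round match under these assumptions, the master collects the top payoff in every round after the handshake while the slave scores nothing further, which mirrors the split between top performers and bottom-of-the-table entries reported for the team.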
Another twist to the game was the addition of noise, which allowed some moves to be deliberately misrepresented. In the original game, the two prisoners could not communicate. But Southampton's design lets the prisoners do the equivalent of signaling to each other their intentions by tapping in Morse code on the prison wall. Kendall noted that there was nothing in the competition rules to preclude such a strategy, though he admitted that the ability to submit multiple players means it's difficult to tell whether this strategy would really beat Tit for Tat in the original version.
But he believes it would be impossible to prevent collusion between entrants. "What was interesting was to see how many colluders you need in a population," Jennings said. "It turns out we had far too many; we would have won with fewer." Jennings is also interested in testing the strategy on an evolutionary variant of the game in which each player plays only its neighbors on a grid.
If your neighbors do better than you do, you adopt their strategy. But, says Kendall, "Everybody in our field knows the name of Anatol Rapoport, who won the Axelrod competition."

Considering the complexity of the zero-determinant (ZD) strategies, we provided human subjects with paper and pen to record their choices and scores round by round. In the awareness treatments, the human subjects were told that they would play a game with a fixed computer program. By contrast, in the unawareness treatments, the human subjects were told only that they would play a game with a fixed opponent, which is similar to the implementation by Hilbe et al.
The instruction manual of A is provided as an example in Supplementary Note 2. To obtain the desired number of rounds of data or 60 for each subject while avoiding end-of-game effects (ref. 55), human subjects were informed that the game would end with probability 0. During the experiments, each player earned scores according to the payoff matrix (see Fig. 1).
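To see why a per-round stopping probability removes a known final round, the following sketch samples game lengths under such a rule. It is only an illustration of the announced rule, not the experimental software: the stopping probability is cut off in the text above, so the value `0.02` used in the usage note is a placeholder, and the function names are hypothetical.

```python
import random

def sample_game_length(end_prob, rng):
    """Rounds played when, after each round, the game stops with probability end_prob."""
    rounds = 1
    while rng.random() >= end_prob:
        rounds += 1
    return rounds

def mean_game_length(end_prob, trials, seed=0):
    """Average length over many simulated games; approaches 1 / end_prob."""
    rng = random.Random(seed)
    total = sum(sample_game_length(end_prob, rng) for _ in range(trials))
    return total / trials
```

For example, `mean_game_length(0.02, 20000)` should come out close to 50 rounds, since the game length is geometrically distributed with mean `1 / end_prob`. Because every round has the same chance of being the last, a subject cannot plan a defection for a known final round.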
After the experiment, the sum of scores was converted to cash according to an exchange rate and paid to the subjects. For more details, see Supplementary Notes 1, 2 and 3. In addition, we used the binomial probability test to compare the number of extortionate strategists who obtained an average score higher than the score from mutual cooperation with the number who obtained an average score less than or equal to it.

How to cite this article: Wang, Z.
Rapoport, A. Trivers, R. The evolution of reciprocal altruism. Axelrod, R. The evolution of cooperation. Science. The Evolution of Cooperation (Basic Books). Fudenberg, D. The folk theorem in repeated games with discounting or with incomplete information. Econometrica 54. Boyd, R. Nature, 58-59. Kendall, G. Fehr, E. A theory of fairness, competition, and cooperation. Brosnan, S. Monkeys reject unequal pay. Nature. Oechssler, J.
Finitely repeated games with social preferences. Friedman, J. A non-cooperative equilibrium for supergames. Binmore, K. Evolutionary stability in repeated games played by finite automata. Theory 57. Nowak, M. Nature, 56-58. Wedekind, C. Natl Acad. USA 93. Duffy, J. Cooperative behavior and the frequency of social interaction. Games Econom.
The evolution of cooperation in infinitely repeated games: Experimental evidence. Molander, P. The optimal level of generosity in a selfish, uncertain environment. Conflict Resol. Tit for tat in heterogeneous populations. Stephens, D. Emergence of cooperation and evolutionary stability in finite populations. Press, W. USA. Stewart, A. Extortion and cooperation in the prisoner's dilemma. Hilbe, C. Evolution of extortion in iterated prisoner's dilemma games. From extortion to generosity, evolution in the iterated prisoner's dilemma. Collapse of cooperation in evolving games. Cooperation and control in multiplayer social dilemmas.
Szolnoki, A. Defection and extortion as unexpected catalysts of unconditional cooperation in structured populations. Evolution of extortion in structured populations. E 89. Hao, D. Extortion under uncertainty: zero-determinant strategies in noisy games. E 91. Bi, Z. Optimal cooperation-trap strategies for the iterated rock-paper-scissors game. Bruggeman, J. Partners or rivals? Levine, D. The relationship between economic theory and experiments. Blount, S. Harrison, G. Expectations and fairness in a simple bargaining experiment.
Game Theory 25. Johnson, E. Detecting failures of backward induction: Monitoring information search in sequential bargaining. Theory, 16-47. Gintis, H. Solving the puzzle of prosociality.
Fischbacher, U. Are people conditionally cooperative? Evidence from a public goods experiment. Social preferences, beliefs, and the dynamics of free riding in public goods experiments.
Chaudhuri, A. Conditional cooperation and voluntary contributions to a public good. Selten, R. Crawford, V. Adaptive dynamics in coordination games. Econometrica 63. Steady state learning and Nash equilibrium. Econometrica 61. Erev, I. Learning and equilibrium as useful approximations: accuracy of prediction on randomly selected constant sum games. Theory 33, 29-51. Xu, B. Cycle frequency in standard rock-paper-scissors games: evidence from experimental economics. A. Wang, Z. Social cycling and conditional responses in the rock-paper-scissors game. Camerer, C. Cooperation under the shadow of the future: experimental evidence from infinitely repeated games.
Simaan, M. Jr On the Stackelberg strategy in nonzero-sum games. Theory Appl. Winden, F. Kandori, M. Repeated games.

We thank Prof. Xunda Yu for support. We are very grateful to the editors and two anonymous referees for their comments, which significantly improved the study and the manuscript. All the authors prepared the Supplementary Material. Correspondence to Bin Xu. This work is licensed under a Creative Commons Attribution 4.0 International License.
Figure 1: Payoff matrix.
Figure 2: Average scores for human subjects and the ZD strategies for all treatments.
Figure 3: Human cooperation rates over the course of the game.
Table 1: Experimental design.
Figure 4: Experimental scores and theoretical prediction.
Methods: data source and experimental setting. The data were generated from our laboratory experiments, which were conducted at the Experimental Social Science Laboratory of Zhejiang University.