Parrondo's paradox.
Berresford, Geoffrey C., Rockett, Andrew M. (2003)
International Journal of Mathematics and Mathematical Sciences
R.S. Simon, S. Spiez, H. Torunczyk (2008)
RACSAM
We survey results related to the problem of the existence of equilibria in some classes of infinitely repeated two-person games of incomplete information on one side, first considered by Aumann, Maschler and Stearns. We generalize this setting to a broader one of principal-agent problems. We also discuss topological results needed, presenting them dually (using cohomology in place of homology) and more systematically than in our earlier papers.
Jaicer López-Rivero, Rolando Cavazos-Cadena, Hugo Cruz-Suárez (2022)
Kybernetika
This work is concerned with discrete-time Markov stopping games with two players. At each decision time player II can stop the game, paying a terminal reward to player I, or can let the system continue its evolution. In the latter case player I applies an action affecting the transitions and entitling him to receive a running reward from player II. It is supposed that player I has a nonzero and constant risk-sensitivity coefficient, and that player II tries to minimize the utility...
Hugo Steinhaus (1949)
Colloquium Mathematicum
Jean-Michel Coulomb (1997)
ESAIM: Probability and Statistics
Manuel A. Torres-Gomar, Rolando Cavazos-Cadena, Hugo Cruz-Suárez (2024)
Kybernetika
This paper studies Markov stopping games with two players on a denumerable state space. At each decision time player II has two actions: to stop the game, paying a terminal reward to player I, or to let the system continue its evolution. In the latter case, player I selects an action affecting the transitions and charges a running reward to player II. The performance of each pair of strategies is measured by the risk-sensitive total expected reward of player I. Under mild continuity...
Julio Clempner (2006)
International Journal of Applied Mathematics and Computer Science
In this paper we introduce a new modeling paradigm for representing shortest path games with Petri nets. Whereas previous works have restricted attention to tracking the net using Bellman's equation as a utility function, this work uses a Lyapunov-like function. In this sense, we replace the traditional cost function with a trajectory-tracking function which is also an optimal cost-to-target function. This makes a significant difference in the conceptualization of the problem domain,...
Tomasz Bielecki (1997)
Applicationes Mathematicae
The purpose of this paper is to prove existence of an ε-equilibrium point in a dynamic Nash game with Borel state space and long-run time average cost criteria for the players. The idea of the proof is first to convert the initial game with ergodic costs to an "equivalent" game endowed with discounted costs for some appropriately chosen value of the discount factor, and then to approximate the discounted Nash game obtained in the first step with a countable state space game for which...
Rolando Cavazos-Cadena, Luis Rodríguez-Gutiérrez, Dulce María Sánchez-Guillermo (2021)
Kybernetika
This work is concerned with discrete-time zero-sum games with Markov transitions on a denumerable space. At each decision time player II can stop the system, paying a terminal reward to player I, or can let the system continue its evolution. If the system is not halted, player I selects an action which affects the transitions and receives a running reward from player II. Assuming the existence of an absorbing state which is accessible from any other state, the performance of a pair...
Krzysztof Krawiec, Wojciech Jaśkowski, Marcin Szubert (2011)
International Journal of Applied Mathematics and Computer Science
We apply Coevolutionary Temporal Difference Learning (CTDL) to learn small-board Go strategies represented as weighted piece counters. CTDL is a randomized learning technique which interweaves two search processes that operate in the intra-game and inter-game mode. Intra-game learning is driven by gradient-descent Temporal Difference Learning (TDL), a reinforcement learning method that updates the board evaluation function according to differences observed between its values for consecutively...
John E. Walsh, Grace J. Kelleher (1970)
RAIRO - Operations Research - Recherche Opérationnelle