Parrondo's paradox.
Berresford, Geoffrey C., Rockett, Andrew M. (2003)
International Journal of Mathematics and Mathematical Sciences
R.S. Simon, S. Spiez, H. Torunczyk (2008)
RACSAM
We survey results related to the problem of the existence of equilibria in some classes of infinitely repeated two-person games of incomplete information on one side, first considered by Aumann, Maschler and Stearns. We generalize this setting to a broader one of principal-agent problems. We also discuss topological results needed, presenting them dually (using cohomology in place of homology) and more systematically than in our earlier papers.
Jaicer López-Rivero, Rolando Cavazos-Cadena, Hugo Cruz-Suárez (2022)
Kybernetika
This work is concerned with discrete-time Markov stopping games with two players. At each decision time player II can stop the game paying a terminal reward to player I, or can let the system continue its evolution. In this latter case player I applies an action affecting the transitions and entitling him to receive a running reward from player II. It is supposed that player I has a nonzero and constant risk-sensitivity coefficient, and that player II tries to minimize the utility...
Hugo Steinhaus (1949)
Colloquium Mathematicum
Jean-Michel Coulomb (1997)
ESAIM: Probability and Statistics
Manuel A. Torres-Gomar, Rolando Cavazos-Cadena, Hugo Cruz-Suárez (2024)
Kybernetika
This paper studies Markov stopping games with two players on a denumerable state space. At each decision time player II has two actions: to stop the game paying a terminal reward to player I, or to let the system continue its evolution. In this latter case, player I selects an action affecting the transitions and charges a running reward to player II. The performance of each pair of strategies is measured by the risk-sensitive total expected reward of player I. Under mild continuity...
Julio Clempner (2006)
International Journal of Applied Mathematics and Computer Science
In this paper we introduce a new modeling paradigm for representing shortest-path games with Petri nets. Whereas previous works have restricted attention to tracking the net using Bellman's equation as a utility function, this work uses a Lyapunov-like function. In this sense, we replace the traditional cost function with a trajectory-tracking function which is also an optimal cost-to-target function. This makes a significant difference in the conceptualization of the problem domain,...
Tomasz Bielecki (1997)
Applicationes Mathematicae
The purpose of this paper is to prove the existence of an ε-equilibrium point in a dynamic Nash game with Borel state space and long-run time average cost criteria for the players. The idea of the proof is first to convert the initial game with ergodic costs to an "equivalent" game endowed with discounted costs for some appropriately chosen value of the discount factor, and then to approximate the discounted Nash game obtained in the first step with a countable state space game for which...
Rolando Cavazos-Cadena, Luis Rodríguez-Gutiérrez, Dulce María Sánchez-Guillermo (2021)
Kybernetika
This work is concerned with discrete-time zero-sum games with Markov transitions on a denumerable space. At each decision time player II can stop the system paying a terminal reward to player I, or can let the system continue its evolution. If the system is not halted, player I selects an action which affects the transitions and receives a running reward from player II. Assuming the existence of an absorbing state which is accessible from any other state, the performance of a pair...
Krzysztof Krawiec, Wojciech Jaśkowski, Marcin Szubert (2011)
International Journal of Applied Mathematics and Computer Science
We apply Coevolutionary Temporal Difference Learning (CTDL) to learn small-board Go strategies represented as weighted piece counters. CTDL is a randomized learning technique which interweaves two search processes that operate in the intra-game and inter-game mode. Intra-game learning is driven by gradient-descent Temporal Difference Learning (TDL), a reinforcement learning method that updates the board evaluation function according to differences observed between its values for consecutively...
John E. Walsh, Grace J. Kelleher (1970)
RAIRO - Operations Research - Recherche Opérationnelle