Emergent behaviour: Theory and experimentation using the MANA model.
Moffat, James, Smith, Josephine, Witty, Susan (2006)
Journal of Applied Mathematics and Decision Sciences
Krzysztof Krawiec, Wojciech Jaśkowski, Marcin Szubert (2011)
International Journal of Applied Mathematics and Computer Science
We apply Coevolutionary Temporal Difference Learning (CTDL) to learn small-board Go strategies represented as weighted piece counters. CTDL is a randomized learning technique that interweaves two search processes, one operating in intra-game mode and the other in inter-game mode. Intra-game learning is driven by gradient-descent Temporal Difference Learning (TDL), a reinforcement learning method that updates the board evaluation function according to differences observed between its values for consecutively visited...
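The intra-game step described in this abstract can be illustrated with a minimal sketch. It assumes a TD(0) update, a tanh squashing of the weighted sum, and a flat list encoding of the board (+1 for one player's stones, -1 for the opponent's, 0 for empty). The names `wpc_value` and `td0_update` and the learning rate `alpha` are hypothetical and are not taken from the paper.

```python
import math


def wpc_value(weights, board):
    """Weighted piece counter: weighted sum of cell contents, squashed into (-1, 1)."""
    return math.tanh(sum(w * x for w, x in zip(weights, board)))


def td0_update(weights, board, next_board, alpha=0.01):
    """One gradient-descent TD(0) step (an assumed form, not the authors' exact rule).

    Moves the value of `board` toward the value of `next_board`.
    """
    v = wpc_value(weights, board)
    v_next = wpc_value(weights, next_board)
    td_error = v_next - v          # difference between consecutive evaluations
    grad_scale = 1.0 - v * v       # derivative of tanh at the current state
    # The gradient with respect to weight i is grad_scale * board[i].
    return [w + alpha * td_error * grad_scale * x for w, x in zip(weights, board)]


# Hypothetical 5x5 board flattened to 25 cells.
weights = [0.1] * 25
board = [0] * 25
board[12] = 1                      # one stone of the learning player in the centre
next_board = list(board)
next_board[6] = -1                 # the opponent replies
new_weights = td0_update(weights, board, next_board)
```

Only the weights of occupied cells in `board` change, because the gradient of a weighted piece counter with respect to a weight is proportional to the content of the corresponding cell.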