A repeated imitation model with dependence between stages: Decision strategies and rewards

Pablo J. Villacorta; David A. Pelta

International Journal of Applied Mathematics and Computer Science (2015)

  • Volume: 25, Issue: 3, pages 617-630
  • ISSN: 1641-876X

Abstract

Adversarial decision making is aimed at determining strategies to anticipate the behavior of an opponent trying to learn from our actions. One defense is to make decisions intended to confuse the opponent, although this may diminish our rewards. This idea has already been captured in an adversarial model introduced in a previous work, in which two agents separately issue responses to an unknown sequence of external inputs. Each agent's reward depends on the current input and the responses of both agents. In this contribution, (a) we extend the original model by establishing stochastic dependence between an agent's responses and the next input of the sequence, and (b) we study the design of time-varying decision strategies for the extended model. The strategies obtained are compared against static strategies from theoretical and empirical points of view. The results show that time-varying strategies outperform static ones.
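
To make the setting concrete, here is a minimal simulation sketch of the kind of repeated game the abstract describes: one agent answers a sequence of external inputs, an imitating opponent tries to predict those answers from past observations, and, per extension (a), the next input depends stochastically on the agent's response. Everything below (the payoff table, the transition kernel P, the two strategies) is an illustrative assumption, not the paper's actual model.

import numpy as np

rng = np.random.default_rng(0)
N_INPUTS, N_RESPONSES, T = 3, 3, 2000

# Assumed payoff table reward[input, response, opponent_guess]: pay 1 when the
# response suits the input (a == s) and the opponent fails to imitate (g != a).
reward = np.zeros((N_INPUTS, N_RESPONSES, N_RESPONSES))
for s in range(N_INPUTS):
    for a in range(N_RESPONSES):
        for g in range(N_RESPONSES):
            reward[s, a, g] = float(a == s and a != g)

# Extension (a): the next input depends stochastically on the current response.
# P[a] is an assumed transition distribution over inputs given response a.
P = rng.dirichlet(np.ones(N_INPUTS), size=N_RESPONSES)

def simulate(strategy):
    """Average reward over T stages; strategy(t, s) returns a response distribution."""
    counts = np.zeros((N_INPUTS, N_RESPONSES))   # opponent's observation counts
    s, total = 0, 0.0
    for t in range(T):
        a = int(rng.choice(N_RESPONSES, p=strategy(t, s)))
        # Naive imitator: guess the response seen most often for this input.
        g = int(np.argmax(counts[s])) if counts[s].any() else int(rng.integers(N_RESPONSES))
        total += reward[s, a, g]
        counts[s, a] += 1
        s = int(rng.choice(N_INPUTS, p=P[a]))    # dependence between stages
    return total / T

static = lambda t, s: np.full(N_RESPONSES, 1.0 / N_RESPONSES)          # one fixed mix
time_varying = lambda t, s: np.roll([0.6, 0.3, 0.1], t % N_RESPONSES)  # rotating mix

print("static mixed strategy:", simulate(static))
print("time-varying strategy:", simulate(time_varying))

Whether the rotating mix beats the uniform one here depends entirely on the toy parameters; the sketch only shows where a time-varying strategy plugs into the loop and how the response-to-next-input dependence alters the sequence of inputs the agents face.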

How to cite


Villacorta, Pablo J., and David A. Pelta. "A repeated imitation model with dependence between stages: Decision strategies and rewards." International Journal of Applied Mathematics and Computer Science 25.3 (2015): 617-630. <http://eudml.org/doc/271771>.

@article{PabloJ2015,
abstract = {Adversarial decision making is aimed at determining strategies to anticipate the behavior of an opponent trying to learn from our actions. One defense is to make decisions intended to confuse the opponent, although this may diminish our rewards. This idea has already been captured in an adversarial model introduced in a previous work, in which two agents separately issue responses to an unknown sequence of external inputs. Each agent's reward depends on the current input and the responses of both agents. In this contribution, (a) we extend the original model by establishing stochastic dependence between an agent's responses and the next input of the sequence, and (b) we study the design of time-varying decision strategies for the extended model. The strategies obtained are compared against static strategies from theoretical and empirical points of view. The results show that time-varying strategies outperform static ones.},
author = {Pablo J. Villacorta and David A. Pelta},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {adversarial decision making; imitation; strategies; state dependence; reward},
language = {eng},
number = {3},
pages = {617--630},
title = {A repeated imitation model with dependence between stages: Decision strategies and rewards},
url = {http://eudml.org/doc/271771},
volume = {25},
year = {2015},
}

TY - JOUR
AU - Pablo J. Villacorta
AU - David A. Pelta
TI - A repeated imitation model with dependence between stages: Decision strategies and rewards
JO - International Journal of Applied Mathematics and Computer Science
PY - 2015
VL - 25
IS - 3
SP - 617
EP - 630
AB - Adversarial decision making is aimed at determining strategies to anticipate the behavior of an opponent trying to learn from our actions. One defense is to make decisions intended to confuse the opponent, although this may diminish our rewards. This idea has already been captured in an adversarial model introduced in a previous work, in which two agents separately issue responses to an unknown sequence of external inputs. Each agent's reward depends on the current input and the responses of both agents. In this contribution, (a) we extend the original model by establishing stochastic dependence between an agent's responses and the next input of the sequence, and (b) we study the design of time-varying decision strategies for the extended model. The strategies obtained are compared against static strategies from theoretical and empirical points of view. The results show that time-varying strategies outperform static ones.
LA - eng
KW - adversarial decision making; imitation; strategies; state dependence; reward
UR - http://eudml.org/doc/271771
ER -

References

  1. Amigoni, F., Basilico, N. and Gatti, N. (2009). Finding the optimal strategies for robotic patrolling with adversaries in topologically-represented environments, Proceedings of the 26th International Conference on Robotics and Automation (ICRA'09), Kobe, Japan, pp. 819-824. 
  2. Cichosz, P. and Pawełczak, Ł. (2014). Imitation learning of car driving skills with decision trees and random forests, International Journal of Applied Mathematics and Computer Science 24(3): 579-597, DOI: 10.2478/amcs-2014-0042. Zbl1322.68149
  3. Conitzer, V. and Sandholm, T. (2006). Computing the optimal strategy to commit to, Proceedings of the 7th ACM Conference on Electronic Commerce, EC'06, Ann Arbor, MI, USA, pp. 82-90. 
  4. Kott, A. and McEneaney, W.M. (2007). Adversarial Reasoning: Computational Approaches to Reading the Opponent's Mind, Chapman and Hall/CRC, Boca Raton, FL.
  5. McLennan, A. and Tourky, R. (2006). From imitation games to Kakutani, http://cupid.economics.uq.edu.au/mclennan/Papers/kakutani60.pdf, (unpublished). Zbl1200.91023
  6. McLennan, A. and Tourky, R. (2010a). Imitation games and computation, Games and Economic Behavior 70(1): 4-11. Zbl1200.91013
  7. McLennan, A. and Tourky, R. (2010b). Simple complexity from imitation games, Games and Economic Behavior 68(2): 683-688. Zbl1200.91023
  8. Osborne, M. and Rubinstein, A. (1994). A Course in Game Theory, MIT Press, Cambridge, MA. Zbl1194.91003
  9. Paruchuri, P., Pearce, J.P. and Kraus, S. (2008). Playing games for security: An efficient exact algorithm for solving Bayesian Stackelberg games, Proceedings of the 7th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'08), Estoril, Portugal, pp. 895-902. 
  10. Pelta, D. and Yager, R. (2009). On the conflict between inducing confusion and attaining payoff in adversarial decision making, Information Sciences 179(1-2): 33-40. 
  11. Price, K., Storn, R. and Lampinen, J. (2005). Differential Evolution: A Practical Approach to Global Optimization, Natural Computing Series, Springer-Verlag New York, Inc., Secaucus, NJ. Zbl1186.90004
  12. Qin, A.K., Huang, V.L. and Suganthan, P.N. (2009). Differential evolution algorithm with strategy adaptation for global numerical optimization, IEEE Transactions on Evolutionary Computation 13(2): 398-417.
  13. Storn, R. and Price, K. (1997). Differential evolution: A simple and efficient heuristic for global optimization over continuous spaces, Journal of Global Optimization 11(4): 341-359. Zbl0888.90135
  14. Tambe, M. (2012). Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned, Cambridge University Press, New York, NY. Zbl1235.91005
  15. Thagard, P. (1992). Adversarial problem solving: Modeling an opponent using explanatory coherence, Cognitive Science 16(1): 123-149. 
  16. Triguero, I., Garcia, S. and Herrera, F. (2011). Differential evolution for optimizing the positioning of prototypes in nearest neighbor classification, Pattern Recognition 44(4): 901-916. 
  17. Villacorta, P.J. and Pelta, D.A. (2012). Theoretical analysis of expected payoff in an adversarial domain, Information Sciences 186(4): 93-104. 
  18. Villacorta, P.J., Pelta, D.A. and Lamata, M.T. (2013). Forgetting as a way to avoid deception in a repeated imitation game, Autonomous Agents and Multi-Agent Systems 27(3): 329-354. 
  19. Villacorta, P.J. and Pelta, D.A. (2011). Expected payoff analysis of dynamic mixed strategies in an adversarial domain, Proceedings of the 2011 IEEE Symposium on Intelligent Agents (IA 2011), Paris, France, pp. 116-122.
