10.1007/11871842_29
Crossref book-chapter
Springer Berlin Heidelberg
Lecture Notes in Computer Science (297)
Bibliography

Kocsis, L., & Szepesvári, C. (2006). Bandit Based Monte-Carlo Planning. Machine Learning: ECML 2006, 282–293.

Authors 2
  1. Levente Kocsis (first)
  2. Csaba Szepesvári (additional)
References 14 Referenced 1,169
  1. Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite time analysis of the multiarmed bandit problem. Machine Learning 47(2-3), 235–256 (2002) (10.1023/A:1013689704352) / Machine Learning by P. Auer (2002)
  2. Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32, 48–77 (2002) (10.1137/S0097539701398375) / SIAM Journal on Computing by P. Auer (2002)
  3. Barto, A.G., Bradtke, S.J., Singh, S.P.: Real-time learning and control using asynchronous dynamic programming. Technical report 91-57, Computer Science Department, University of Massachusetts (1991)
  4. Billings, D., Davidson, A., Schaeffer, J., Szafron, D.: The challenge of poker. Artificial Intelligence 134, 201–240 (2002) (10.1016/S0004-3702(01)00130-8) / Artificial Intelligence by D. Billings (2002)
  5. Bouzy, B., Helmstetter, B.: Monte Carlo Go developments. In: van den Herik, H.J., Iida, H., Heinz, E.A. (eds.) Advances in Computer Games 10, pp. 159–174 (2004) (10.1007/978-0-387-35706-5_11)
  6. Chang, H.S., Fu, M., Hu, J., Marcus, S.I.: An adaptive sampling algorithm for solving Markov decision processes. Operations Research 53(1), 126–139 (2005) (10.1287/opre.1040.0145) / Operations Research by H.S. Chang (2005)
  7. Chung, M., Buro, M., Schaeffer, J.: Monte Carlo planning in RTS games. In: CIG 2005, Colchester, UK (2005)
  8. Kearns, M., Mansour, Y., Ng, A.Y.: A sparse sampling algorithm for near-optimal planning in large Markovian decision processes. In: Proceedings of IJCAI 1999, pp. 1324–1331 (1999)
  9. Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics 6, 4–22 (1985) (10.1016/0196-8858(85)90002-8) / Advances in Applied Mathematics by T.L. Lai (1985)
  10. Péret, L., Garcia, F.: On-line search for solving Markov decision processes via heuristic sampling. In: de Mántaras, R.L., Saitta, L. (eds.) ECAI, pp. 530–534 (2004)
  11. Sheppard, B.: World-championship-caliber Scrabble. Artificial Intelligence 134(1–2), 241–275 (2002) (10.1016/S0004-3702(01)00166-7) / Artificial Intelligence by B. Sheppard (2002)
  12. Smith, S.J.J., Nau, D.S.: An analysis of forward pruning. In: AAAI, pp. 1386–1391 (1994)
  13. Tesauro, G., Galperin, G.R.: On-line policy improvement using Monte-Carlo search. In: Mozer, M.C., Jordan, M.I., Petsche, T. (eds.) NIPS 9, pp. 1068–1074 (1997)
  14. Vanderbei, R.: Optimal sailing strategies, statistics and operations research program. University of Princeton (1996), http://www.sor.princeton.edu/~rvdb/sail/sail.html
Dates
Type When
Created 18 years, 11 months ago (Sept. 18, 2006, 4:28 a.m.)
Deposited 4 years, 4 months ago (April 27, 2021, 3:24 a.m.)
Indexed 2 days, 20 hours ago (Aug. 26, 2025, 2:41 a.m.)
Issued 19 years, 7 months ago (Jan. 1, 2006)
Published 19 years, 7 months ago (Jan. 1, 2006)
Published Print 19 years, 7 months ago (Jan. 1, 2006)
Funders 0

None

@inbook{Kocsis_2006, title={Bandit Based Monte-Carlo Planning}, ISBN={9783540460565}, ISSN={1611-3349}, url={http://dx.doi.org/10.1007/11871842_29}, DOI={10.1007/11871842_29}, booktitle={Machine Learning: ECML 2006}, publisher={Springer Berlin Heidelberg}, author={Kocsis, Levente and Szepesvári, Csaba}, year={2006}, pages={282--293} }