DOI: 10.1007/11871842_29. Bandit Based Monte-Carlo Planning

Crossref book-chapter

Springer Berlin Heidelberg

Lecture Notes in Computer Science (297)

Bibliography

Kocsis, L., & Szepesvári, C. (2006). Bandit Based Monte-Carlo Planning. Machine Learning: ECML 2006, 282–293.

Authors 2

Levente Kocsis (first)
Csaba Szepesvári (additional)

References 14 Referenced 1,169

Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite time analysis of the multiarmed bandit problem. Machine Learning 47(2-3), 235–256 (2002) (10.1023/A:1013689704352)
Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32, 48–77 (2002) (10.1137/S0097539701398375)
Barto, A.G., Bradtke, S.J., Singh, S.P.: Real-time learning and control using asynchronous dynamic programming. Technical report 91-57, Computer Science Department, University of Massachusetts (1991)
Billings, D., Davidson, A., Schaeffer, J., Szafron, D.: The challenge of poker. Artificial Intelligence 134, 201–240 (2002) (10.1016/S0004-3702(01)00130-8)
Bouzy, B., Helmstetter, B.: Monte Carlo Go developments. In: van den Herik, H.J., Iida, H., Heinz, E.A. (eds.) Advances in Computer Games 10, pp. 159–174 (2004) (10.1007/978-0-387-35706-5_11)
Chang, H.S., Fu, M., Hu, J., Marcus, S.I.: An adaptive sampling algorithm for solving Markov decision processes. Operations Research 53(1), 126–139 (2005) (10.1287/opre.1040.0145)
Chung, M., Buro, M., Schaeffer, J.: Monte Carlo planning in RTS games. In: CIG 2005, Colchester, UK (2005)
Kearns, M., Mansour, Y., Ng, A.Y.: A sparse sampling algorithm for near-optimal planning in large Markovian decision processes. In: Proceedings of IJCAI 1999, pp. 1324–1331 (1999)
Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics 6, 4–22 (1985) (10.1016/0196-8858(85)90002-8)
Péret, L., Garcia, F.: On-line search for solving Markov decision processes via heuristic sampling. In: de Mántaras, R.L., Saitta, L. (eds.) ECAI, pp. 530–534 (2004)
Sheppard, B.: World-championship-caliber Scrabble. Artificial Intelligence 134(1–2), 241–275 (2002) (10.1016/S0004-3702(01)00166-7)
Smith, S.J.J., Nau, D.S.: An analysis of forward pruning. In: AAAI, pp. 1386–1391 (1994)
Tesauro, G., Galperin, G.R.: On-line policy improvement using Monte-Carlo search. In: Mozer, M.C., Jordan, M.I., Petsche, T. (eds.) NIPS 9, pp. 1068–1074 (1997)
Vanderbei, R.: Optimal sailing strategies, statistics and operations research program. University of Princeton (1996), http://www.sor.princeton.edu/~rvdb/sail/sail.html

Dates

Type	When
Created	18 years, 11 months ago (Sept. 18, 2006, 4:28 a.m.)
Deposited	4 years, 4 months ago (April 27, 2021, 3:24 a.m.)
Indexed	2 days, 20 hours ago (Aug. 26, 2025, 2:41 a.m.)
Issued	19 years, 7 months ago (Jan. 1, 2006)
Published	19 years, 7 months ago (Jan. 1, 2006)
Published Print	19 years, 7 months ago (Jan. 1, 2006)

Funders 0

None

BibTeX

@inbook{Kocsis_2006,
  title={Bandit Based Monte-Carlo Planning},
  ISBN={9783540460565},
  ISSN={1611-3349},
  url={http://dx.doi.org/10.1007/11871842_29},
  DOI={10.1007/11871842_29},
  booktitle={Machine Learning: ECML 2006},
  publisher={Springer Berlin Heidelberg},
  author={Kocsis, Levente and Szepesvári, Csaba},
  year={2006},
  pages={282--293}
}

JSON

{
  "indexed": {
    "date-parts": [
      [
        2025,
        8,
        26
      ]
    ],
    "date-time": "2025-08-26T06:41:11Z",
    "timestamp": 1756190471674
  },
  "publisher-location": "Berlin, Heidelberg",
  "reference-count": 14,
  "publisher": "Springer Berlin Heidelberg",
  "isbn-type": [
    {
      "type": "print",
      "value": "9783540453758"
    },
    {
      "type": "electronic",
      "value": "9783540460565"
    }
  ],
  "content-domain": {
    "domain": [],
    "crossmark-restriction": false
  },
  "published-print": {
    "date-parts": [
      [
        2006
      ]
    ]
  },
  "DOI": "10.1007/11871842_29",
  "type": "book-chapter",
  "created": {
    "date-parts": [
      [
        2006,
        9,
        18
      ]
    ],
    "date-time": "2006-09-18T08:28:47Z",
    "timestamp": 1158568127000
  },
  "page": "282-293",
  "source": "Crossref",
  "is-referenced-by-count": 1169,
  "title": "Bandit Based Monte-Carlo Planning",
  "prefix": "10.1007",
  "author": [
    {
      "given": "Levente",
      "family": "Kocsis",
      "sequence": "first",
      "affiliation": []
    },
    {
      "given": "Csaba",
      "family": "Szepesv\u00e1ri",
      "sequence": "additional",
      "affiliation": []
    }
  ],
  "member": "297",
  "reference": [
    {
      "issue": "2-3",
      "key": "29_CR1",
      "doi-asserted-by": "publisher",
      "first-page": "235",
      "DOI": "10.1023/A:1013689704352",
      "volume": "47",
      "author": "P. Auer",
      "year": "2002",
      "unstructured": "Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite time analysis of the multiarmed bandit problem. Machine Learning\u00a047(2-3), 235\u2013256 (2002)",
      "journal-title": "Machine Learning"
    },
    {
      "key": "29_CR2",
      "doi-asserted-by": "publisher",
      "first-page": "48",
      "DOI": "10.1137/S0097539701398375",
      "volume": "32",
      "author": "P. Auer",
      "year": "2002",
      "unstructured": "Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM Journal on Computing\u00a032, 48\u201377 (2002)",
      "journal-title": "SIAM Journal on Computing"
    },
    {
      "key": "29_CR3",
      "unstructured": "Barto, A.G., Bradtke, S.J., Singh, S.P.: Real-time learning and control using asynchronous dynamic programming. Technical report 91-57, Computer Science Department, University of Massachusetts (1991)"
    },
    {
      "key": "29_CR4",
      "doi-asserted-by": "publisher",
      "first-page": "201",
      "DOI": "10.1016/S0004-3702(01)00130-8",
      "volume": "134",
      "author": "D. Billings",
      "year": "2002",
      "unstructured": "Billings, D., Davidson, A., Schaeffer, J., Szafron, D.: The challenge of poker. Artificial Intelligence\u00a0134, 201\u2013240 (2002)",
      "journal-title": "Artificial Intelligence"
    },
    {
      "key": "29_CR5",
      "doi-asserted-by": "crossref",
      "unstructured": "Bouzy, B., Helmstetter, B.: Monte Carlo Go developments. In: van den Herik, H.J., Iida, H., Heinz, E.A. (eds.) Advances in Computer Games 10, pp. 159\u2013174 (2004)",
      "DOI": "10.1007/978-0-387-35706-5_11"
    },
    {
      "issue": "1",
      "key": "29_CR6",
      "doi-asserted-by": "publisher",
      "first-page": "126",
      "DOI": "10.1287/opre.1040.0145",
      "volume": "53",
      "author": "H.S. Chang",
      "year": "2005",
      "unstructured": "Chang, H.S., Fu, M., Hu, J., Marcus, S.I.: An adaptive sampling algorithm for solving Markov decision processes. Operations Research\u00a053(1), 126\u2013139 (2005)",
      "journal-title": "Operations Research"
    },
    {
      "key": "29_CR7",
      "unstructured": "Chung, M., Buro, M., Schaeffer, J.: Monte Carlo planning in RTS games. In: CIG 2005, Colchester, UK (2005)"
    },
    {
      "key": "29_CR8",
      "unstructured": "Kearns, M., Mansour, Y., Ng, A.Y.: A sparse sampling algorithm for near-optimal planning in large Markovian decision processes. In: Proceedings of IJCAI 1999, pp. 1324\u20131331 (1999)"
    },
    {
      "key": "29_CR9",
      "doi-asserted-by": "publisher",
      "first-page": "4",
      "DOI": "10.1016/0196-8858(85)90002-8",
      "volume": "6",
      "author": "T.L. Lai",
      "year": "1985",
      "unstructured": "Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics\u00a06, 4\u201322 (1985)",
      "journal-title": "Advances in Applied Mathematics"
    },
    {
      "key": "29_CR10",
      "unstructured": "P\u00e9ret, L., Garcia, F.: On-line search for solving Markov decision processes via heuristic sampling. In: de M\u00e1ntaras, R.L., Saitta, L. (eds.) ECAI, pp. 530\u2013534 (2004)"
    },
    {
      "issue": "1\u20132",
      "key": "29_CR11",
      "doi-asserted-by": "publisher",
      "first-page": "241",
      "DOI": "10.1016/S0004-3702(01)00166-7",
      "volume": "134",
      "author": "B. Sheppard",
      "year": "2002",
      "unstructured": "Sheppard, B.: World-championship-caliber Scrabble. Artificial Intelligence\u00a0134(1\u20132), 241\u2013275 (2002)",
      "journal-title": "Artificial Intelligence"
    },
    {
      "key": "29_CR12",
      "unstructured": "Smith, S.J.J., Nau, D.S.: An analysis of forward pruning. In: AAAI, pp. 1386\u20131391 (1994)"
    },
    {
      "key": "29_CR13",
      "unstructured": "Tesauro, G., Galperin, G.R.: On-line policy improvement using Monte-Carlo search. In: Mozer, M.C., Jordan, M.I., Petsche, T. (eds.) NIPS 9, pp. 1068\u20131074 (1997)"
    },
    {
      "key": "29_CR14",
      "unstructured": "Vanderbei, R.: Optimal sailing strategies, statistics and operations research program. University of Princeton (1996), http://www.sor.princeton.edu/~rvdb/sail/sail.html"
    }
  ],
  "container-title": "Lecture Notes in Computer Science",
  "original-title": [],
  "link": [
    {
      "URL": "http://link.springer.com/content/pdf/10.1007/11871842_29.pdf",
      "content-type": "unspecified",
      "content-version": "vor",
      "intended-application": "similarity-checking"
    }
  ],
  "deposited": {
    "date-parts": [
      [
        2021,
        4,
        27
      ]
    ],
    "date-time": "2021-04-27T07:24:09Z",
    "timestamp": 1619508249000
  },
  "score": 1,
  "resource": {
    "primary": {
      "URL": "http://link.springer.com/10.1007/11871842_29"
    }
  },
  "subtitle": [],
  "short-title": [],
  "issued": {
    "date-parts": [
      [
        2006
      ]
    ]
  },
  "ISBN": [
    "9783540453758",
    "9783540460565"
  ],
  "references-count": 14,
  "URL": "http://dx.doi.org/10.1007/11871842_29",
  "relation": {},
  "ISSN": [
    "0302-9743",
    "1611-3349"
  ],
  "subject": [],
  "published": {
    "date-parts": [
      [
        2006
      ]
    ]
  }
}