RUTCOR Colloquia - November 2, 2006

Home | Seminars | Colloquia | Brown Bag | Contact Info

	RUTCOR Colloquia - November 2, 2006 Speaker: Uri Rothblum Affiliation: Technion, Israel Institute of Technology Title: Risk-sensitive and risk-neutral multi-armed bandits. Time: 2:00 - 3:00 PM Location: RUTCOR Building - Room 139, Rutgers University, Busch Campus, Piscataway, NJ Abstract: The classic result for the multi-armed bandit is that each state of each bandit (Markov chain with rewards) has an index that depends only on the data for its bandit, and expected discounted income is maximized by playing at each epoch a bandit whose current state has the largest index. We extend this result to the cases of risk-averse and risk-seeking exponential utility functions, while generalizing it somewhat for the risk-neutral case. Our approach is constructive. It uses domination in place of indices. A simple recursion assigns to each state of each bandit a reward and an amplification that depend solely on the data for its bandit. These rewards and amplifications determine a ranking of the states, and we show that it is optimal to play at each epoch any bandit whose state has the highest ranking. The algorithm we develop is simpler than previous algorithms that determine indices. (Joint work with Eric V. Denardo, Haechurl Park) Back to Seminars Page. Back to RUTCOR homepage.
BACK TO TOP

For questions or comments about this site, contact webmaster@rutcor.rutgers.edu Last updated: 03/07/06 © 2004 Rutgers, The State University of New Jersey. All rights reserved.