  • Prof. Benjamin Van Roy (Stanford)

    Sampling Methods that Learn to Optimize
    Date: Mar. 7, 2016.
    Time: 1:00 pm - 2:00 pm.
    Place: Shannon room (Room 54-134 Engr IV).

    Abstract: The information revolution is spawning systems that require very frequent decisions and provide high volumes of data concerning past outcomes. Fueling the design of algorithms used in such systems is a vibrant research area at the intersection of sequential decision-making and machine learning that addresses how to balance between exploration and exploitation in order to efficiently learn over time to make increasingly effective decisions. In this talk, I will formulate a broad family of such problems that greatly extends the classical multi-armed bandit problem by allowing samples of one action to inform the decision-maker's assessment of other actions. I will then describe the rising importance of this problem class and two algorithms: Thompson sampling, which has recently been the focus of much attention in academia and industry, and information-directed sampling, a recent development inspired by a fresh information-theoretic perspective. I will also discuss a variation of Thompson sampling that enables temporally extended (or “deep”) exploration, which leads to exponentially faster learning in complex dynamic environments, and how this substantially improves learning times and ultimate performance achieved by state-of-the-art deep reinforcement learning algorithms across most Atari games.
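    For readers unfamiliar with Thompson sampling, the following is a minimal sketch of its classical form on an independent-arm Bernoulli bandit. It is not the speaker's code and does not cover the extended problem class, information-directed sampling, or deep exploration described in the abstract. The function name `thompson_sampling`, the uniform Beta(1, 1) priors, and the example arm means are assumptions chosen for illustration.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson sampling (illustrative sketch only).

    true_means: hypothetical success probability of each arm, unknown to the learner.
    horizon:    number of sequential decisions to make.
    Returns per-arm success counts, per-arm failure counts, and total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    total_reward = 0

    for _ in range(horizon):
        # Draw one plausible mean per arm from its Beta posterior.
        # Assumption: each arm starts from a uniform Beta(1, 1) prior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Observe a Bernoulli reward from the (simulated) environment.
        reward = 1 if rng.random() < true_means[arm] else 0

        # Update only the chosen arm's posterior counts.
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        total_reward += reward

    return successes, failures, total_reward


if __name__ == "__main__":
    wins, losses, total = thompson_sampling([0.1, 0.5, 0.9], horizon=2000)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

    In this sketch an observation of one arm updates only that arm's posterior. The problem family in the abstract relaxes exactly that restriction by letting samples of one action inform the assessment of others.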

    Short Bio: Benjamin Van Roy is a Professor of Electrical Engineering, Management Science and Engineering, and, by courtesy, Computer Science, at Stanford University. His research focuses on understanding how an agent interacting with a poorly understood environment can learn over time to make effective decisions. He is an INFORMS Fellow and has served on editorial boards and program committees of a number of operations research and machine learning journals and conferences. He has also founded and/or led research programs at several technology companies. He received the SB in Computer Science and Engineering and the SM and PhD in Electrical Engineering and Computer Science, all from MIT.