IISER Pune
INDIAN INSTITUTE OF SCIENCE EDUCATION AND RESEARCH (IISER) PUNE
where tomorrow’s science begins today
An Autonomous Institution, Ministry of Human Resource Development, Govt. of India
Seminars and Colloquia

Mathematics

Concentration Bounds for Stochastic Approximation with Applications to Reinforcement Learning 
 
Mon, Nov 13, 2017, 11:30 AM to 12:30 PM at Madhava Hall

Dr. Gugan Thoppe
Technion, Israel

Stochastic Approximation (SA) is useful when the aim is to find optimal points, or zeros of a function, given only its noisy estimates. In this talk, we will review our recent advances in techniques for SA analysis. The talk has four major parts. In the first part, we will see a motivating application of SA to network tomography and discuss the convergence of a novel stochastic Kaczmarz method. In the second part, we shall discuss a novel tool based on Alekseev's formula to obtain the rate of convergence of a nonlinear SA to a specific solution, given that there are multiple locally stable solutions. In the third part, we shall extend this tool to the two-timescale, linear SA setting. We shall also discuss how it applies to gradient Temporal Difference (TD) methods such as GTD(0), GTD2, and TDC used in reinforcement learning. For the analyses in the second and third parts to hold, the initial step size must be chosen sufficiently small, depending on unknown problem-dependent parameters. This is often impractical. In the fourth part, we shall discuss a trick to obviate this requirement in the context of the one-timescale, linear TD(0) method. We strongly believe that this trick is generalizable. We also provide here a novel expectation bound. We shall end with some future directions.
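
For readers new to the topic, the basic SA idea of locating a zero from noisy evaluations alone can be illustrated with a minimal sketch of the classical Robbins-Monro iteration. This is not the speaker's method or analysis; the target function f(x) = x - 2, the Gaussian noise model, the 1/n step-size schedule, and all names below are assumptions chosen purely for illustration.

```python
import random


def robbins_monro(noisy_f, x0, num_steps=20000, a0=1.0, seed=0):
    """Classical Robbins-Monro iteration: x_n = x_{n-1} - a_n * Y_n,
    where Y_n is a noisy evaluation of f at x_{n-1} and a_n = a0 / n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = x0
    for n in range(1, num_steps + 1):
        step = a0 / n  # decaying step size (assumed schedule)
        x = x - step * noisy_f(x, rng)
    return x


def noisy_f(x, rng):
    # Hypothetical example: f(x) = x - 2 has its zero at x = 2.
    # Only f(x) plus zero-mean, unit-variance Gaussian noise is observed.
    return (x - 2.0) + rng.gauss(0.0, 1.0)


root = robbins_monro(noisy_f, x0=10.0)
print(f"estimated zero: {root:.3f}")
```

In this toy case the iterate settles near 2 even though no exact evaluation of f is ever available. The initial step size a0 is the kind of parameter whose choice the abstract describes as delicate in the general setting.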
