Reinforcement Learning: the way machines learn
A hacking day of Mathematics for Data Science at the Department of Mathematics
Bio
I am a Professor in Mathematics and I teach Mathematics and Machine Learning at the University of Chieti-Pescara. My main research topic is Differential Geometry, but when AlphaGo first appeared I was fascinated by the potential of Artificial Intelligence. I have delved deeply into the theory of Neural Networks and Reinforcement Learning, and started working on many research projects applying AI to games, computer science and biology. I am deeply convinced that Artificial Intelligence will mark our (very) near future, and for this reason I am passionately dedicated to the dissemination of its fundamental principles and basic techniques.
Syllabus
Reinforcement Learning (RL) is a machine learning technique in which an agent learns to solve a decision problem by performing actions and assessing their results. In 2017, MIT Technology Review listed RL among its 10 Breakthrough Technologies. We will study the fundamentals of RL and sketch the latest methods used to solve a variety of complex tasks, from gaming to computer science, finance, and robotics. This is a 12-hour crash course intended for students with a background in probability. The course comprises theory, applications and assignments. Some experience with Python may prove useful to profit fully from the exercises and assignments.
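As a rough illustration of the "perform actions and assess their results" loop described above, here is a minimal Python sketch of tabular Q-learning on a toy five-cell corridor. It is not taken from the course material: the environment, the helper names (`step`, `train`, `greedy_policy`) and the parameter values are all assumptions chosen for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Assess the result: move the estimate toward the one-step target.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal cell according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent is never told that the goal lies to the right. It only observes the reward after each move, and the learned values end up favouring the "right" action in every cell.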
References
We will follow the textbook Reinforcement Learning: An Introduction, second edition, by Richard S. Sutton and Andrew G. Barto, and the video lectures by David Silver (DeepMind).
Schedule
- Wednesday 27 January 2021 @ 14.00-17.00
- Friday 29 January 2021 @ 14.00-17.00
- Wednesday 03 February 2021 @ 14.00-17.00
- Friday 05 February 2021 @ 14.00-17.00
Details
- Participation is free; however, notification by email to Prof. Luigi Amedeo Bianchi is mandatory
- For further information, please contact Prof. Luigi Amedeo Bianchi
- Venue: Webinar; credentials will be sent to participants the day before the event
- Language: English
Material (Restricted access, user: RL2021)
Further References
- R.S. Sutton and A.G. Barto (2020) Reinforcement Learning: An Introduction, second edition
- spinningup.openai: an educational resource produced by OpenAI that makes it easier to learn about deep reinforcement learning (deep RL)
- Algorithms and fundamentals of convergence proofs
- P. Dayan (1992) The Convergence of TD(\(\lambda\)) for General \(\lambda\), Machine Learning, 8, 341-362
- C.J.C.H. Watkins and P. Dayan (1992) Technical Note: Q-Learning, Machine Learning, 8, 279-292
- P. Dayan and T.J. Sejnowski (1994) TD(\(\lambda\)) Converges with Probability 1, Machine Learning, 14, 295-301
- R.S. Sutton (1988) Learning to Predict by the Methods of Temporal Differences, Machine Learning, 3, 9-44
- S. Singh, T. Jaakkola, M.L. Littman and C. Szepesvári (2000) Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms, Machine Learning, 38, 287-308
- J.N. Tsitsiklis (2002) On the Convergence of Optimistic Policy Iteration, Journal of Machine Learning Research, 3, 59-72 (annotation in xopp)
- Environments
- M. Hoffman (2020) Acme: A Research Framework for Distributed Reinforcement Learning, DeepMind
- Survey
- C. Szepesvári (2010) Algorithms for Reinforcement Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers
- C.B. Browne, D. Whitehouse, P.I. Cowling and S. Samothrakis (2012) A Survey of Monte Carlo Tree Search Methods, IEEE Transactions on Computational Intelligence and AI in Games, 4(1)
- A. Slivkins (2019) Introduction to Multi-Armed Bandits, arXiv:1904.07272