Reinforcement Learning: the way machines learn
A hacking day of Mathematics for Data Science at the Department of Mathematics

Bio
I am a Professor of Mathematics and I teach Mathematics and Machine Learning at the University of Chieti-Pescara. My main research topic is Differential Geometry, but when AlphaGo first appeared I was fascinated by the potential of Artificial Intelligence. I have delved deeply into the theory of Neural Networks and Reinforcement Learning, and have started working on many research projects applying AI to games, computer science and biology. I am deeply convinced that Artificial Intelligence will mark our (very) near future, and for this reason I am passionately dedicated to the dissemination of its fundamental principles and basic techniques.
Syllabus
Reinforcement Learning (RL) is a machine learning technique in which an agent learns to solve a decision problem by performing actions and assessing their results. MIT Technology Review listed RL among its 10 Breakthrough Technologies of 2017. We will study the fundamentals of RL and sketch the latest methods used to solve a variety of complex tasks, from gaming to computer science, finance, and robotics. This is a 12-hour crash course intended for students with a background in probability. The course comprises theory, applications and assignments. Some experience with Python may prove useful to profit fully from the exercises and assignments.
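To make the idea of "performing actions and assessing their results" concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit, arguably the simplest RL setting. It is an illustration only and not taken from the course material: the function name `run_bandit`, its parameters, and the Bernoulli reward model are all assumptions made for this example.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Bernoulli bandit (hypothetical example)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms      # how many times each action was tried
    values = [0.0] * n_arms    # running estimate of each action's mean reward
    total_reward = 0.0

    for _ in range(steps):
        # Explore a random action with probability epsilon,
        # otherwise exploit the action with the best current estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_arms)
        else:
            action = max(range(n_arms), key=lambda a: values[a])

        # Assumed environment: reward 1 with probability true_means[action].
        reward = 1.0 if rng.random() < true_means[action] else 0.0

        # Assess the result: incremental sample-average update.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward

    return values, counts, total_reward


if __name__ == "__main__":
    values, counts, total = run_bandit([0.1, 0.5, 0.9])
    print("estimated values:", [round(v, 2) for v in values])
    print("pull counts:     ", counts)
    print("total reward:    ", total)
```

The agent never sees `true_means`; it only observes the rewards of the actions it tries, so the estimates in `values` are learned purely from interaction. The parameter `epsilon` controls the trade-off between exploring untried actions and exploiting the best one found so far.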
References
We will follow the textbook Reinforcement Learning: An Introduction, second edition, by Richard S. Sutton and Andrew G. Barto, and the video lectures by David Silver (DeepMind).
Schedule
 Wednesday 27 January 2021 @ 14.00-17.00
 Friday 29 January 2021 @ 14.00-17.00
 Wednesday 03 February 2021 @ 14.00-17.00
 Friday 05 February 2021 @ 14.00-17.00
Details
 Participation is free; however, notification by email to Prof. Luigi Amedeo Bianchi is mandatory
 For further information, please contact Prof. Luigi Amedeo Bianchi
 Venue: Webinar; credentials will be sent to the participants the day before the event
 Language: English
Material (Restricted access, user: RL2021)
Further References
 R.S. Sutton and A.G. Barto (2020) Reinforcement Learning: An Introduction, second edition

 spinningup.openai: an educational resource produced by OpenAI that makes it easier to learn about deep reinforcement learning (deep RL)
 Algorithms and fundamentals of convergence proofs
 P. Dayan (1992) The Convergence of TD($\lambda$) for General $\lambda$, Machine Learning, 8, 341-362
 C.J.C.H. Watkins and P. Dayan (1992) Technical Note: Q-Learning, Machine Learning, 8, 279-292
 P. Dayan and T.J. Sejnowski (1994) TD($\lambda$) Converges with Probability 1, Machine Learning, 14, 295-301
 R.S. Sutton (1988) Learning to Predict by the Methods of Temporal Differences, Machine Learning, 3, 9-44
 S. Singh, T. Jaakkola, M.L. Littman and C. Szepesvári (2000) Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms, Machine Learning, 39, 287-308
 J.N. Tsitsiklis (2002) On the Convergence of Optimistic Policy Iteration, Journal of Machine Learning Research, 3, 59-72 (annotation in xopp)
 Environments
 M. Hoffman (2020) Acme: A Research Framework for Distributed Reinforcement Learning, DeepMind
 Survey
 C. Szepesvári (2010) Algorithms for Reinforcement Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers
 C.B. Browne, D. Whitehouse, P.I. Cowling and S. Samothrakis (2012) A Survey of Monte Carlo Tree Search Methods, IEEE Transactions on Computational Intelligence and AI in Games, 4(1)
 A. Slivkins (2019) Introduction to Multi-Armed Bandits, arXiv:1904.07272