A Gentle but Rigorous Introduction to Reinforcement Learning
DOI:
https://doi.org/10.32870/recibe.v12i1.268
Keywords:
Markov decision process
Abstract
Interaction with the world is one of the main ways in which learning takes place, since it is how we obtain information from the environment and experience cause-and-effect relationships. Learning through interaction is a fundamental idea in many learning theories. In this paper, we address a computational approach to it called Reinforcement Learning (RL), building its mathematical basis and its main solution methods in a progressive and simple way. Finally, we present applications and algorithms that are relevant in industry and research.
References
Bertsekas, D. (2012). Dynamic programming and optimal control: Volume I (Vol. 1). Athena Scientific.
Elahi, E. (2022). Reinforcement learning for budget constrained recommendations. Retrieved January 2023, from https://netflixtechblog.com/reinforcement-learning-for-budget-constrained-recommendations-6cbc5263a32a
Fawzi, A., Balog, M., Huang, A., Hubert, T., Romera-Paredes, B., Barekatain, M., ... others (2022). Discovering faster matrix multiplication algorithms with reinforcement learning. Nature, 610(7930), 47–53.
Luo, J., Paduraru, C., Voicu, O., Chervonyi, Y., Munns, S., Li, J., ... others (2022). Controlling commercial cooling systems using reinforcement learning. arXiv preprint arXiv:2211.07357.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., ... others (2016). Mastering the game of Go with deep neural networks and tree search. Nature, (7587), 484–489.
Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., ... others (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419), 1140–1144.
Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
Tesauro, G. (1995). TD-Gammon: A self-teaching backgammon program. Applications of Neural Networks, 267–285.