A gentle but rigorous introduction to reinforcement learning (Una introducción amable pero rigurosa al aprendizaje por refuerzo)

Mauro Alejandro Montenegro Meza; Rolando Menchaca Méndez; Ricardo Menchaca Méndez

doi:10.32870/recibe.v12i1.268

Authors

Mauro Alejandro Montenegro Meza, Centro de Investigación en Computación, https://orcid.org/0000-0002-4763-0718
Rolando Menchaca Méndez, Centro de Investigación en Computación del IPN, https://orcid.org/0000-0001-6733-9445
Ricardo Menchaca Méndez, Centro de Investigación en Computación del IPN

DOI:

https://doi.org/10.32870/recibe.v12i1.268

Keywords:

Markov decision process

Abstract

Interaction with the world is one of the main ways in which learning occurs: it is how we obtain information from the environment and experience cause-and-effect relationships. Learning through interaction is a foundational idea in many learning theories. This paper addresses a computational approach to it, Reinforcement Learning (RL), and builds up its mathematical foundations and main solution methods in a progressive and simple way. Finally, it presents applications and algorithms that are relevant to industry and research.

