Reward-based learning and decision making: Experiments & models
Reinforcement learning provides a framework for making agents learn policies from feedback signals (“rewards”) that indicate whether their actions or action sequences were successful. It also provides a framework for understanding how humans learn and decide when given reward information only. Standard reinforcement learning assumes that good decisions, actions, and policies are those that maximize expected reward as a proxy for success. Humans and animals, however, often do not behave this way, and there is ample evidence both for multiple reward-based learning systems and for multiple factors influencing learning and decision making. In my talk I will discuss two additional sources of influence on reward-based learning and decision making in human subjects: multiple prediction errors related to a single learning task, and the interaction between risk and reward. For the latter I will present a new mathematical framework for incorporating risk into reinforcement learning on Markov decision processes, and I will derive a risk-sensitive variant of model-free Q-learning. Possible extensions to the partially observable case will be discussed.
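The abstract does not specify the form of the risk-sensitive Q-learning update, so the following is only an illustrative sketch of one common approach (in the style of Mihatsch and Neuneier): pass the temporal-difference prediction error through an asymmetric utility function, so that a risk-averse agent weights negative prediction errors more heavily than positive ones. On a two-armed bandit where both arms have the same mean payoff but different variance, such an agent learns to value the risky arm below the safe one. The function names, parameter `kappa`, and the bandit setup are assumptions for illustration, not part of the talk.

```python
import random


def risk_utility(delta, kappa):
    # Asymmetric transform of the prediction error (assumed form):
    # kappa in (-1, 1); kappa > 0 downweights positive prediction errors
    # and upweights negative ones, yielding risk-averse value estimates.
    # kappa = 0 recovers standard (risk-neutral) Q-learning.
    return (1 - kappa) * delta if delta > 0 else (1 + kappa) * delta


def run_bandit(kappa, alpha=0.05, pulls=4000, seed=0):
    """Two-armed bandit: arm 0 pays 0.5 always; arm 1 pays 1.0 or 0.0
    with equal probability (same mean reward, higher variance)."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    for _ in range(pulls):
        a = rng.randrange(2)                       # explore uniformly
        r = 0.5 if a == 0 else float(rng.random() < 0.5)
        delta = r - q[a]                           # prediction error (no next state)
        q[a] += alpha * risk_utility(delta, kappa)  # risk-sensitive update
    return q


q_neutral = run_bandit(kappa=0.0)   # values both arms near their mean of 0.5
q_averse = run_bandit(kappa=0.5)    # values the high-variance arm markedly lower
print(q_neutral, q_averse)
```

With this transform, the risky arm's fixed point shifts from its mean reward (0.5) down to (1 - kappa)/2, so a risk-averse agent that picks the higher-valued arm would prefer the safe option even though both arms pay the same on average.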