TY - UNPB
T1 - Teaching drones on-the-fly: Can emotional feedback serve as learning signal for training artificial agents?
AU - Pollak, Manuela
AU - Salfinger, Andrea
AU - Hummel, Karin Anna
PY - 2022/2
Y1 - 2022/2
N2 - We investigate whether naturalistic emotional human feedback can be directly exploited as a reward signal for training artificial agents via interactive human-in-the-loop reinforcement learning. To answer this question, we devise an experimental setting inspired by animal training, in which human test subjects interactively teach an emulated drone agent their desired command-action mapping by providing emotional feedback on the drone's action selections. We present a first empirical proof-of-concept study and analysis confirming that human facial emotion expression can be directly exploited as a reward signal in such interactive learning settings. Thereby, we contribute empirical findings towards more naturalistic and intuitive forms of reinforcement learning especially designed for non-expert users.
AB - We investigate whether naturalistic emotional human feedback can be directly exploited as a reward signal for training artificial agents via interactive human-in-the-loop reinforcement learning. To answer this question, we devise an experimental setting inspired by animal training, in which human test subjects interactively teach an emulated drone agent their desired command-action mapping by providing emotional feedback on the drone's action selections. We present a first empirical proof-of-concept study and analysis confirming that human facial emotion expression can be directly exploited as a reward signal in such interactive learning settings. Thereby, we contribute empirical findings towards more naturalistic and intuitive forms of reinforcement learning especially designed for non-expert users.
UR - https://arxiv.org/abs/2202.09634
U2 - 10.48550/arXiv.2202.09634
DO - 10.48550/arXiv.2202.09634
M3 - Preprint
T3 - arXiv.org
SP - 1
EP - 6
BT - Teaching drones on-the-fly: Can emotional feedback serve as learning signal for training artificial agents?
ER -