XAI and Strategy Extraction via Reward Redistribution

Activity: Talk or presentation, Contributed talk, science-to-science

Description

Assigning credit for a received reward to previously performed actions is one of the central tasks in reinforcement learning. Credit assignment often uses world models, in either a forward or a backward view. In a forward view, the future return is estimated by replacing the environment with a model or by rolling out sequences until the end of the episode. A backward view either learns a backward model or performs a backward analysis of a forward model that predicts the return of an episode. Our method RUDDER performs a backward analysis to construct a reward redistribution that credits those actions that caused a reward. Its extension Align-RUDDER learns a reward redistribution from only a few demonstrations. An optimal reward redistribution has zero expected future reward and therefore immediately credits each action for everything it will cause. XAI also aims at credit assignment when it asks what caused a model to produce a particular output for a given input. Going further, XAI wants to know how and why a policy solved a task, why an agent is better than humans, and why a decision was made. Humans best comprehend an agent's strategy if all its actions are evaluated immediately and have no hidden consequences in the future. Reward redistributions learned by RUDDER and Align-RUDDER help to understand the task-solving strategies of both humans and machines.
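The idea of redistributing a delayed reward can be illustrated with a toy sketch. It assumes that a return predictor has already been trained and that the redistributed reward at each step is the difference between consecutive return predictions. The function name `redistribute_reward` and the prediction values are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
import numpy as np

def redistribute_reward(return_predictions):
    """Turn per-step return predictions into per-step redistributed rewards.

    Assumption: the redistributed reward at step t is the change in the
    predicted episode return between step t-1 and step t.
    """
    g = np.asarray(return_predictions, dtype=float)
    # Prepend 0 so that the first step is credited with its full prediction.
    return np.diff(g, prepend=0.0)

# Hypothetical episode: the environment pays a reward of 1 only at the end,
# but the (made-up) predictor already expects it after the action at step 2.
predictions = [0.0, 0.0, 1.0, 1.0, 1.0]
redistributed = redistribute_reward(predictions)

# The redistributed rewards sum to the final predicted return, so the
# total credit handed out over the episode is unchanged.
print(redistributed)
print(redistributed.sum())
```

In this toy episode the whole credit moves to step 2, the step at which the predicted return changes, and the later steps receive nothing, which matches the description of actions being credited immediately rather than at episode end.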
Period: 18 Jul 2020
Event title: ICML 2020
Event type: Conference
Location: Austria

Fields of science

  • 101031 Approximation theory
  • 102 Computer Sciences
  • 305901 Computer-aided diagnosis and therapy
  • 102033 Data mining
  • 102032 Computational intelligence
  • 101029 Mathematical statistics
  • 102013 Human-computer interaction
  • 305905 Medical informatics
  • 101028 Mathematical modelling
  • 101027 Dynamical systems
  • 101004 Biomathematics
  • 101026 Time series analysis
  • 202017 Embedded systems
  • 101024 Probability theory
  • 305907 Medical statistics
  • 102019 Machine learning
  • 202037 Signal processing
  • 102018 Artificial neural networks
  • 103029 Statistical physics
  • 202036 Sensor systems
  • 202035 Robotics
  • 106005 Bioinformatics
  • 106007 Biostatistics
  • 101019 Stochastics
  • 101018 Statistics
  • 101017 Game theory
  • 101016 Optimisation
  • 102001 Artificial intelligence
  • 101015 Operations research
  • 102004 Bioinformatics
  • 101014 Numerical mathematics
  • 102003 Image processing

JKU Focus areas

  • Digital Transformation