A Review of Possible Effects of Cognitive Biases on Interpretation of Rule-based Machine Learning Models

Tomáš Kliegr, Stepan Bahnik, Johannes Fürnkranz

Research output: Contribution to journal, Article, peer-review

Abstract

While the interpretability of machine learning models is often equated with their mere syntactic comprehensibility, we think that interpretability goes beyond that, and that human interpretability should also be investigated from the point of view of cognitive science. In particular, the goal of this paper is to discuss to what extent cognitive biases may affect human understanding of interpretable machine learning models, in particular of logical rules discovered from data. Twenty cognitive biases are covered, as are possible debiasing techniques that can be adopted by designers of machine learning algorithms and software. Our review transfers results obtained in cognitive psychology to the domain of machine learning, aiming to bridge the current gap between these two areas. It needs to be followed by empirical studies specifically focused on the machine learning domain.
Original language: English
Article number: 103458
Pages (from-to): 103458
Number of pages: 33
Journal: Artificial Intelligence
Volume: 295
DOIs
Publication status: Published - 2021

Fields of science

  • 102001 Artificial intelligence
  • 102019 Machine learning
  • 102033 Data mining
  • 501011 Cognitive psychology
  • 501030 Cognitive science

JKU Focus areas

  • Digital Transformation
