Abstract
At this stage of development of recommender systems (RS), an evaluation of competing approaches (methods) that yield similar performances in reproduced experiments is of crucial importance in order to direct further development toward the most promising direction. These comparisons are usually based on the 10-fold cross-validation scheme. Since the compared performances are often similar to each other, statistical significance testing must be applied in order not to be misled by randomly caused differences in the achieved performances. For the same reason, when reproducing experiments on a different set of experimental data, the most powerful significance test should be applied. In this work we provide guidelines on how to achieve the highest power in the comparison of RS, and we demonstrate them on a comparison of RS performances when different variables are contextualized.
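The abstract's point about testing fold-wise results can be illustrated with a small sketch. The following Python snippet is an illustrative assumption, not the paper's actual procedure: it assumes two recommenders were evaluated on the same 10 folds and applies a paired t-test (with a non-parametric Wilcoxon signed-rank test as an alternative), since exploiting the pairing induced by shared folds generally yields higher power than an unpaired comparison of means; the per-fold scores shown are made-up placeholder values.

```python
# Illustrative sketch (not from the paper): paired significance testing of two
# recommenders evaluated on the SAME 10 cross-validation folds.
from scipy import stats

# Hypothetical per-fold scores (e.g. nDCG) for recommenders A and B;
# the values below are placeholders, not real experimental results.
scores_a = [0.412, 0.395, 0.428, 0.401, 0.433, 0.387, 0.416, 0.402, 0.421, 0.394]
scores_b = [0.405, 0.391, 0.419, 0.398, 0.420, 0.384, 0.409, 0.399, 0.412, 0.390]

# Paired t-test: works on the per-fold differences, which removes fold-to-fold
# variance shared by both systems and thus increases statistical power.
t_stat, p_value = stats.ttest_rel(scores_a, scores_b)
print(f"paired t-test:        t={t_stat:.3f}, p={p_value:.4f}")

# Wilcoxon signed-rank test: a non-parametric alternative when normality
# of the per-fold differences is doubtful.
w_stat, p_wilcoxon = stats.wilcoxon(scores_a, scores_b)
print(f"Wilcoxon signed-rank: W={w_stat:.1f}, p={p_wilcoxon:.4f}")
```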
Original language | English
---|---
Title of host publication | Proceedings of the International Workshop on Reproducibility and Replication in Recommender Systems Evaluation - RepSys ’13, 3–6
Number of pages | 4
Publication status | Published - 2013
Fields of science
- 202002 Audiovisual media
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
- 102015 Information systems
JKU Focus areas
- Computation in Informatics and Mathematics
- Engineering and Natural Sciences (in general)