On the Validity of Benchmarking for Evaluating Code Quality

Harald Gruber (Editor), Reinhold Plösch (Editor), Matthias Saft

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings › peer-review

Abstract

Evaluating source code quality is a laborious task when performed by experts. A number of approaches try to provide an automatic assessment. Absolute quality assessment methods (e.g. based on thresholds) have not yet proven the significance of their results. Software benchmarking is a relative assessment approach based on the general idea of benchmarking in other industries. We developed a benchmarking-oriented code quality assessment method that at least overcomes known technical problems of other benchmark-based methods. Nevertheless, the major concern is to validate the significance of benchmarking-based results by comparing them with the quality assessments of experts. For this purpose, we conducted two studies (one for Java projects, the other for C# projects) involving a total of 10 open source projects and 22 experts. While the first experiment with Java projects yielded a result that motivated us to use the benchmarking-oriented assessment more intensively, the experiment with C# showed that we cannot blindly trust the results of the automatic benchmark assessment.
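To make the contrast in the abstract concrete: where an absolute method compares a metric against a fixed threshold, a benchmarking-oriented assessment ranks a project's metric value against a pool of reference projects. The following is only an illustrative sketch of that idea, not the authors' method; the metric name and all numbers are made up.

```python
# Illustrative sketch (not the method from the paper): rank a project's
# metric value against a benchmark pool of reference projects instead
# of comparing it to a fixed absolute threshold.

def percentile_rank(value, benchmark_values):
    """Fraction of benchmark projects the given value is better than
    (here: a lower metric value is better, e.g. defect density)."""
    worse = sum(1 for v in benchmark_values if v > value)
    return worse / len(benchmark_values)

# Hypothetical defect densities (issues per KLOC) for 8 reference projects.
benchmark = [4.2, 7.9, 3.1, 9.5, 5.0, 6.3, 8.8, 2.7]

# Project under assessment: its quality statement is relative to the pool.
rank = percentile_rank(5.5, benchmark)
print(f"Better than {rank:.0%} of benchmark projects")  # prints "Better than 50% of benchmark projects"
```

The key design point is that the verdict shifts automatically as the benchmark pool evolves, which is exactly why the paper validates such relative results against expert judgment.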
Original language: English
Title of host publication: Proceedings of the joined International Conferences on Software Measurement IWSM/MetriKon/Mensura 2010, Stuttgart, Germany, November 10-12, Shaker Verlag, Aachen, 2010
Place of publication: Aachen
Publisher: Shaker
Number of pages: 10
DOIs
Publication status: Published - Nov 2010

Fields of science

  • 102 Computer Sciences
  • 102009 Computer simulation
  • 102015 Information systems
  • 102026 Virtual reality
