Many software traceability techniques have been developed in the past decade, but they suffer from inaccuracy. To address this shortcoming, the software traceability research community seeks to employ benchmarking. Benchmarking will help the community agree on whether improvements to traceability techniques have addressed the challenges it faces. A plethora of evaluation methods have been applied, but there is no consensus on what should be part of a community benchmark. The goals of this paper are to identify recurring problems in the evaluation of traceability techniques, to identify essential properties that evaluation methods should possess to overcome those problems, and to provide guidelines for benchmarking software traceability techniques. We illustrate the properties and guidelines using an empirical evaluation of three software traceability techniques on nine data sets. The proposed benchmarking framework can be broadly applied to domains beyond traceability research.
Shin, Yonghee; Hayes, Jane Huffman; and Cleland-Huang, Jane, "A Framework for Evaluating Traceability Benchmark Metrics" (2012). Technical Reports. 21.