Optimizing for Recall in Automatic Requirements Classification: An Empirical Study
Using machine learning to solve requirements engineering problems can be a tricky task: even though certain algorithms achieve exceptional performance, their recall is usually below 100%. A key aspect of implementing machine learning tools is the balance between recall and precision. Tools that do not find all correct answers may be considered useless. However, some tasks are so complicated that even requirements engineers struggle to solve them perfectly. If a tool achieves performance comparable to that of a trained engineer while considerably reducing their workload, it can be considered useful. One such task is the classification of specification content elements into requirements and non-requirements. In this paper, we analyze this specific requirements classification problem and assess the importance of recall through an empirical study. We compare two groups of students who performed the task with and without tool support, respectively. We use the results to compute an estimate of β for the Fβ score, allowing us to choose the optimal balance between precision and recall, and to assess the practical time savings realized by the approach. By using the tool, users may not find all defects in a document, but they find close to all of them in a fraction of the time otherwise necessary. This demonstrates the practical usefulness of our approach and of machine learning tools in general.
Published in: 27th IEEE International Requirements Engineering Conference (RE'19), Institute of Electrical and Electronics Engineers (IEEE)
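For readers unfamiliar with the Fβ score mentioned in the abstract, the following sketch shows the standard definition: a weighted harmonic mean of precision and recall, where β > 1 weights recall more heavily (the regime the paper argues for). The function name and the example values are illustrative, not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard Fβ score: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta < 1 favors precision;
    beta = 1 gives the familiar F1 score.
    """
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative values only: a recall-heavy weighting (beta = 2)
# rewards a high-recall classifier more than F1 does.
print(f_beta(precision=0.5, recall=1.0, beta=1.0))  # ≈ 0.667
print(f_beta(precision=0.5, recall=1.0, beta=2.0))  # ≈ 0.833
```

Estimating β empirically, as the paper does, amounts to determining how much more costly a missed requirement (lost recall) is than a false positive (lost precision) in the users' actual workflow.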