Please use this identifier to cite or link to this item: http://dx.doi.org/10.14279/depositonce-7113
Main Title: Automatic glossary term extraction from large-scale requirements specifications
Author(s): Gemkow, Tim
Conzelmann, Miro
Hartig, Kerstin
Vogelsang, Andreas
Type: Conference Object
Language Code: en
Abstract: Creating glossaries for large corpora of requirements is an important but expensive task. Glossary term extraction methods often focus on achieving a high recall rate and, therefore, favor linguistic processing for extracting glossary term candidates and neglect the benefits of reducing the number of candidates by statistical filter methods. However, especially for large datasets, a reduction of the likewise large number of candidates may be crucial. This paper demonstrates how to automatically extract relevant domain-specific glossary term candidates from a large body of requirements, the CrowdRE dataset. Our hybrid approach combines linguistic processing and statistical filtering for extracting and reducing glossary term candidates. In a twofold evaluation, we examine the impact of our approach on the quality and quantity of extracted terms. We provide a ground truth for a subset of the requirements and show that a substantial degree of recall can be achieved. Furthermore, we advocate requirements coverage as an additional quality metric to assess the term reduction that results from our statistical filters. Results indicate that with a careful combination of linguistic and statistical extraction methods, a fair balance between later manual efforts and a high recall rate can be achieved.
URI: https://depositonce.tu-berlin.de//handle/11303/7951
http://dx.doi.org/10.14279/depositonce-7113
Issue Date: 2018
Date Available: 19-Jun-2018
DDC Class: 004 Data processing; computer science
Subject(s): requirements engineering
natural language processing
glossary term extraction
Crowd RE
License: http://rightsstatements.org/vocab/InC/1.0/
Proceedings Title: RE 2018. 26th IEEE International Requirements Engineering Conference
Publisher: IEEE
Publisher Place: New York
Appears in Collections: FG IT-basierte Fahrzeuginnovationen » Publications

Files in This Item:
File: 2018_gemkow_et-al.pdf
Size: 619.14 kB
Format: Adobe PDF


Items in DepositOnce are protected by copyright, with all rights reserved, unless otherwise indicated.