Neural sequential transfer learning for relation extraction

Alt, Christoph Benedikt

dc.contributor.advisor	Möller, Sebastian
dc.contributor.author	Alt, Christoph Benedikt
dc.contributor.grantor	Technische Universität Berlin	en
dc.contributor.referee	Möller, Sebastian
dc.contributor.referee	Uszkoreit, Hans
dc.contributor.referee	Akbik, Alan
dc.date.accepted	2020-11-30
dc.date.accessioned	2021-01-20T09:02:45Z
dc.date.available	2021-01-20T09:02:45Z
dc.date.issued	2021
dc.description.abstract	Relation extraction (RE) is concerned with developing methods and models that automatically detect and retrieve relational information from unstructured data. It is crucial to information extraction (IE) applications that aim to leverage the vast amount of knowledge contained in unstructured natural language text, for example, in web pages, online news, and social media, and that simultaneously require the powerful and clean semantics of structured databases instead of searching, querying, and analyzing unstructured text directly. In practical applications, however, relation extraction is often characterized by limited availability of labeled data, due to the cost of annotation or the scarcity of domain-specific resources. In such scenarios, it is difficult to create models that perform well on the task. It is therefore desirable to develop methods that learn more efficiently from limited labeled data and also exhibit better overall relation extraction performance, especially in domains with complex relational structure. In this thesis, I propose to use transfer learning to address this problem, i.e., to reuse knowledge from related tasks to improve models, in particular their performance and their efficiency in learning from limited labeled data. I show how sequential transfer learning, specifically unsupervised language model pre-training, can improve performance and sample efficiency in supervised and distantly supervised relation extraction. In light of these improved modeling abilities, I observe that a better understanding of neural network-based relation extraction methods is crucial to gain insights that further improve their performance. I therefore present an approach to uncover the linguistic features of the input that neural RE models encode and use for relation prediction. I further complement this with a semi-automated analysis approach focused on model errors, datasets, and annotations. 
It effectively highlights controversial examples in the data for manual evaluation and allows error hypotheses to be specified and then verified automatically. Together, the researched approaches allow us to build better-performing, more sample-efficient relation extraction models and advance our understanding of these models despite their complexity. Further, they facilitate more comprehensive analyses of model errors and datasets in the future.	en
dc.description.abstract	Relationsextraktion (RE) befasst sich mit der Entwicklung von Methoden, die relationale Informationen in unstrukturierten Daten automatisch erkennen und abrufen können. Sie ist von entscheidender Bedeutung für Anwendungen der Informationsextraktion (IE), die darauf abzielen große Mengen an Wissen in unstrukturiertem natürlichsprachigem Text, z.B. in Webseiten und sozialen Medien, zu nutzen und gleichzeitig die leistungsfähige und klare Semantik strukturierter Datenbanken benötigen; statt unstrukturierten Text direkt zu durchsuchen, abzufragen und zu analysieren. In der Praxis ist die Anwendung von RE jedoch problematisch: Annotationskosten und Knappheit domänenspezifischer Ressourcen resultieren oft in einer begrenzten Verfügbarkeit von überwachten Daten. In solchen Szenarien ist es schwierig Modelle zu erstellen, die diese Aufgabe effektiv lösen können. Daher ist es wünschenswert Methoden zu entwickeln, die effizienter aus wenigen überwachten Daten lernen und eine bessere RE Gesamtperformanz aufweisen, besonderes in Domänen mit komplexer relationaler Struktur. In dieser Dissertation schlage ich vor hierfür Transferlernen zu verwenden, d.h. erlerntes Wissen aus verwandten Aufgaben wiederzuverwenden um Modelle zu verbessern, speziell ihre Performanz und Effizienz aus wenigen überwachten Daten zu lernen. Ich zeige wie sequentielles Transferlernen, insbesondere unüberwachtes Sprachmodel-Vortraining, die Leistung und Dateneffizienz überwachter und distanzüberwachter RE verbessern kann. Angesichts der verbesserten Modellierungsfähigkeiten ist ein besseres Verständnis der auf neuronalen Netzen basierenden RE Methoden entscheidend um neue Erkenntnisse zu gewinnen, die ihre Leistung weiter verbessern. Hierzu stelle ich einen Ansatz vor um linguistische Merkmale der Eingabetexte aufzudecken, die von Modellen kodiert und für die Relationsvorhersage verwendet werden. 
Des Weiteren ergänze ich dies durch einen halbautomatischen Analyseansatz, der sich auf Modellfehler, Datensätze und Annotationen konzentriert. Zusammen erlauben es die erforschten Ansätze, leistungsfähigere und effizientere RE Modelle zu erstellen und unser Verständnis trotz ihrer Komplexität zu verbessern. Darüber hinaus erleichtert es in Zukunft umfassendere Analysen von Modellfehlern und Datensätzen.	de
dc.identifier.uri	https://depositonce.tu-berlin.de/handle/11303/12278
dc.identifier.uri	http://dx.doi.org/10.14279/depositonce-11154
dc.language.iso	en	en
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en
dc.subject.ddc	004 Datenverarbeitung; Informatik	de
dc.subject.other	information extraction	en
dc.subject.other	relation extraction	en
dc.subject.other	transfer learning	en
dc.subject.other	text mining	en
dc.subject.other	natural language processing	en
dc.subject.other	Informationsextraktion	de
dc.subject.other	Relationsextraktion	de
dc.subject.other	Transferlernen	de
dc.subject.other	Textanalyse	de
dc.subject.other	Sprachverarbeitung	de
dc.title	Neural sequential transfer learning for relation extraction	en
dc.title.translated	Neuronales sequentielles Transferlernen für Relationsextraktion	de
dc.type	Doctoral Thesis	en
dc.type.version	acceptedVersion	en
tub.accessrights.dnb	free	en
tub.affiliation	Fak. 4 Elektrotechnik und Informatik::Inst. Softwaretechnik und Theoretische Informatik::Quality and Usability Lab	de
tub.affiliation.faculty	Fak. 4 Elektrotechnik und Informatik	de
tub.affiliation.group	Quality and Usability Lab	de
tub.affiliation.institute	Inst. Softwaretechnik und Theoretische Informatik	de
tub.publisher.universityorinstitution	Technische Universität Berlin	en

Files

Original bundle

Name: alt_christoph_benedikt.pdf
Size: 2.31 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 4.9 KB
Format: Item-specific license agreed upon to submission

Collections

Publications