Please use this identifier to cite or link to this item: http://dx.doi.org/10.14279/depositonce-10532
Main Title: Dari Dataset for Coreference Resolution
Author(s): Zia, Ghezal Ahmad Jan
Other Contributor(s): Amini, Fazel
Type: Generic Research Data
References: 10.14279/depositonce-10447
10.14279/depositonce-10413
10.14279/depositonce-10437
http://dx.doi.org/10.14279/depositonce-10420
Language Code: en
Abstract: DariCoref is a Dari corpus annotated for anaphoric relations; all documents were collected from Dari VOA and Azadi Radio. The annotation scheme follows OntoNotes and WikiCoref. Each markable is annotated with a coreference type (Identical, Attributive, or Copular) and a mention type (Named Entity, Noun Phrase, or Pronominal). Because annotation efforts so far have concentrated on very specific types of written text, mainly newswire, resources for Dari texts are lacking. We therefore present a freely available resource devised for developing coreference resolution algorithms for Dari texts. The annotation was carried out with the MMAX2 tool.
URI: https://depositonce.tu-berlin.de/handle/11303/11644
http://dx.doi.org/10.14279/depositonce-10532
Issue Date: 8-Sep-2020
Date Available: 14-Sep-2020
DDC Class: 000 informatics, information science, general works
Subject(s): DariCoref
Dari NLP Resources
Dari Coreference Resolution Dataset
License: https://choosealicense.com/licenses/gpl-3.0/
Appears in Collections: FG Modelle und Theorie Verteilter Systeme » Research Data

Files in This Item:
DariCoref.zip

DariCoref is a Dari corpus annotated for anaphoric relations; all documents were collected from Azadi Radio and Dari VOA. The annotation scheme follows OntoNotes and WikiCoref.

Format: ZIP Archive | Size: 7.44 MB
Description.txt
Format: Text | Size: 672 B

Items in DepositOnce are protected by copyright, with all rights reserved, unless otherwise indicated.