Dari Dataset for Coreference Resolution

dc.contributor.author: Zia, Ghezal Ahmad Jan
dc.contributor.other: Amini, Fazel
dc.date.accessioned: 2020-09-14T06:33:20Z
dc.date.available: 2020-09-14T06:33:20Z
dc.date.issued: 2020-09-08
dc.description.abstract: DariCoref is a Dari corpus annotated for anaphoric relations; all documents were collected from Dari VOA and Azadi Radio. The annotation scheme follows OntoNotes and WikiCoref. Each markable is annotated with a coreference type (Identical, Attributive, or Copular) and a mention type (Named Entity, Noun Phrase, or Pronominal). This is the first such annotation effort for Dari: existing efforts concentrate on very specific types of written text, mainly newswire, so resources for Dari texts are lacking. We therefore present a freely available resource devised for coreference resolution algorithms dedicated to Dari texts. The annotation was carried out with the MMAX2 tool. [en]
dc.identifier.uri: https://depositonce.tu-berlin.de/handle/11303/11644
dc.identifier.uri: http://dx.doi.org/10.14279/depositonce-10532
dc.language.iso: en [en]
dc.relation.references: 10.14279/depositonce-10447
dc.relation.references: 10.14279/depositonce-10413
dc.relation.references: 10.14279/depositonce-10437
dc.relation.references: http://dx.doi.org/10.14279/depositonce-10420
dc.rights.uri: https://choosealicense.com/licenses/gpl-3.0/ [en]
dc.subject.ddc: 000 informatics, information science, general works [de]
dc.subject.other: DariCoref [en]
dc.subject.other: Dari NLP Resources [en]
dc.subject.other: Dari Coreference Resolution Dataset [en]
dc.title: Dari Dataset for Coreference Resolution [en]
dc.type: Textual Data [en]
tub.accessrights.dnb: unknown [*]
tub.affiliation: Fak. 4 Elektrotechnik und Informatik::Inst. Softwaretechnik und Theoretische Informatik::FG Modelle und Theorie Verteilter Systeme [de]
tub.affiliation.faculty: Fak. 4 Elektrotechnik und Informatik [de]
tub.affiliation.group: FG Modelle und Theorie Verteilter Systeme [de]
tub.affiliation.institute: Inst. Softwaretechnik und Theoretische Informatik [de]

Files

Original bundle
Name: DariCoref.zip
Size: 7.27 MB
Format: ZIP archive format
Description: DariCoref, a Dari corpus annotated for anaphoric relations, where all documents are collected from Azadi Radio and Dari VOA. The annotation scheme follows the OntoNotes and WikiCoref.

Name: Description.txt
Size: 672 B
Format: Plain Text

License bundle
Name: license.txt
Size: 2.71 KB
Format: Item-specific license agreed upon to submission
