Dari Dataset for Coreference Resolution
dc.contributor.author | Zia, Ghezal Ahmad Jan | |
dc.contributor.other | Amini, Fazel | |
dc.date.accessioned | 2020-09-14T06:33:20Z | |
dc.date.available | 2020-09-14T06:33:20Z | |
dc.date.issued | 2020-09-08 | |
dc.description.abstract | DariCoref is a Dari corpus annotated for anaphoric relations; all documents were collected from Dari VOA and Azadi Radio. The annotation scheme follows OntoNotes and WikiCoref. Each markable is annotated with a coreference type (Identical, Attributive, or Copular) and a mention type (Named Entity, Noun Phrase, or Pronominal). This is the first such annotation effort for Dari: existing efforts concentrate on very specific types of written text, mainly newswire, so resources for Dari texts are lacking. We therefore present a freely available resource devised for developing coreference resolution algorithms for Dari texts. The annotation was carried out with the MMAX2 tool. | en |
dc.identifier.uri | https://depositonce.tu-berlin.de/handle/11303/11644 | |
dc.identifier.uri | http://dx.doi.org/10.14279/depositonce-10532 | |
dc.language.iso | en | en |
dc.relation.references | 10.14279/depositonce-10447 | |
dc.relation.references | 10.14279/depositonce-10413 | |
dc.relation.references | 10.14279/depositonce-10437 | |
dc.relation.references | http://dx.doi.org/10.14279/depositonce-10420 | |
dc.rights.uri | https://choosealicense.com/licenses/gpl-3.0/ | en |
dc.subject.ddc | 000 informatics, information science, general works | de |
dc.subject.other | DariCoref | en |
dc.subject.other | Dari NLP Resources | en |
dc.subject.other | Dari Coreference Resolution Dataset | en |
dc.title | Dari Dataset for Coreference Resolution | en |
dc.type | Textual Data | en |
tub.accessrights.dnb | unknown | * |
tub.affiliation | Fak. 4 Elektrotechnik und Informatik::Inst. Softwaretechnik und Theoretische Informatik::FG Modelle und Theorie Verteilter Systeme | de |
tub.affiliation.faculty | Fak. 4 Elektrotechnik und Informatik | de |
tub.affiliation.group | FG Modelle und Theorie Verteilter Systeme | de |
tub.affiliation.institute | Inst. Softwaretechnik und Theoretische Informatik | de |
Files
Original bundle
- Name:
- DariCoref.zip
- Size:
- 7.27 MB
- Format:
- ZIP archive format.
- Description:
- DariCoref, a Dari corpus annotated for anaphoric relations; all documents were collected from Azadi Radio and Dari VOA. The annotation scheme follows OntoNotes and WikiCoref.
License bundle
- Name:
- license.txt
- Size:
- 2.71 KB
- Format:
- Item-specific license agreed to upon submission