Please use this identifier to cite or link to this item: http://dx.doi.org/10.14279/depositonce-10447
Full metadata record
DC Field | Value | Language
dc.contributor.author | Zia, Ghezal Ahmad Jan | -
dc.date.accessioned | 2020-08-11T06:47:05Z | -
dc.date.available | 2020-08-11T06:47:05Z | -
dc.date.issued | 2020-08-08 | -
dc.identifier.uri | https://depositonce.tu-berlin.de/handle/11303/11562 | -
dc.identifier.uri | http://dx.doi.org/10.14279/depositonce-10447 | -
dc.description | File is encoded as UTF-8 with Arabic characters. | en
dc.description.abstract | DariNER2 is the release of the Dari sentence-level Named Entity annotated dataset, collected from Dari Azadi Radio. The goal of the project was to annotate a corpus comprising various genres of text (news, newsgroups, and interviews) in the Dari language with structural information (syntax). In addition, it was developed to support sentence-level ambiguity in Dari text. It contains 883 sentences and 22K words/tokens. It is manually annotated with the person (PER), location (LOC), organization (ORG), and miscellaneous (MISC) classes. | en
dc.language.iso | und | en
dc.rights.uri | https://choosealicense.com/licenses/gpl-3.0/ | en
dc.subject.ddc | 000 informatics, information science, general works | de
dc.subject.other | Dari Named Entity Recognition Corpus | en
dc.subject.other | Dari NLP Resources | en
dc.title | Dari Dataset for Named Entity Recognition DariNER2 | en
dc.type | Generic Research Data | en
dc.relation.references | http://dx.doi.org/10.14279/depositonce-10413 | -
dc.relation.references | http://dx.doi.org/10.14279/depositonce-10437 | -
dc.relation.references | http://dx.doi.org/10.14279/depositonce-10420 | -
dc.relation.references | http://dx.doi.org/10.14279/depositonce-10532 | en
tub.accessrights.dnb | unknown | *
Appears in Collections: FG Modelle und Theorie Verteilter Systeme » Research Data

Files in This Item:
DariNER2.csv

Sentence-level Dari Named Entity annotated dataset for Named Entity Recognition Task.

Format: CSV | Size: 353.9 kB
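
Because the record states that the file is a CSV encoded as UTF-8 with Arabic-script characters, opening it with an explicit encoding avoids garbled text on systems whose default encoding differs. The following is a minimal sketch only: the record does not describe the column layout, so the code assumes a standard comma-delimited file and returns raw rows without interpreting them. The function name `read_rows` is hypothetical and not part of the dataset.

```python
import csv
from pathlib import Path


def read_rows(path):
    """Return every row of a UTF-8 encoded CSV file as a list of strings.

    Assumption: the file is comma-delimited. The column layout is not
    documented in the record, so rows are returned uninterpreted.
    """
    # newline="" lets the csv module handle line endings itself.
    with Path(path).open(encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle))
```

Assuming the file has been downloaded to the working directory, `read_rows("DariNER2.csv")` would return its rows; inspecting the first few is a reasonable way to learn the actual column layout before any further processing.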

Items in DepositOnce are protected by copyright, with all rights reserved, unless otherwise indicated.