Please use this identifier to cite or link to this item:
http://dx.doi.org/10.14279/depositonce-10413
For citation please use:
Main Title: | Dari Dataset for Named Entity Recognition DariNER1 |
Author(s): | Zia, Ghezal Ahmad Jan |
Type: | Textual Data |
References: | http://dx.doi.org/10.14279/depositonce-10437 http://dx.doi.org/10.14279/depositonce-10420 http://dx.doi.org/10.14279/depositonce-10447 http://dx.doi.org/10.14279/depositonce-10532 |
Language Code: | und |
Abstract: | DariNER1 is a collection of data from Dari newswire domains. The dataset is annotated with the IO encoding scheme and covers four types of named entities: Person, Location, Organization, and Miscellaneous. The data follow pure Dari orthography and were collected from Dari VOA news, Azadi Radio, and the Kankor (national university entrance exam) from Higher Education of Afghanistan. |
URI: | https://depositonce.tu-berlin.de/handle/11303/11529 http://dx.doi.org/10.14279/depositonce-10413 |
Issue Date: | 23-Jul-2020 |
Date Available: | 28-Jul-2020 |
DDC Class: | 000 informatics, information science, general works |
Subject(s): | Dari NER Corpus Dari Information Extraction |
License: | https://choosealicense.com/licenses/gpl-3.0/ |
Notes: | File is encoded as UTF-8 and contains Arabic-script characters. |
Appears in Collections: | FG Modelle und Theorie Verteilter Systeme » Research Data |
Files in This Item:
DariNER1.csv
This dataset is designed based on IO encoding and contains four types of named entities: Person, Location, Organization, and Miscellaneous.
Format: CSV | Size: 1.53 MB
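In IO encoding, every token carries either an inside tag for its entity type or an outside tag, so consecutive tokens with the same tag form one entity. The sketch below illustrates that grouping only. The tag strings (`I-PER`, `I-LOC`, `I-ORG`, `I-MISC`, `O`), the function name `io_spans`, and the sample tokens are assumptions made for illustration; the actual labels and column layout of DariNER1.csv are not documented on this page and may differ.

```python
from itertools import groupby


def io_spans(tokens, tags):
    """Merge consecutive tokens that share the same non-O tag into entity spans.

    Tag names such as "I-PER" are hypothetical; adjust them to the labels
    actually used in the file.
    """
    spans = []
    # groupby yields one group per run of identical tags.
    for tag, group in groupby(zip(tokens, tags), key=lambda pair: pair[1]):
        if tag == "O":
            continue  # outside any entity
        label = tag[2:] if tag.startswith("I-") else tag
        spans.append((label, " ".join(token for token, _ in group)))
    return spans


# Made-up example sentence, not taken from the dataset.
tokens = ["Maryam", "Rahimi", "visited", "Kabul", "."]
tags = ["I-PER", "I-PER", "O", "I-LOC", "O"]
print(io_spans(tokens, tags))
```

Because IO encoding has no begin marker, two adjacent entities of the same type cannot be told apart and are merged into one span by this grouping.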
Items in DepositOnce are protected by copyright, with all rights reserved, unless otherwise indicated.