Main Title: Dari Dataset for Named Entity Recognition DariNER1
Author(s): Zia, Ghezal Ahmad Jan
Type: Textual Data
Abstract: DariNER1 is a collection of data from Dari newswire domains. The dataset is annotated with the IO encoding scheme and covers four types of named entities: Person, Location, Organization, and Miscellaneous. The data follow pure Dari orthographic structure and were collected from Dari VOA news, Azadi Radio, and Kankor (the national university entrance exam) from Higher Education of Afghanistan.
Subject(s): Dari NER Corpus; Dari Information Extraction
Issue Date: 23-Jul-2020
Date Available: 28-Jul-2020
Language Code: und
DDC Class: 000 informatics, information science, general works
Notes: The file is encoded as UTF-8 and contains Arabic-script characters.
TU Affiliation(s): Fak. 4 Elektrotechnik und Informatik » Inst. Softwaretechnik und Theoretische Informatik » FG Modelle und Theorie Verteilter Systeme
Appears in Collections: Technische Universität Berlin » Research Data

Files in This Item:

This dataset is annotated with IO encoding and contains four types of named entities: Person, Location, Organization, and Miscellaneous.

Format: CSV | Size: 1.53 MB
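As an illustration of what IO encoding implies for users of the file, the following minimal Python sketch groups consecutive tokens that carry the same entity tag into entity spans. The tag strings ("PER", "ORG", "LOC", "O"), the function name io_to_spans, and the sample sentence are assumptions made for this example only; they are not taken from the dataset, whose actual column layout and label names should be checked in the CSV itself.

```python
from itertools import groupby

# Hypothetical label for tokens outside any entity; the dataset's
# actual tag strings may differ.
OUTSIDE = "O"


def io_to_spans(tokens, tags):
    """Group consecutive tokens sharing the same non-O tag into spans.

    Returns a list of (tag, start, end, text) tuples, where `end` is
    exclusive. Under IO encoding there is no begin marker, so two
    adjacent entities of the same type are merged into one span.
    """
    spans = []
    index = 0
    for tag, group in groupby(zip(tokens, tags), key=lambda pair: pair[1]):
        words = [token for token, _ in group]
        if tag != OUTSIDE:
            spans.append((tag, index, index + len(words), " ".join(words)))
        index += len(words)
    return spans


# Made-up, transliterated sample sentence (not taken from DariNER1).
tokens = ["Ahmad", "Zia", "visited", "Kabul", "University", "in", "Kabul"]
tags = ["PER", "PER", "O", "ORG", "ORG", "O", "LOC"]

for span in io_to_spans(tokens, tags):
    print(span)
```

Because IO encoding has no begin marker, two neighbouring entities of the same type cannot be told apart and come out as a single span; this is the usual trade-off of IO against schemes such as BIO.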

Items in DepositOnce are protected by copyright, with all rights reserved, unless otherwise indicated.