FakeKurdNews---Fake-Kurdish-News-Dataset

Context: Most previous studies have focused on detecting fake news in English, thanks to the open availability of well-known annotated fake news corpora and the variety of fact-checkers around the world, while less-resourced languages such as Kurdish have been left behind. Although Kurdish is spoken by more than 30 million people worldwide, it is still considered less-resourced in the Natural Language Processing (NLP) domain because NLP tools are inaccessible and labeled corpora are scarce or unavailable. This repository hosts a fake news dataset created for a research project at the College of Informatics, Sulaimani Polytechnic University, Iraq.

Full details about data collection, pre-processing, and the classifiers used on this dataset are given in the accompanying paper.

Content: The dataset consists of three sets of news articles, crawled from Facebook pages, written in Kurdish only and covering different subjects. Each article is labeled 0 (fake) or 1 (credible).

The dataset consists of:
- 5000 articles labeled as true
- 5000 articles labeled as false
- 5000 articles automatically modified from the real articles to create fake news

Files in this repository:
- FakeGenerated.csv
- README.md
- fake-5000.csv
- real-5000.csv
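
The label convention described above (0 for fake, 1 for credible) can be checked against the CSV files with a short script. The following is a minimal sketch in Python using only the standard library. The column name `label` is an assumption, since this README does not document the CSV headers; inspect the first line of each file and adjust `label_column` to match.

```python
import csv
from collections import Counter
from pathlib import Path

# Label convention stated in this README: 0 = fake, 1 = credible.
LABEL_NAMES = {"0": "fake", "1": "credible"}


def count_labels(csv_path, label_column="label"):
    """Count rows per label in one CSV file.

    `label_column` is a hypothetical column name; the real header
    may differ, so check the file before relying on the result.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            raw = (row.get(label_column) or "").strip()
            # Anything other than "0" or "1" is reported as "unknown".
            counts[LABEL_NAMES.get(raw, "unknown")] += 1
    return counts


if __name__ == "__main__":
    # File names as listed in this repository.
    for name in ("real-5000.csv", "fake-5000.csv", "FakeGenerated.csv"):
        path = Path(name)
        if path.exists():
            print(name, dict(count_labels(path)))
```

If every row is reported as `unknown`, the assumed column name most likely does not match the file's header, or the label is implied by the file name rather than stored in a column.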

rania-azad/FakeKurdNews---Fake-Kurdish-News-Dataset
