Title: Kurdish social media sentiment corpus: Misyar marriage perspectives
Author: Sarkhel H. Taher Karim
Journal: Data in Brief (impact factor 1.0; JCR Q3, Multidisciplinary Sciences)
Publication type: Journal Article
Publication date: 2024-10-03
DOI: 10.1016/j.dib.2024.110989
URL: https://www.sciencedirect.com/science/article/pii/S235234092400951X
Citation count: 0
Abstract
This article presents a corpus of 5108 Central Kurdish comments collected from YouTube and Facebook. The dataset was compiled to investigate public perceptions of Misyar marriage, a non-traditional form of marriage, in the Kurdistan region. Comments were gathered from specific public pages on these platforms over a 135-day collection period. The dataset has two columns: sentiments and comments. The sentiments column assigns each comment one of eight sentiment labels: Positive, Negative, Neutral, Sarcastic or Humorous, Suggestive, Dismissive, Skeptical, and Curious. The comments column contains the comment text in Central Kurdish. To improve the quality and uniformity of the data, extensive preprocessing was applied, including noise removal, character replacement, and spacing adjustments.
Researchers interested in sentiment analysis, social media studies, Islamic studies, and Kurdish cultural practices will find the dataset a useful resource. It supports sentiment analysis, trend analysis, and linguistic studies, and it offers insight into the public discourse surrounding Misyar marriage. The labeled data can also aid in the creation of machine learning models and further our knowledge of societal perceptions of emerging religious trends.
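The preprocessing and labeling scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which characters were replaced or what counted as noise, so the specific choices here are assumptions: the mapping of Arabic Kaf and Yeh to the forms commonly used in Central Kurdish text, the treatment of URLs as noise, and the helper names `clean_comment` and `is_valid_row` are all hypothetical and not the author's actual pipeline.

```python
import re

# The eight sentiment labels named in the abstract.
SENTIMENT_LABELS = {
    "Positive", "Negative", "Neutral", "Sarcastic or Humorous",
    "Suggestive", "Dismissive", "Skeptical", "Curious",
}

# ASSUMPTION: the article does not list its character replacements.
# These two mappings are a common normalization for Kurdish text.
CHAR_MAP = str.maketrans({
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
})

# ASSUMPTION: URLs are treated as noise; the article may define noise differently.
URL_RE = re.compile(r"https?://\S+")
SPACE_RE = re.compile(r"\s+")


def clean_comment(text: str) -> str:
    """Remove URLs, normalize characters, and collapse whitespace."""
    text = URL_RE.sub(" ", text)          # noise removal
    text = text.translate(CHAR_MAP)       # character replacement
    return SPACE_RE.sub(" ", text).strip()  # space adjustment


def is_valid_row(sentiment: str, comment: str) -> bool:
    """Check that a row has a known label and a non-empty cleaned comment."""
    return sentiment in SENTIMENT_LABELS and bool(clean_comment(comment))
```

A row from the two-column dataset could be checked with `is_valid_row(row_sentiment, row_comment)` before being passed to a classifier; rows with an unknown label or a comment that is empty after cleaning would be rejected.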
About the journal:
Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.