Authors: Muhannad Quwaider, Mosab Alfaqeeh
Venue: 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW)
Publication date: 2016-08-01
DOI: 10.1109/W-FiCloud.2016.56 (https://doi.org/10.1109/W-FiCloud.2016.56)
Social Networks Benchmark Dataset for Diseases Classification
Social network analysis has become an important field of research that studies users' data and their contributions on social media. The goal of this study is to build relations between people in the disease field and to analyze certain knowledge or activities. To accomplish these goals, investigators have become very interested in social network analysis as a way to infer behavior or make predictions from the various data in social networks. Sociologists expect that relationships between people and their real-life styles can be mirrored in social networks. However, manual classification of unstructured data from social networks is almost impossible. Therefore, an automatic classification method is required to structure this data and make it more convenient and accessible. In this paper we study disease-related data from Facebook pages. The data are associated with categories of popular diseases such as Ebola, malaria, and HIV/AIDS. We address classification as a supervised learning task and create an innovative dataset named Benchmark Dataset for Diseases Classification (BDDC). BDDC is a well-documented dataset whose file formats are compatible with recognized text mining tools, so that other researchers can use it in comparative experiments. Three commonly used classifiers are applied to two versions of BDDC. The performance results show that the version of BDDC with the stemmer performs better than the version without it, because of stop-word filtering and the Porter stemmer.
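The preprocessing and supervised classification steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the three classifiers, so a small multinomial Naive Bayes stands in for one of them; `crude_stem` is a simple suffix stripper and not the Porter algorithm; the stop-word list and the three training posts are invented toy data, not part of BDDC.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption; the paper's list is not given).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "in", "and", "to", "for", "on"}


def crude_stem(token):
    """Strip a few common suffixes. A stand-in for Porter stemming, not Porter itself."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, use_stemmer=True):
    """Lowercase, tokenize, drop stop words, and optionally stem."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]
    return [crude_stem(t) for t in tokens] if use_stemmer else tokens


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (hypothetical stand-in classifier)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for t in tokens:
                if t in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training posts, invented for this sketch (not taken from BDDC).
TRAIN = [
    ("Ebola outbreak reported in two villages", "Ebola"),
    ("Mosquito nets are preventing malaria infections", "Malaria"),
    ("HIV testing and antiretroviral treatment", "HIV/AIDS"),
]

model = NaiveBayes().fit([preprocess(text) for text, _ in TRAIN],
                         [label for _, label in TRAIN])
query = "Nets prevented infection from mosquitoes"
print(preprocess(query), model.predict(preprocess(query)))
```

In this toy example, stemming maps "mosquitoes", "prevented", and "nets" onto forms that also occur in the training post, whereas without stemming only "nets" would match. This only illustrates how stemming can increase the overlap between related posts; it does not reproduce the experiments reported in the paper.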