N-gram based approach to recognize the twitter accounts of Turkish daily newspapers
Authors: İslam Mayda, Mirsat Yesiltepe
Published in: 2017 International Artificial Intelligence and Data Processing Symposium (IDAP), 2017-09-01
DOI: 10.1109/IDAP.2017.8090209
Twitter is one of the most popular social media networks in the world. It is widely used by companies and media organizations as well as by individual users. Media organizations use Twitter to announce news. Although the language of these news posts is formal, the words each organization prefers when sharing information differ. In this study, we propose an approach to recognize the Twitter accounts of Turkish daily newspapers. Our approach uses character 3-grams and word 2-grams to represent the texts numerically. To classify the texts, we ran experiments with several classifiers and found that Sequential Minimal Optimization (SMO) outperformed the other algorithms. We carried out the experiments on a real dataset of Twitter accounts of Turkish daily newspapers and classified them with an accuracy of more than 98%.
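The abstract names character 3-grams and word 2-grams as the text representation but gives no implementation details. The following is a minimal sketch of how such features could be extracted, not the authors' code. The function names, the whitespace tokenization, the absence of any text normalization, and the sample phrase are all assumptions made for illustration. The resulting counts would then be passed to a classifier such as an SVM trained with SMO, which is not shown here.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text: str, n: int = 2) -> list[tuple[str, ...]]:
    """Return all overlapping word n-grams, splitting on whitespace.

    Whitespace splitting is an assumption; the abstract does not
    describe the tokenizer that was actually used.
    """
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_features(text: str) -> Counter:
    """Count character 3-grams and word 2-grams in one feature bag.

    Keys are tagged with "c3" or "w2" so that a character 3-gram and
    a word 2-gram can never collide.
    """
    features: Counter = Counter()
    features.update(("c3", gram) for gram in char_ngrams(text, 3))
    features.update(("w2", gram) for gram in word_ngrams(text, 2))
    return features


if __name__ == "__main__":
    # Hypothetical tweet text, used only to show the output shape.
    for key, count in ngram_features("son dakika haberi").items():
        print(key, count)
```

Texts shorter than the n-gram length simply yield no n-grams of that kind. Case folding is left out on purpose: Python's default `str.lower()` does not follow Turkish casing rules for dotted and dotless i, so any normalization step would need to handle that separately.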