Authors: H. Al-Ash, W. Wibowo
Published in: 2018 10th International Conference on Information Technology and Electrical Engineering (ICITEE)
Publication date: 2018-07-01
DOI: https://doi.org/10.1109/ICITEED.2018.8534898
Fake News Identification Characteristics Using Named Entity Recognition and Phrase Detection
Because anyone can now publish information, fake news spreads not only through news channels but also through social media and other outlets. Detecting fake news has become an urgent need because fake news spreads unrest in society. Several related studies have addressed news classification, with the aim of deciding whether a news item is fake or original. In that related research, documents are represented as vectors, and the vector representation is then passed to an algorithm for further processing. This study aims to model vectors that capture the characteristics of fake news in the Indonesian language before they are processed further by language algorithms. In this research, fake news and original news are represented according to the vector space model. The vector models combine term frequency, inverse document frequency, and reversed frequency, and are evaluated with 10-fold cross-validation using a support vector machine classifier. Variations of phrase detection and named entity recognition are also used in the vector representation. The vector representation that uses term frequency shows promising performance: it correctly recognizes the characteristics of 96.74% of 2516 documents across the phrase detection and named entity recognition processes.
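The representation steps described in the abstract (phrase detection, a term-frequency vector space model, and a 10-fold split) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the three Indonesian sample sentences are invented, the whitespace tokenizer is deliberately naive, `detect_phrases` uses a simple bigram count threshold in place of whatever phrase detector the paper used, and the helper names are hypothetical. Named entity recognition and the support vector machine itself are left out.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def detect_phrases(token_lists, min_count=2):
    """Return adjacent word pairs seen at least min_count times (assumed rule)."""
    pair_counts = Counter()
    for tokens in token_lists:
        pair_counts.update(zip(tokens, tokens[1:]))
    return {pair for pair, n in pair_counts.items() if n >= min_count}


def merge_phrases(tokens, phrases):
    """Join each detected pair into one token, e.g. 'berita_palsu'."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in phrases:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def term_frequency_vectors(token_lists):
    """Map each document to raw term counts over a shared, sorted vocabulary."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    index = {t: j for j, t in enumerate(vocab)}
    vectors = []
    for tokens in token_lists:
        vec = [0] * len(vocab)
        for t in tokens:
            vec[index[t]] += 1
        vectors.append(vec)
    return vocab, vectors


def k_fold_indices(n_docs, k=10):
    """Yield (train, test) index lists; each document is held out exactly once."""
    for fold in range(k):
        test = list(range(fold, n_docs, k))
        held_out = set(test)
        train = [i for i in range(n_docs) if i not in held_out]
        yield train, test


# Invented sample documents, used only to exercise the helpers above.
docs = [
    "berita palsu menyebar cepat",
    "berita palsu di media sosial",
    "berita asli dari media resmi",
]
token_lists = [tokenize(d) for d in docs]
phrases = detect_phrases(token_lists)
merged = [merge_phrases(tokens, phrases) for tokens in token_lists]
vocab, vectors = term_frequency_vectors(merged)
```

In a full experiment, each `(train, test)` pair from `k_fold_indices` would select the rows of `vectors` used to fit and score a classifier, and the per-fold accuracies would be averaged.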