Authors: Praphan Klairith, Sansiri Tanachutiwat
Published in: 2018 International Conference on Engineering, Applied Sciences, and Technology (ICEAST), 2018-07-01
DOI: 10.1109/ICEAST.2018.8434447 (https://doi.org/10.1109/ICEAST.2018.8434447)
Thai Clickbait Detection Algorithms Using Natural Language Processing with Machine Learning Techniques
This paper proposes a machine-learning approach for detecting Thai clickbait. Clickbait headlines typically use eye-catching wording while lacking information about the content, in order to attract visitors. We contribute a clickbait corpus built by crowdsourcing, from which 30,000 headlines were selected to form the dataset. We develop clickbait detection models using two types of features in the embedding layer and three different networks in the hidden layer. A BiLSTM with word-level embeddings performs very well, achieving an accuracy of 0.98 and an F1-score of 0.98.
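For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and F1-score are commonly computed for a binary clickbait classifier. It assumes the standard definitions; the abstract does not state the exact evaluation protocol. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumes the standard definitions; not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive (clickbait) class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels invented for illustration only (1 = clickbait, 0 = not clickbait).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy and F1 can diverge when the classes are imbalanced, so seeing both reported at the same value suggests, but does not prove, that the model handles both classes comparably well.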