BERT Model for Classification of Fake News using the Cloud Processing Capacity
Athiya Marium, G. Mamatha
2021 IEEE 9th Region 10 Humanitarian Technology Conference (R10-HTC), 30 September 2021
DOI: 10.1109/R10-HTC53172.2021.9641632
This paper conducts a predictive analysis of news articles to determine whether they are fake or real. After an extensive survey of the topic, various machine learning and deep learning models for evaluating news articles were identified. A transfer learning model, Bidirectional Encoder Representations from Transformers (BERT), is evaluated for this detection task using Google Cloud GPU capacity. The first step is to pre-process the data to remove garbage and missing values. Next, all the collected news articles are tokenized with the BERT tokenizer, and the tokenized corpus is converted into tensors for training. The data is trained in batches of 32 articles each, and the final classification head consists of a five-layer neural network. The model checkpoint with the lowest validation loss is then tested for accuracy, and predictions on news articles are made with that model. The paper also explores the best cloud platform on which to host such a model, as well as the performance of the hosted model.
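The pipeline the abstract outlines — clean the raw articles, tokenize, group into batches of 32, and keep the checkpoint with the lowest validation loss — can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the tokenization and training steps are stand-ins, and a real implementation would use the BERT tokenizer and a deep-learning framework (e.g. the HuggingFace `transformers` library with PyTorch or TensorFlow).

```python
# Hypothetical sketch of the pipeline stages described in the abstract.
# Real training would replace these stand-ins with the BERT tokenizer,
# tensor conversion, and a five-layer classification head.

def preprocess(articles):
    """Cleaning step: drop missing/empty entries and strip whitespace."""
    return [a.strip() for a in articles if a and a.strip()]

def make_batches(corpus, batch_size=32):
    """Group the corpus into training batches of 32 articles, as in the paper."""
    return [corpus[i:i + batch_size] for i in range(0, len(corpus), batch_size)]

def select_best(checkpoints):
    """Keep the checkpoint with the least validation loss for final testing."""
    return min(checkpoints, key=lambda c: c["val_loss"])

# Toy data standing in for the collected news articles.
articles = ["Real news text.", "", "  ", "Fake news text."] * 20
clean = preprocess(articles)              # 40 non-empty articles remain
batches = make_batches(clean)             # 2 batches: 32 + 8 articles
best = select_best([
    {"epoch": 1, "val_loss": 0.41},
    {"epoch": 2, "val_loss": 0.35},
])
print(len(clean), len(batches), best["epoch"])  # → 40 2 2
```

The batch-of-32 grouping and lowest-validation-loss selection are taken directly from the abstract; everything else here is a placeholder for the framework-specific code.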