Anomaly Detection in Election Data and its Representation of U.S. Infrastructure Vulnerability
Jason Green
2021 IEEE 12th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), published 2021-10-27
DOI: https://doi.org/10.1109/iemcon53756.2021.9623111
This paper proposes an approach for investigating election data fraud intended to alter outcomes and assesses whether its implications align with known weaknesses in U.S. infrastructure. It applies supervised and unsupervised machine learning techniques such as Decision Tree, Random Forest, and Isolation Forest to the 2016 U.S. Presidential Election and Polling datasets, and treats any detected anomalies as indicators of potential data fraud. The experiments indicate that roughly 9% of the entries in the polling results dataset are anomalous. Because the polling dataset has no ground truth, the accuracy of this result cannot be determined, and so no link between the detected anomalies and attempted data fraud can be drawn. Further research is needed to examine this link. Even so, the existing body of publications on the dangers of data manipulation, especially to U.S. infrastructure, already points to an alarming vulnerability.
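The abstract names Isolation Forest but gives no implementation details. As a rough illustration of the technique only (not the author's pipeline), the sketch below implements a small isolation forest in pure Python and scores synthetic two-column rows that stand in for polling entries. The function names, the `contamination` threshold, the tree parameters, and all of the data are assumptions made for this example.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop instead of trying another feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x):
    """Depth at which x reaches a leaf, plus the leaf-size adjustment."""
    depth = 0
    while "split" in tree:
        tree = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
        depth += 1
    return depth + c_factor(tree["size"])


def isolation_scores(data, n_trees=100, sample_size=64, seed=0):
    """Anomaly score in (0, 1) per row; higher means easier to isolate."""
    psi = min(sample_size, len(data))
    if psi < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(psi)
    return [2.0 ** (-sum(path_length(t, x) for t in trees) / (n_trees * norm))
            for x in data]


# Synthetic stand-in data (hypothetical): 200 typical rows plus 5 injected outliers.
data_rng = random.Random(42)
rows = [(data_rng.gauss(48, 2), data_rng.gauss(44, 2)) for _ in range(200)]
outliers = [(80.0, 15.0), (15.0, 80.0), (70.0, 5.0), (10.0, 60.0), (30.0, 30.0)]
rows += outliers

scores = isolation_scores(rows)

# Hypothetical threshold: flag the top 5% of rows by score.
contamination = 0.05
k = round(contamination * len(rows))
flagged = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)[:k]
```

In this sketch the fraction of rows flagged is set directly by the hypothetical `contamination` value, so the flagged share says nothing about accuracy on its own. That is consistent with the abstract's point that a detected anomaly rate cannot be evaluated without ground truth.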