A Case Study of Massive API Scrapping: Parler Data Breach After the Capitol Riot
Authors: David Redding, J. Ang, S. Bhunia
Published in: 2022 7th International Conference on Smart and Sustainable Technologies (SpliTech), 2022-07-05
DOI: 10.23919/SpliTech55088.2022.9854293 (https://doi.org/10.23919/SpliTech55088.2022.9854293)
After the United States Capitol Hill Riots, Parler, an open social media platform, was subjected to a massive API scraping that collected 70 terabytes of user data. Although the breach exposed a large amount of confidential personal data, it was not carried out illegally. This paper analyzes the data breach and its impact in depth. The breach was the work of a hacktivist using the alias @donk_enby, who performed a massive API scraping of Parler's servers. The scraping captured metadata from users' public, private, and previously deleted posts that had been uploaded to Parler's servers. Parler had failed to clear the metadata of these posts. The metadata contained names, dates, locations, and other data about the users who posted content to Parler's site. More than 70,000 GPS locations of Parler users were uncovered, including users' private properties. These locations have also been used to tie citizens to the Capitol Riots if they uploaded any content about the riot that day. Forms containing users' government identification, which had been submitted for account verification, were also leaked from Parler's servers. This paper presents background on the events leading up to, including, and following the Capitol Riots. It also examines the hacktivist's methodology for performing the API scraping and discusses possible defensive strategies such as API rate limiting, API request sanitization, and API call authorization.
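Two of the defensive strategies named above, API rate limiting and API call authorization, can be illustrated with a minimal sketch. The code below is not taken from the paper and does not reflect Parler's actual API. `TokenBucket`, `Post`, `can_read`, and `handle_get_post` are hypothetical names, and the capacity and refill rate are arbitrary values chosen only for illustration. It assumes a token-bucket limiter per client and a simple rule that private posts are readable only by their owner and deleted posts by no one.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TokenBucket:
    """Hypothetical per-client limiter: bursts of up to `capacity` requests,
    refilled at `rate` tokens per second."""
    capacity: float
    rate: float
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity  # start with a full bucket
        self.updated = 0.0

    def allow(self, now: float) -> bool:
        """Return True and consume one token if the request may proceed."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


@dataclass
class Post:
    post_id: int
    owner: str
    visibility: str  # assumed values: "public", "private", "deleted"


def can_read(post: Post, requester: Optional[str]) -> bool:
    """Assumed authorization rule, not the paper's: deleted posts are never
    served, private posts only to their owner."""
    if post.visibility == "deleted":
        return False
    if post.visibility == "public":
        return True
    return requester is not None and requester == post.owner


def handle_get_post(post: Post, requester: Optional[str], client_key: str,
                    buckets: Dict[str, TokenBucket], now: float) -> int:
    """Return an HTTP-style status code for a single read request."""
    # Illustrative limits: burst of 5 requests, 1 request per second sustained.
    bucket = buckets.setdefault(client_key, TokenBucket(capacity=5, rate=1.0))
    if not bucket.allow(now):
        return 429  # Too Many Requests
    if not can_read(post, requester):
        return 403  # Forbidden
    return 200
```

In a real service the `now` argument would typically come from `time.monotonic()`, and `client_key` might be an API key or IP address. Passing the clock in explicitly keeps the sketch deterministic and easy to test.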