Fatima Zahra Qachfar, Rakesh M. Verma, Arjun Mukherjee
{"title":"Leveraging Synthetic Data and PU Learning For Phishing Email Detection","authors":"Fatima Zahra Qachfar, Rakesh M. Verma, Arjun Mukherjee","doi":"10.1145/3508398.3511524","DOIUrl":"https://doi.org/10.1145/3508398.3511524","url":null,"abstract":"Imbalanced data classification has always been one of the most challenging problems in data science especially in the cybersecurity field, where we observe an out-of-balance proportion between benign and phishing examples in security datasets. Even though there are many phishing detection methods in literature, most of them neglect the imbalanced nature of phishing email datasets. In this paper, we examine the imbalanced property by varying legitimate to phishing class ratios. We generate new synthetic instances using a generative adversarial network model for long sentences (LeakGAN) to balance out the training process and ameliorate its impact on classification. These synthetic instances are labeled by positive-unlabeled learning and added to the initial imbalanced training set. The resulting dataset is given to the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification. We compare several state-of-the-art methods from the literature against our approach, which achieves a high performance throughout all the imbalanced ratios reaching an F1-score of 99.6% for the most extreme imbalanced ratio and an F1-score of 99.8% for balanced cases.","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121320400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Qubit Reset and Refresh: A Gamechanger for Random Number Generation","authors":"Julie Germain, R. Dantu, Mark A. Thompson","doi":"10.1145/3508398.3519364","DOIUrl":"https://doi.org/10.1145/3508398.3519364","url":null,"abstract":"Generation of random binary numbers for cryptographic use is often addressed using pseudorandom number generating functions in compilers and specialized cryptographic packages. Using the IBM's Qiskit reset functionality, we were able to implement a straight-forward in-line Python function that returns a list of quantum-generated random numbers, by creating and executing a circuit on IBM quantum systems. We successfully created a list of 1000 1024-bit binary random numbers as well as a list of 40,000 25-bit binary random numbers for randomness testing, using the NIST Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. The quantum-generated random data we tested showed very strong randomness, according to the NIST suite. Previously, IBM's quantum implementation required a single qubit for each bit of data generated in a circuit, making generation of large random numbers impractical. IBM's addition of the reset instruction eliminates this restriction and allows for the creation of functions that can generate a larger quantity of data-bit output, using only a small number of qubits.","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121549365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recovering Structure of Input of a Binary Program","authors":"Seshagiri Prabhu Narasimha, Arun Lakhotia","doi":"10.1145/3508398.3511508","DOIUrl":"https://doi.org/10.1145/3508398.3511508","url":null,"abstract":"This paper presents an algorithm to automatically infer a recursive state machine (RSM) describing the space of acceptable input of an arbitrary binary program by executing that program with one or more valid inputs. The algorithm automatically identifies atomic fields of fixed and variable lengths and syntactic elements, such as separators and terminators, and generalizes them into regular expression tokens. It constructs an RSM of tokens to represent structures such as arrays and records. Further, it constructs nested states in the RSM to represent complex, nested structures. The RSM may serve as an independent parser for the program's acceptable inputs. A controlled experiment was performed using a prototype implementation of the algorithm and a set of synthetic programs with input formats that mimic characteristics of conventional data formats, such as CSV, PNG, PE file, etc. The experiment demonstrates that the inferred RSMs correctly identify the syntactic elements and their grammatical orderings. When used as generators, the RSMs also produced syntactically correct data for the formats that use terminators to end a sequence of elements, but not so when the format maintains a count of elements for variable length fields instead of a terminator. Experiments with real-world programs produced similar results.","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128351563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Poster Session I","authors":"Hong-yu Hu","doi":"10.1145/3532571","DOIUrl":"https://doi.org/10.1145/3532571","url":null,"abstract":"","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128532158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 5: IoT Security","authors":"Maanak Gupta","doi":"10.1145/3532566","DOIUrl":"https://doi.org/10.1145/3532566","url":null,"abstract":"","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121618801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elizabeth Miller, Md. Rashedur Rahman, Moinul Hossain, Aisha I. Ali-Gombe
{"title":"I Don't Know Why You Need My Data: A Case Study of Popular Social Media Privacy Policies","authors":"Elizabeth Miller, Md. Rashedur Rahman, Moinul Hossain, Aisha I. Ali-Gombe","doi":"10.1145/3508398.3519359","DOIUrl":"https://doi.org/10.1145/3508398.3519359","url":null,"abstract":"Data privacy, a critical human right, is gaining importance as new technologies are developed, and the old ones evolve. In mobile platforms such as Android, data privacy regulations require developers to communicate data access requests using privacy policy statements (PPS). This case study cross-examines the PPS in popular social media (SM) apps --- Facebook and Twitter --- for features of language ambiguity, sensitive data requests, and whether the statements tally with the data requests made in the Manifest file. Subsequently, we conduct a comparative analysis between the PPS of these two apps to examine trends that may constitute a threat to user data privacy.","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132781403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Security Analysis of IoT Frameworks using Static Taint Analysis","authors":"Tuba Yavuz, Christopher Brant","doi":"10.1145/3508398.3511511","DOIUrl":"https://doi.org/10.1145/3508398.3511511","url":null,"abstract":"Internet of Things (IoT) frameworks are designed to facilitate provisioning and secure operation of IoT devices. A typical IoT framework consists of various software layers and components including third-party libraries, communication protocol stacks, the Hardware Abstraction Layer (HAL), the kernel, and the apps. IoT frameworks have implicit data flows in addition to explicit data flows due to their event-driven nature. In this paper, we present a static taint tracking framework, IFLOW, that facilitates the security analysis of system code by enabling specification of data-flow queries that can refer to a variety of software entities. We have formulated various security relevant data-flow queries and solved them using IFLOW to analyze the security of several popular IoT frameworks: Amazon FreeRTOS SDK, SmartThings SDK, and Google IoT SDK. Our results show that IFLOW can both detect real bugs and localize security analysis to the relevant components of IoT frameworks.","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"66 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131563036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How (Not) to Deploy Cryptography on the Internet","authors":"Haya Shulman","doi":"10.1145/3508398.3511270","DOIUrl":"https://doi.org/10.1145/3508398.3511270","url":null,"abstract":"The core protocols in the Internet infrastructure play a central role in delivering packets to their destination. The inter-domain routing with BGP (Border Gateway Protocol) computes the correct paths in the global Internet, and DNS (Domain Name System) looks up the destination addresses. Due to their critical function they are often attacked: the adversaries redirect victims to malicious servers or networks by making them traverse incorrect routes or reach incorrect destinations, e.g., for cyber-espionage, for spam distribution, for theft of crypto-currency, for censorship [1, 4-6]. This results in relatively stealthy attacks which cannot be immediately detected and prevented [2, 3]. By the time the attacks are detected, damage was already done. The frequent attacks along with the devastating damages that they incur, motivates the deployment of cryptographic defences to secure the Internet infrastructure. Multiple efforts are devoted to protecting the core Internet protocols with cryptographic mechanisms, BGP with RPKI and DNS with DNSSEC. Recently the deployment of these defences took off, and many networks and DNS servers in the Internet already adopted them. We review the deployed defences and show that the tradeoffs made by the operators or developers can be exploited to disable the cryptographic defences. We also provide mitigations and discuss challenges in their adoption.","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133034973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 2: Privacy","authors":"M. Fernández","doi":"10.1145/3532563","DOIUrl":"https://doi.org/10.1145/3532563","url":null,"abstract":"","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121804229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Keynote Talk 1","authors":"Rakesh M. Verma","doi":"10.1145/3264869.3286580","DOIUrl":"https://doi.org/10.1145/3264869.3286580","url":null,"abstract":"","PeriodicalId":102306,"journal":{"name":"Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129994200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}