Euijin Choo, Mohamed Nabeel, Doowon Kim, Ravindu De Silva, Ting Yu, Issa Khalil
{"title":"A Large Scale Study and Classification of VirusTotal Reports on Phishing and Malware URLs","authors":"Euijin Choo, Mohamed Nabeel, Doowon Kim, Ravindu De Silva, Ting Yu, Issa Khalil","doi":"10.1145/3626790","DOIUrl":"https://doi.org/10.1145/3626790","url":null,"abstract":"VirusTotal (VT) is a widely used scanning service for researchers and practitioners to label malicious entities and predict new security threats. Unfortunately, it is little known to the end-users how VT URL scanners decide on the maliciousness of entities and the attack types they are involved in (e.g., phishing or malware-hosting websites). In this paper, we conduct a systematic comparative study on VT URL scanners' behavior for different attack types of malicious URLs, in terms of 1) detection specialties, 2) stability, 3) correlations between scanners, and 4) lead/lag behaviors. Our findings highlight that the VT scanners commonly disagree with each other on their detection and attack type classification, leading to challenges in ascertaining the maliciousness of a URL and taking prompt mitigation actions according to different attack types. This motivates us to present a new highly accurate classifier that helps correctly identify the attack types of malicious URLs at the early stage. This in turn assists practitioners in performing better threat aggregation and choosing proper mitigation actions for different attack types","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"97 ","pages":"1 - 26"},"PeriodicalIF":0.0,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139011791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ding Zhang, P. Santhalingam, P. Pathak, Zizhan Zheng
{"title":"CoBF: Coordinated Beamforming in Dense mmWave Networks","authors":"Ding Zhang, P. Santhalingam, P. Pathak, Zizhan Zheng","doi":"10.1145/3589975","DOIUrl":"https://doi.org/10.1145/3589975","url":null,"abstract":"With MIMO and enhanced beamforming features, IEEE 802.11ay is poised to create the next generation of mmWave WLANs that can provide over 100 Gbps data rate. However, beamforming between densely deployed APs and clients incurs unacceptable overhead. On the other hand, the absence of up-to-date beamforming information restricts the diversity gains available through MIMO and multi-users, reducing the overall network capacity. This paper presents a novel approach of \"coordinated beamforming\" (called CoBF) where only a small subset of APs are selected for beamforming in the 802.11ay mmWave WLANs. Based on the concept of uncertainty, CoBF predicts the APs whose beamforming information is likely outdated and needs updating. The proposed approach complements the existing per-link beamforming solutions and extends their effectiveness from link-level to network-level. Furthermore, CoBF leverages the AP uncertainty to create MU-MIMO groups through interference-aware scheduling in 802.11ay WLANs. With extensive experimentation and simulations, we show that CoBF can significantly reduce beamforming overhead and improve network capacity for 802.11ay WLANs.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127640297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi Liu, Shouqian Shi, Minghao Xie, Heiner Litz, Chen Qian
{"title":"Smash: Flexible, Fast, and Resource-efficient Placement and Lookup of Distributed Storage","authors":"Yi Liu, Shouqian Shi, Minghao Xie, Heiner Litz, Chen Qian","doi":"10.1145/3589977","DOIUrl":"https://doi.org/10.1145/3589977","url":null,"abstract":"Large-scale distributed storage systems, such as object stores, usually apply hashing-based placement and lookup methods to achieve scalability and resource efficiency. However, when object locations are determined by hash values, placement becomes inflexible, failing to optimize or satisfy application requirements such as load balance, failure tolerance, parallelism, and network/system performance. This work presents a novel solution to achieve the best of two worlds: flexibility while maintaining cost-effectiveness and scalability. The proposed method Smash is an object placement and lookup method that achieves full placement flexibility, balanced load, low resource cost, and short latency. Smash utilizes a recent space-efficient data structure and applies it to object-location lookups. We implement Smash as a prototype system and evaluate it in a public cloud. The analysis and experimental results show that Smash achieves full placement flexibility, fast storage operations, fast recovery from node dynamics, and lower DRAM cost (<60%) compared to existing hash-based solutions such as Ceph and MapX.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122800749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haibo Wang, D. Melissourgos, Chaoyi Ma, Shigang Chen
{"title":"Real-time Spread Burst Detection in Data Streaming","authors":"Haibo Wang, D. Melissourgos, Chaoyi Ma, Shigang Chen","doi":"10.1145/3589979","DOIUrl":"https://doi.org/10.1145/3589979","url":null,"abstract":"Data streaming has many applications in network monitoring, web services, e-commerce, stock trading, social networks, and distributed sensing. This paper introduces a new problem of real-time burst detection in flow spread, which differs from the traditional problem of burst detection in flow size. It is practically significant with potential applications in cybersecurity, network engineering, and trend identification on the Internet. It is a challenging problem because estimating flow spread requires us to remember all past data items and detecting bursts in real time requires us to minimize spread estimation overhead, which was not the priority in most prior work. This paper provides the first efficient, real-time solution for spread burst detection. It is designed based on a new real-time super spreader identifier, which outperforms the state of the art in terms of both accuracy and processing overhead. The super spreader identifier is in turn based on a new sketch design for real-time spread estimation, which outperforms the best existing sketches.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132197167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SplitRPC: A {Control + Data} Path Splitting RPC Stack for ML Inference Serving","authors":"Adithya Kumar, A. Sivasubramaniam, T. Zhu","doi":"10.1145/3589974","DOIUrl":"https://doi.org/10.1145/3589974","url":null,"abstract":"The growing adoption of hardware accelerators driven by their intelligent compiler and runtime system counterparts has democratized ML services and precipitously reduced their execution times. This motivates us to shift our attention to efficiently serve these ML services under distributed settings and characterize the overheads imposed by the RPC mechanism ('RPC tax') when serving them on accelerators. The RPC implementations designed over the years implicitly assume the host CPU services the requests, and we focus on expanding such works towards accelerator-based services. While recent proposals calling for SmartNICs to take on this task are reasonable for simple kernels, serving complex ML models requires a more nuanced view to optimize both the data-path and the control/orchestration of these accelerators. We program today's commodity network interface cards (NICs) to split the control and data paths for effective transfer of control while efficiently transferring the payload to the accelerator. As opposed to unified approaches that bundle these paths together, limiting the flexibility in each of these paths, we design and implement SplitRPC - a control + data path optimizing RPC mechanism for ML inference serving. SplitRPC allows us to optimize the datapath to the accelerator while simultaneously allowing the CPU to maintain full orchestration capabilities. We implement SplitRPC on both commodity NICs and SmartNICs and demonstrate how GPU-based ML services running different compiler/runtime systems can benefit. For a variety of ML models served using different inference runtimes, we demonstrate that SplitRPC is effective in minimizing the RPC tax while providing significant gains in throughput and latency over existing kernel by-pass approaches, without requiring expensive SmartNIC devices.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122040162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jia-Jen Lin, T. Ji, Xiangpeng Hao, Hokeun Cha, Yanfang Le, Xiangyao Yu, Aditya Akella
{"title":"Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs","authors":"Jia-Jen Lin, T. Ji, Xiangpeng Hao, Hokeun Cha, Yanfang Le, Xiangyao Yu, Aditya Akella","doi":"10.1145/3589980","DOIUrl":"https://doi.org/10.1145/3589980","url":null,"abstract":"The wide adoption of the emerging SmartNIC technology creates new opportunities to offload application-level computation into the networking layer, which frees the burden of host CPUs, leading to performance improvement. Shuffle, the all-to-all data exchange process, is a critical building block for network communication in distributed data-intensive applications and can potentially benefit from SmartNICs. In this paper, we develop SmartShuffle, which accelerates the data-intensive application's shuffle process by offloading various computation tasks into the SmartNIC devices. SmartShuffle supports offloading both low-level network functions, including data partitioning and network transport, and high-level computation tasks, including filtering, aggregation, and sorting. SmartShuffle adopts a coordinated offload architecture to make sender-side and receiver-side SmartNICs jointly contribute to the benefits of shuffle computation offload. SmartShuffle carefully manages the tight and time-varying computation and memory constraints on the device. We propose a liquid offloading approach, which dynamically migrates operators between the host CPU and the SmartNIC at runtime such that resources in both devices are fully utilized. We prototype SmartShuffle on the Stingray SoC SmartNICs and plug it into Spark. Our evaluation shows that SmartShuffle improves host CPU efficiency and I/O efficiency with lower job completion time. SmartShuffle outperforms Spark, and Spark RDMA by up to 40% on TPC-H.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123640944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H. Maruf, Yuhong Zhong, Hongyi Wang, Mosharaf Chowdhury, Asaf Cidon, Carl A. Waldspurger
{"title":"Memtrade: Marketplace for Disaggregated Memory Clouds","authors":"H. Maruf, Yuhong Zhong, Hongyi Wang, Mosharaf Chowdhury, Asaf Cidon, Carl A. Waldspurger","doi":"10.1145/3589985","DOIUrl":"https://doi.org/10.1145/3589985","url":null,"abstract":"We present Memtrade, the first practical marketplace for disaggregated memory clouds. Clouds introduce a set of unique challenges for resource disaggregation across different tenants, including resource harvesting, isolation, and matching. Memtrade allows producer virtual machines (VMs) to lease both their unallocated memory and allocated-but-idle application memory to remote consumer VMs for a limited period of time. Memtrade does not require any modifications to host-level system software or support from the cloud provider. It harvests producer memory using an application-aware control loop to form a distributed transient remote memory pool with minimal performance impact; it employs a broker to match producers with consumers while satisfying performance constraints; and it exposes the matched memory to consumers through different abstractions. As a proof of concept, we propose two such memory access interfaces for Memtrade consumers -- a transient KV cache for specified applications and a swap interface that is application-transparent. Our evaluation using real-world cluster traces shows that Memtrade provides significant performance benefit for consumers (improving average read latency up to 2.8X) while preserving confidentiality and integrity, with little impact on producer applications (degrading performance by less than 2.1%).","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126013072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"POMACS V7, N2, June 2023 Editorial","authors":"K. Avrachenkov, P. Gill, B. Urgaonkar","doi":"10.1145/3589972","DOIUrl":"https://doi.org/10.1145/3589972","url":null,"abstract":"The Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS) focuses on the measurement and performance evaluation of computer systems and operates in close collaboration with the ACM Special Interest Group SIGMETRICS. All papers in this issue of POMACS will be presented at the ACM SIGMETRICS 2023 conference on June 19-23, 2023, in Orlando, Florida, USA. These papers have been selected during the winter submission round by the 91 members of the ACM SIGMETRICS 2023 program committee via a rigorous review process. Each paper was conditionally accepted (and shepherded), allowed a \"one-shot\" revision (to be resubmitted to one of the subsequent three SIGMETRICS deadlines), or rejected (with re-submission allowed after a year). For this issue, which represents the winter deadline, POMACS is publishing 11 papers out of 130 submissions. All submissions received at least 3 reviews and borderline cases were extensively discussed during the online program committee meeting. Based on the indicated track(s), roughly 28% of the submissions were in the Theory track, 44% were in the Measurement & Applied Modeling track, 43% were in the Systems track, and 22% were in the Learning track. Many individuals contributed to the success of this issue of POMACS. First, we would like to thank the authors, who submitted their best work to SIGMETRICS/POMACS. Second, we would like to thank the program committee members who provided constructive feedback in their reviews to authors and participated in the online discussions and program committee meeting. We also thank the several external reviewers who provided their expert opinion on specific submissions that required additional input. We are also grateful to the SIGMETRICS Board Chair, Giuliano Casale, and to past program committee Chairs, Niklas Carlsson, Edith Cohen, and Philippe Robert, who provided a wealth of information and guidance. Finally, we are grateful to the Organization Committee and to the SIGMETRICS Board for their ongoing efforts and initiatives for creating an exciting program for ACM SIGMETRICS 2023.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114352070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"JS Capsules: A Framework for Capturing Fine-grained JavaScript Memory Measurements for the Mobile Web","authors":"Usama Naseer, Theophilus A. Benson","doi":"10.1145/3579327","DOIUrl":"https://doi.org/10.1145/3579327","url":null,"abstract":"Understanding the resource consumption of the mobile web is an important topic that has garnered much attention in recent years. However, existing works mostly focus on the networking or computational aspects of the mobile web and largely ignore memory, which is an important aspect given the mobile web's reliance on resource-heavy JavaScript. In this paper, we propose a framework, called JS Capsules, for characterizing the memory of JavaScript functions and, using this framework, we investigate the key browser mechanics that contribute to the memory overhead. Leveraging our framework on a testbed of Android mobile phones, we conduct measurements of the Alexa top 1K websites. While most existing frameworks focus on V8 - the JavaScript engine used in most popular browsers - in the context of memory, our measurements show that the memory implications of JavaScript extends far beyond V8 due to the cascading effects that certain JavaScript calls have on the browser's rendering mechanics. We quantify and highlight the direct impact that website DOM have on JavaScript memory overhead and present, to our knowledge, the first root-cause analysis to dissect and characterize their impact on JavaScript memory overheads.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130087289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting and Measuring Security Risks of Hosting-Based Dangling Domains","authors":"Mingming Zhang, Xiang Li, Baojun Liu, Jianyu Lu, Yiming Zhang, Jianjun Chen, Haixin Duan, S. Hao, Xiaofeng Zheng","doi":"10.1145/3579440","DOIUrl":"https://doi.org/10.1145/3579440","url":null,"abstract":"Public hosting services provide convenience for domain owners to build web applications with better scalability and security. However, if a domain name points to released service endpoints (e.g., nameservers allocated by a provider), adversaries can take over the domain by applying the same endpoints. Such a security threat is called \"hosting-based domain takeover''. In recent years, a large number of domain takeover incidents have occurred; even well-known websites like the subdomains of microsoft.com have been impacted. However, until now, there has been no effective detection system to identify these vulnerable domains on a large scale. In this paper, we fill this research gap by presenting a novel framework, HostingChecker, for detecting domain takeovers. Compared with previous work, HostingChecker expands the detection scope and improves the detection efficiency by: (i) systematically identifying vulnerable hosting services using a semi-automated method; and (ii) effectively detecting vulnerable domains through passive reconstruction of domain dependency chains. The framework enables us to detect the subdomains of Tranco sites on a daily basis. We evaluate the effectiveness of HostingChecker and eventually detect 10,351 subdomains from Tranco Top-1M apex domains vulnerable to domain takeover, which are over 8× more than previous findings. Furthermore, we conduct an in-depth security analysis on the affected vendors, like Amazon and Alibaba, and gain a suite of new insights, including flawed implementation of domain ownership validation. Following responsible disclosure processes, we have reported issues to the security response centers of affected vendors, and some (e.g., Baidu and Tencent) have adopted our mitigation.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131255681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}