软件产业与工程最新文献_第5页

Minerva: browser API fuzzing with dynamic mod-ref analysis Minerva:浏览器API模糊测试与动态模式引用分析

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549107

Chijin Zhou, Quan Zhang, Mingzhe Wang, Lihua Guo, Jie Liang, Zhe Liu, Mathias Payer, Yuting Jiang

{"title":"Minerva: browser API fuzzing with dynamic mod-ref analysis","authors":"Chijin Zhou, Quan Zhang, Mingzhe Wang, Lihua Guo, Jie Liang, Zhe Liu, Mathias Payer, Yuting Jiang","doi":"10.1145/3540250.3549107","DOIUrl":"https://doi.org/10.1145/3540250.3549107","url":null,"abstract":"Browser APIs are essential to the modern web experience. Due to their large number and complexity, they vastly expand the attack surface of browsers. To detect vulnerabilities in these APIs, fuzzers generate test cases with a large amount of random API invocations. However, the massive search space formed by arbitrary API combinations hinders their effectiveness: since randomly-picked API invocations unlikely interfere with each other (i.e., compute on partially shared data), few interesting API interactions are explored. Consequently, reducing the search space by revealing inter-API relations is a major challenge in browser fuzzing. We propose Minerva, an efficient browser fuzzer for browser API bug detection. The key idea is to leverage API interference relations to reduce redundancy and improve coverage. Minerva consists of two modules: dynamic mod-ref analysis and guided code generation. Before fuzzing starts, the dynamic mod-ref analysis module builds an API interference graph. It first automatically identifies individual browser APIs from the browser’s code base. Next, it instruments the browser to dynamically collect mod-ref relations between APIs. During fuzzing, the guided code generation module synthesizes highly-relevant API invocations guided by the mod-ref relations. We evaluate Minerva on three mainstream browsers, i.e. Safari, FireFox, and Chromium. Compared to state-of-the-art fuzzers, Minerva improves edge coverage by 19.63% to 229.62% and finds 2x to 3x more unique bugs. Besides, Minerva has discovered 35 previously-unknown bugs out of which 20 have been fixed with 5 CVEs assigned and acknowledged by browser vendors.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87123057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Cross-device record and replay for Android apps Android应用程序的跨设备记录和重播

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549083

Cong Li, Yanyan Jiang, Chang Xu

引用次数: 4

Multi-perspective representation learning for source code analytics (invited tutorial) 源代码分析的多角度表示学习(特邀教程)

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3569446

Zhi Jin

{"title":"Multi-perspective representation learning for source code analytics (invited tutorial)","authors":"Zhi Jin","doi":"10.1145/3540250.3569446","DOIUrl":"https://doi.org/10.1145/3540250.3569446","url":null,"abstract":"Programming languages are artificial and highly restricted languages. But source code is there to tell computers as well as programmers what to do, as an act of communication. Despite its weird syntax and is riddled with different delimiters, the good news is that the very large corpus of open-source code is available. That makes it reasonable to apply machine learning techniques to source code to enable the source code analytics. Despite there are plenty of deep learning frameworks in the field of NLP, source code analytics has different features. In addition to the conventional way of coding, understanding the meaning of code involves many perspectives. The source code representation could be the token sequence, the API call sequence, the data dependency graph, and the control flow graph, as well as the program hierarchy, etc. This tutorial will tell the long, ongoing, and fruitful journey on exploiting the potential power of deep learning techniques in source code analytics. It will highlight that how code representation models can be utilized to support software engineers to perform different tasks that require proficient programming knowledge. The exploratory work show that code does imply the learnable knowledge, more precisely the learnable tacit knowledge. Although such knowledge is not easily transferrable between humans, it can be transferred between the automated programming tasks. A vision for future research will be stated for source code analytics.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86629648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Security code smells in apps: are we getting better? 应用程序的安全代码气味:我们正在变得更好吗?

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549091

Steven Arzt

{"title":"Security code smells in apps: are we getting better?","authors":"Steven Arzt","doi":"10.1145/3540250.3549091","DOIUrl":"https://doi.org/10.1145/3540250.3549091","url":null,"abstract":"Users increasingly rely on mobile apps for everyday tasks, including security- and privacy-sensitive tasks such as online banking, e-health, and e-government. Additionally, a wealth of sensors captures the movements and habits of the users for fitness tracking and convenience. Despite legal regulations imposing requirements and limits on the processing of privacy-sensitive data, users must still trust the app developers to apply suffcient protections. In this paper, we investigate the state of security in Android apps and how security-related code smells have evolved since the introduction of the Android operating system. With an analysis of 300 apps per year over 12 years between 2010 and 2021 from the Google Play Store, we find that the number of code scanner findings per thousand lines of code decreases over time. Still, this development is offset by the increase in code size. Apps have more and more findings, suggesting that the overall security level decreases. This trend is driven by flaws in the use of cryptography, insecure compiler flags, insecure uses of WebView components, and insecure uses of language features such as reflection. Based on our data, we argue for stricter controls on apps before admission to the store.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"53 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83504917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Pair programming conversations with agents vs. developers: challenges and opportunities for SE community 与代理和开发人员的结对编程对话:SE社区的挑战和机遇

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549127

Peter Robe, S. Kuttal, J. AuBuchon, Jacob C. Hart

{"title":"Pair programming conversations with agents vs. developers: challenges and opportunities for SE community","authors":"Peter Robe, S. Kuttal, J. AuBuchon, Jacob C. Hart","doi":"10.1145/3540250.3549127","DOIUrl":"https://doi.org/10.1145/3540250.3549127","url":null,"abstract":"Recent research has shown feasibility of an interactive pair-programming conversational agent, but implementing such an agent poses three challenges: a lack of benchmark datasets, absence of software engineering specific labels, and the need to understand developer conversations. To address these challenges, we conducted a Wizard of Oz study with 14 participants pair programming with a simulated agent and collected 4,443 developer-agent utterances. Based on this dataset, we created 26 software engineering labels using an open coding process to develop a hierarchical classification scheme. To understand labeled developer-agent conversations, we compared the accuracy of three state-of-the-art transformer-based language models, BERT, GPT-2, and XLNet, which performed interchangeably. In order to begin creating a developer-agent dataset, researchers and practitioners need to conduct resource intensive Wizard of Oz studies. Presently, there exists vast amounts of developer-developer conversations on video hosting websites. To investigate the feasibility of using developer-developer conversations, we labeled a publicly available developer-developer dataset (3,436 utterances) with our hierarchical classification scheme and found that a BERT model trained on developer-developer data performed ~10% worse than the BERT trained on developer-agent data, but when using transfer-learning, accuracy improved. Finally, our qualitative analysis revealed that developer-developer conversations are more implicit, neutral, and opinionated than developer-agent conversations. Our results have implications for software engineering researchers and practitioners developing conversational agents.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90327735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

GFI-bot: automated good first issue recommendation on GitHub GFI-bot:在GitHub上自动推荐好的第一个问题

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3558922

Hao He, Haonan Su, Wenxin Xiao, Runzhi He, Minghui Zhou

引用次数: 3

SemCluster: a semi-supervised clustering tool for crowdsourced test reports with deep image understanding SemCluster:一个半监督的聚类工具，用于具有深度图像理解的众包测试报告

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3558933

Mingzhe Du, Shengcheng Yu, Chunrong Fang, Tongyu Li, Heyuan Zhang, Zhenyu Chen

{"title":"SemCluster: a semi-supervised clustering tool for crowdsourced test reports with deep image understanding","authors":"Mingzhe Du, Shengcheng Yu, Chunrong Fang, Tongyu Li, Heyuan Zhang, Zhenyu Chen","doi":"10.1145/3540250.3558933","DOIUrl":"https://doi.org/10.1145/3540250.3558933","url":null,"abstract":"Due to the openness of crowdsourced testing, mobile app crowdsourced testing has been subject to duplicate reports. The previous research methods extract the textual features of the crowdsourced test reports, combine with shallow image analysis, and perform unsupervised clustering on the crowdsourced test reports to clarify the duplication of crowdsourced test reports and solve the problem. However, these methods ignore the semantic connection between textual descriptions and screenshots, making the clustering results unsatisfactory and the deduplication effect less accurate. This paper proposes a semi-supervised clustering tool for crowdsourced test reports with deep image understanding, namely SemCluster, which makes the most of the semantic connection between textual descriptions and screenshots by constructing semantic binding rules and performing semi-supervised clustering. SemCluster improves six metrics of clustering results in the experiment compared to the state-of-the-art method, which verifies that SemCluster has achieved a good deduplication effect. The demo can be found at: https://sites.google.com/view/semcluster-demo.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74919479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

PyTER: effective program repair for Python type errors PyTER: Python类型错误的有效程序修复

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549130

Wonseok Oh, Hakjoo Oh

引用次数: 3

Online testing of RESTful APIs: promises and challenges RESTful api的在线测试:承诺与挑战

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549144

Alberto Martin-Lopez, Sergio Segura, Antonio Ruiz-Cortés

{"title":"Online testing of RESTful APIs: promises and challenges","authors":"Alberto Martin-Lopez, Sergio Segura, Antonio Ruiz-Cortés","doi":"10.1145/3540250.3549144","DOIUrl":"https://doi.org/10.1145/3540250.3549144","url":null,"abstract":"Online testing of web APIs—testing APIs in production—is gaining traction in industry. Platforms such as RapidAPI and Sauce Labs provide online testing and monitoring services of web APIs 24/7, typically by re-executing manually designed test cases on the target APIs on a regular basis. In parallel, research on the automated generation of test cases for RESTful APIs has seen significant advances in recent years. However, despite their promising results in the lab, it is unclear whether research tools would scale to industrial-size settings and, more importantly, how they would perform in an online testing setup, increasingly common in practice. In this paper, we report the results of an empirical study on the use of automated test case generation methods for online testing of RESTful APIs. Specifically, we used the RESTest framework to automatically generate and execute test cases in 13 industrial APIs for 15 days non-stop, resulting in over one million test cases. To scale at this level, we had to transition from a monolithic tool approach to a multi-bot architecture with over 200 bots working cooperatively in tasks like test generation and reporting. As a result, we uncovered about 390K failures, which we conservatively triaged into 254 bugs, 65 of which have been acknowledged or fixed by developers to date. Among others, we identified confirmed faults in the APIs of Amadeus, Foursquare, Yelp, and YouTube, accessed by millions of applications worldwide. More importantly, our reports have guided developers on improving their APIs, including bug fixes and documentation updates in the APIs of Amadeus and YouTube. Our results show the potential of online testing of RESTful APIs as the next must-have feature in industry, but also some of the key challenges to overcome for its full adoption in practice.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74352645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 9

Trace analysis based microservice architecture measurement 基于跟踪分析的微服务架构度量

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3558951

Xin Peng, Chenxi Zhang, Zhongyuan Zhao, Akasaka Isami, Xiaofeng Guo, Yunna Cui

{"title":"Trace analysis based microservice architecture measurement","authors":"Xin Peng, Chenxi Zhang, Zhongyuan Zhao, Akasaka Isami, Xiaofeng Guo, Yunna Cui","doi":"10.1145/3540250.3558951","DOIUrl":"https://doi.org/10.1145/3540250.3558951","url":null,"abstract":"Microservice architecture design highly relies on expert experience and may often result in improper service decomposition. Moreover, a microservice architecture is likely to degrade with the continuous evolution of services. Architecture measurement is thus important for the long-term evolution of microservice architectures. Due to the independent and dynamic nature of services, source code analysis based approaches cannot well capture the interactions between services. In this paper, we propose a trace analysis based microservice architecture measurement approach. We define a trace data model for microservice architecture measurement, which enables fine-grained analysis of the execution processes of requests and the interactions between interfaces and services. Based on the data model, we define 14 architectural metrics to measure the service independence and invocation chain complexity of a microservice system. We implement the approach and conduct three case studies with a student course project, an open-source microservice benchmark system, and three industrial microservice systems. The results show that our approach can well characterize the independence and invocation chain complexity of microservice architectures and help developers to identify microservice architecture issues caused by improper service decomposition and architecture degradation.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74563266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2