软件产业与工程最新文献_第10页

Performing large-scale mining studies: from start to finish (tutorial) 执行大规模采矿研究:从头到尾(教程)

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3569448

Robert Dyer, Samuel W. Flint

引用次数: 0

This is your cue! assisting search behaviour with resource style properties 这是你的提示!使用资源样式属性协助搜索行为

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3558909

Deeksha M. Arya

引用次数: 0

An empirical study of log analysis at Microsoft 微软日志分析的实证研究

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3558963

Shilin He, Xu Zhang, Pinjia He, Yong Xu, Liqun Li, Yu Kang, Minghua Ma, Yining Wei, Yingnong Dang, S. Rajmohan, Qingwei Lin

{"title":"An empirical study of log analysis at Microsoft","authors":"Shilin He, Xu Zhang, Pinjia He, Yong Xu, Liqun Li, Yu Kang, Minghua Ma, Yining Wei, Yingnong Dang, S. Rajmohan, Qingwei Lin","doi":"10.1145/3540250.3558963","DOIUrl":"https://doi.org/10.1145/3540250.3558963","url":null,"abstract":"Logs are crucial to the management and maintenance of software systems. In recent years, log analysis research has achieved notable progress on various topics such as log parsing and log-based anomaly detection. However, the real voices from front-line practitioners are seldom heard. For example, what are the pain points of log analysis in practice? In this work, we conduct a comprehensive survey study on log analysis at Microsoft. We collected feedback from 105 employees through a questionnaire of 13 questions and individual interviews with 12 employees. We summarize the format, scenario, method, tool, and pain points of log analysis. Additionally, by comparing the industrial practices with academic research, we discuss the gaps between academia and industry, and future opportunities on log analysis with four inspiring findings. Particularly, we observe a huge gap exists between log anomaly detection research and failure alerting practices regarding the goal, technique, efficiency, etc. Moreover, data-driven log parsing, which has been widely studied in recent research, can be alternatively achieved by simply logging template IDs during software development. We hope this paper could uncover the real needs of industrial practitioners and the unnoticed yet significant gap between industry and academia, and inspire interesting future directions that converge efforts from both sides.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73281592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8

Static executes-before analysis for event driven programs 静态执行-在分析事件驱动程序之前

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549116

Rekha R. Pai, Abhishek Uppar, Akshatha Shenoy, Pranshul Kushwaha, D. D'Souza

引用次数: 0

On the vulnerability proneness of multilingual code 多语言代码的脆弱性研究

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549173

Wen Li, Li Li, Haipeng Cai

引用次数: 8

SPINE: a scalable log parser with feedback guidance SPINE:带有反馈指导的可伸缩日志解析器

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549176

Xuheng Wang, Xu Zhang, Liqun Li, Shilin He, Hongyu Zhang, Yudong Liu, Ling Zheng, Yu Kang, Qingwei Lin, Yingnong Dang, S. Rajmohan, Dongmei Zhang

{"title":"SPINE: a scalable log parser with feedback guidance","authors":"Xuheng Wang, Xu Zhang, Liqun Li, Shilin He, Hongyu Zhang, Yudong Liu, Ling Zheng, Yu Kang, Qingwei Lin, Yingnong Dang, S. Rajmohan, Dongmei Zhang","doi":"10.1145/3540250.3549176","DOIUrl":"https://doi.org/10.1145/3540250.3549176","url":null,"abstract":"Log parsing, which extracts log templates and parameters, is a critical prerequisite step for automated log analysis techniques. Though existing log parsers have achieved promising accuracy on public log datasets, they still face many challenges when applied in the industry. Through studying the characteristics of real-world log data and analyzing the limitations of existing log parsers, we identify two problems. Firstly, it is non-trivial to scale a log parser to a vast number of logs, especially in real-world scenarios where the log data is extremely imbalanced. Secondly, existing log parsers overlook the importance of user feedback, which is imperative for parser fine-tuning under the continuous evolution of log data. To overcome the challenges, we propose SPINE, which is a highly scalable log parser with user feedback guidance. Based on our log parser equipped with initial grouping and progressive clustering,we propose a novel log data scheduling algorithm to improve the efficiency of parallelization under the large-scale imbalanced log data. Besides, we introduce user feedback to make the parser fast adapt to the evolving logs. We evaluated SPINE on 16 public log datasets. SPINE achieves more than 0.90 parsing accuracy on average with the highest parsing efficiency, which outperforms the state-of-the-art log parsers. We also evaluated SPINE in the production environment of Microsoft, in which SPINE can parse 30million logs in less than 8 minutes under 16 executors, achieving near real-time performance. In addition, our evaluations show that SPINE can consistently achieve good accuracy under log evolution with a moderate number of user feedback.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"129 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79187615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 18

FastKLEE: faster symbolic execution via reducing redundant bound checking of type-safe pointers FastKLEE:通过减少对类型安全指针的冗余边界检查来加快符号执行

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3558919

Haoxin Tu, Lingxiao Jiang, Xuhua Ding, He Jiang

{"title":"FastKLEE: faster symbolic execution via reducing redundant bound checking of type-safe pointers","authors":"Haoxin Tu, Lingxiao Jiang, Xuhua Ding, He Jiang","doi":"10.1145/3540250.3558919","DOIUrl":"https://doi.org/10.1145/3540250.3558919","url":null,"abstract":"Symbolic execution (SE) has been widely adopted for automatic program analysis and software testing. Many SE engines (e.g., KLEE or Angr) need to interpret certain Intermediate Representations (IR) of code during execution, which may be slow and costly. Although a plurality of studies proposed to accelerate SE, few of them consider optimizing the internal interpretation operations. In this paper, we propose FastKLEE, a faster SE engine that aims to speed up execution via reducing redundant bound checking of type-safe pointers during IR code interpretation. Specifically, in FastKLEE, a type inference system is first leveraged to classify pointer types (i.e., safe or unsafe) for the most frequently interpreted read/write instructions. Then, a customized memory operation is designed to perform bound checking for only the unsafe pointers and omit redundant checking on safe pointers. We implement FastKLEE on top of the well-known SE engine KLEE and combined it with the notable type inference system CCured. Evaluation results demonstrate that FastKLEE is able to reduce by up to 9.1% (5.6% on average) as the state-of-the-art approach KLEE in terms of the time to explore the same number (i.e., 10k) of execution paths. FastKLEE is opensourced at https://github.com/haoxintu/FastKLEE. A video demo of FastKLEE is available at https://youtu.be/fjV_a3kt-mo.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72833744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

The evolution of type annotations in python: an empirical study python中类型注释的演变:一个实证研究

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549114

L. Grazia, Michael Pradel

{"title":"The evolution of type annotations in python: an empirical study","authors":"L. Grazia, Michael Pradel","doi":"10.1145/3540250.3549114","DOIUrl":"https://doi.org/10.1145/3540250.3549114","url":null,"abstract":"Type annotations and gradual type checkers attempt to reveal errors and facilitate maintenance in dynamically typed programming languages. Despite the availability of these features and tools, it is currently unclear how quickly developers are adopting them, what strategies they follow when doing so, and whether adding type annotations reveals more type errors. This paper presents the first large-scale empirical study of the evolution of type annotations and type errors in Python. The study is based on an analysis of 1,414,936 type annotation changes, which we extract from 1,123,393 commits among 9,655 projects. Our results show that (i) type annotations are getting more popular, and once added, often remain unchanged in the projects for a long time, (ii) projects follow three evolution patterns for type annotation usage -- regular annotation, type sprints, and occasional uses -- and that the used pattern correlates with the number of contributors, (iii) more type annotations help find more type errors (0.704 correlation), but nevertheless, many commits (78.3%) are committed despite having such errors. Our findings show that better developer training and automated techniques for adding type annotations are needed, as most code still remains unannotated, and they call for a better integration of gradual type checking into the development process.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"143 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74034077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Effective and scalable fault injection using bug reports and generative language models 使用bug报告和生成语言模型的有效和可伸缩的故障注入

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3558907

Ahmed Khanfir

{"title":"Effective and scalable fault injection using bug reports and generative language models","authors":"Ahmed Khanfir","doi":"10.1145/3540250.3558907","DOIUrl":"https://doi.org/10.1145/3540250.3558907","url":null,"abstract":"Previous research has shown that artificial faults can be useful in many software engineering tasks such as testing, fault-tolerance assessment, debugging, dependability evaluation, risk analysis, etc. However, such artificial-fault-based applications can be questioned or inaccurate when the considered faults misrepresent real bugs. Since typically, fault injection techniques (i.e. mutation testing) produce a large number of faults by altering ”blindly” the code in arbitrary locations, they are unlikely capable to produce few but relevant real-like faults. In our work, we tackle this challenge by guiding the injection towards resembling bugs that have been previously introduced by developers. For this purpose, we propose iBiR, the first fault injection approach that leverages information from bug reports to inject ”realistic” faults. iBiR injects faults on the locations that are more likely to be related to a given bug-report by applying appropriate inverted fix-patterns, which are manually or automatically crafted by automated-program-repair researchers. We assess our approach using bugs from the Defects4J dataset and show that iBiR outperforms significantly conventional mutation testing in terms of injecting faults that semantically resemble and couple with real ones, in the vast majority of the cases. Similarly, the faults produced by iBiR give significantly better fault-tolerance estimates than conventional mutation testing in around 80% of the cases.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74095164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Do bugs lead to unnaturalness of source code? bug会导致源代码的不自然吗?

软件产业与工程 Pub Date : 2022-11-07 DOI: 10.1145/3540250.3549149

Yanjie Jiang, Hui Liu, Yuxia Zhang, Weixing Ji, Hao Zhong, Lu Zhang

{"title":"Do bugs lead to unnaturalness of source code?","authors":"Yanjie Jiang, Hui Liu, Yuxia Zhang, Weixing Ji, Hao Zhong, Lu Zhang","doi":"10.1145/3540250.3549149","DOIUrl":"https://doi.org/10.1145/3540250.3549149","url":null,"abstract":"Texts in natural languages are highly repetitive and predictable because of the naturalness of natural languages. Recent research validated that source code in programming languages is also repetitive and predictable, and naturalness is an inherent property of source code. It was also reported that buggy code is significantly less natural than bug-free one, and bug fixing substantially improves the naturalness of the involved source code. In this paper, we revisit the naturalness of buggy code and investigate the effect of bug-fixing on the naturalness of source code. Different from the existing investigation, we leverage two large-scale and high-quality bug repositories where bug-irrelevant changes in bug-fixing commits have been explicitly excluded. Our evaluation results confirm that buggy lines are often less natural than bug-free ones. However, fixing bugs could not significantly improve the naturalness of involved code lines. Fixed lines on average are as unnatural as buggy ones. Consequently, bugs are not the root cause of the unnaturalness of source code, and it could be inaccurate to identify buggy code lines solely by the naturalness of source code. Our evaluation results suggest that the naturalness-based buggy line detection results in extremely low precision (less than one percentage).","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":"331 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77595836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2