Data-access performance anti-patterns in data-intensive systems
Authors: Biruk Asmare Muse, Kawser Wazed Nafi, Foutse Khomh, Giuliano Antoniol
Journal: Empirical Software Engineering (JCR Q1, Computer Science, Software Engineering; impact factor 3.5)
Published: 2024-08-29
DOI: https://doi.org/10.1007/s10664-024-10535-8
Data-intensive systems handle variable, high-volume, and high-velocity data generated by humans and digital devices. Like traditional software, data-intensive systems are prone to technical debt introduced to cope with the time pressure and resource constraints that developers face. Data access is a critical component of data-intensive systems, as it determines their overall performance and functionality. While data-access technical debt is receiving attention from the research community, technical debt that affects performance has not been well investigated. This study aims to identify, categorize, and validate data-access performance anti-patterns. We collected issues from NoSQL-based and polyglot-persistence open-source data-intensive systems implemented in the Java programming language, and identified 14 new data-access performance anti-patterns grouped under seven high-level categories. We conducted a developer survey to evaluate the perceived relevance and criticality of the newly identified anti-patterns and found that Improper Handling of Node Failures, Using Synchronous Connection, and Inefficient Driver API are the most critical data-access performance anti-patterns. The study findings can help improve the quality of data-intensive software systems by raising practitioners' awareness of the impact of data-access performance anti-patterns. At the same time, the findings will help quality assurance teams prioritize the correction of performance anti-patterns based on their criticality.
About the journal:
Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories.
The journal also offers industrial experience reports detailing the application of software technologies - processes, methods, or tools - and their effectiveness in industrial settings.
Empirical Software Engineering promotes the publication of industry-relevant research, to address the significant gap between research and practice.