Information and Software Technology: Latest Articles

Production and test bug report classification based on transfer learning
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107685. Pub Date: 2025-02-11. DOI: 10.1016/j.infsof.2025.107685
Misoo Kim, Youngkyoung Kim, Eunseok Lee
Abstract
Context: Recent studies indicate that the classification of production and test bug reports can substantially enhance the accuracy of performance evaluation and the effectiveness of information retrieval–based bug localization (IRBL) for software reliability.
Objective: However, manually classifying these bug reports is time-consuming for developers. This study introduces a production and test bug report classification (ProTeC) framework for automatically classifying these reports.
Methods: The framework’s novelty lies in leveraging a set of production- and test-source files and employing transfer learning to address the issue of insufficient and sparse bug reports in machine-learning applications. The ProTeC framework trains and fine-tunes a source file classifier to develop a bug report classifier by transferring production-test distinguishing knowledge.
Results: To validate the effectiveness and general practicality of ProTeC, we conducted large-scale experiments using 2,522 bug reports across 12 machine/deep learning model variations to train an automatic classifier. Our results, on average, demonstrate that ProTeC’s macro F1-score is 28.6% higher than that of a bug report-based classifier, and it can improve the mean average precision of IRBL by 17.6%.
Conclusion: These positive trends were observed in most model variations, indicating that ProTeC consistently performs well in classifying bug reports regardless of the model used, thereby improving IRBL performance.
Citations: 0
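The macro F1-score reported for ProTeC above averages the per-class F1 values without weighting by class size, so a rare class (for example, test bug reports) counts as much as a common one. The following is only a minimal illustrative sketch of that metric; the function name `macro_f1` and the example labels are assumptions made here and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper, not from the paper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: three production reports and one test report
y_true = ["production", "production", "production", "test"]
y_pred = ["production", "production", "test", "test"]
score = macro_f1(y_true, y_pred)  # mean of 0.8 (production) and 2/3 (test)
```

Because each class contributes equally, a classifier that ignores the minority class is penalized more heavily than it would be under plain accuracy.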
Vulnerability detection with feature fusion and learnable edge-type embedding graph neural network
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107686. Pub Date: 2025-02-10. DOI: 10.1016/j.infsof.2025.107686
Ge Cheng, Qifan Luo, Yun Zhang
Abstract
Deep learning methods are widely employed in vulnerability detection, and graph neural networks have shown effectiveness in learning source code representation. However, current methods overlook non-relevant noise information in the code property graph and lack specific graph neural networks designed for the code property graph. To address these issues, this paper introduces Leev, an automated vulnerability detection method. We developed a graph neural network tailored to the code property graph, assigning iterative vectors to diverse edge types and integrating them into the message passing between nodes to enable the model to extract hidden vulnerability information. In addition, virtual nodes are incorporated into the graph for feature fusion, mitigating the impact of irrelevant features on vulnerability information within the code. Specifically, for the FFMPeg+Qemu, Reveal, and Fan et al. datasets, the F1 metrics exhibited improvements of 7.02%, 21.69%, and 27.74% over the best baseline, respectively.
Citations: 0
Practical assessment of the e-commerce multivariant user interface
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107684. Pub Date: 2025-02-10. DOI: 10.1016/j.infsof.2025.107684
Adam Wasilewski, Elżbieta Pawełek-Lubera
Abstract
Context: Personalization is recognized as one of the key trends in e-commerce development, often including the personalization of offers and prices. However, a rarely used and underestimated personalization opportunity is the customization of the user interface provided to customers. Customers of e-shops differ in their behaviors and usage patterns, yet there is no clear evidence verifying the potential of the user interface to influence the performance indicators of e-shops.
Objective: The research discussed in this paper aims to verify the impact of a dedicated interface on the most common indicators describing e-commerce performance and to identify limitations to the use of user interface personalization in e-commerce.
Method: To achieve this, a solution was developed to collect information about e-commerce customer behavior, segment customers using clustering methods, and provide a dedicated user interface. During the pilot implementation, data was collected to verify the impact of the dedicated interface on the purchasing behavior of customer groups.
Results: The results showed that a dedicated interface can significantly improve the conversion rate (by 46% in the analyzed group) and average order value (by 11%).
Conclusion: These findings confirm that tailored UI variants can positively influence customer behavior in e-shops by increasing key performance indicators.
Citations: 0
Exploring the means to measure explainability: Metrics, heuristics and questionnaires
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107682. Pub Date: 2025-02-08. DOI: 10.1016/j.infsof.2025.107682
Hannah Deters, Jakob Droste, Martin Obaidi, Kurt Schneider
Abstract
Context: As the complexity of modern software is steadily growing, these systems become increasingly difficult to understand for their stakeholders. At the same time, opaque and artificially intelligent systems permeate a growing number of safety-critical areas, such as medicine and finance. As a result, explainability is becoming more important as a software quality aspect and non-functional requirement.
Objective: Contemporary research has mainly focused on making artificial intelligence and its decision-making processes more understandable. However, explainability has also gained traction in recent requirements engineering research. This work aims to contribute to that body of research by providing a quality model for explainability as a software quality aspect. Quality models provide means and measures to specify and evaluate quality requirements.
Method: In order to design a user-centered quality model for explainability, we conducted a literature review.
Results: We identified ten fundamental aspects of explainability. Furthermore, we aggregated criteria and metrics to measure them as well as alternative means of evaluation in the form of heuristics and questionnaires.
Conclusion: Our quality model and the related means of evaluation enable software engineers to develop and validate explainable systems in accordance with their explainability goals and intentions. This is achieved by offering a view from different angles at fundamental aspects of explainability and the related development goals. Thus, we provide a foundation that improves the management and verification of explainability requirements.
Citations: 0
Process mining for agile software process assessment and improvement
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107680. Pub Date: 2025-02-07. DOI: 10.1016/j.infsof.2025.107680
Katiane Oliveira Alpes da Silva, Ricardo Massa Ferreira Lima, Vanderson Botelho da Silva
Abstract
Context: Agile software processes, designed for flexibility and continuous improvement, pose challenges in extracting actionable insights from event logs due to their inherent unstructured nature.
Objective: The study evaluates whether existing process mining techniques can effectively uncover reliable and insightful information on software development processes adopting agile methodologies.
Method: The work uses various algorithms to analyze procedural flows and business rules within an event log containing data from 3,418 agile software development projects at a company with over 1,500 employees. By categorizing processes according to project size, our analysis aimed to determine the kind of insights these algorithms could reveal. We specifically focused on algorithms that produced high-quality insights for a deeper examination of aspects like effort rate, frequency of activities, and relationships between activities. Subsequently, technical and managerial staff reviewed the results to assess the quality and relevance of the insights generated. Validation involved a semi-structured interview with managers and technicians to ensure the relevance and applicability of the findings.
Results: The analysis demonstrates the efficacy of declarative business process techniques in extracting actionable insights from agile development teams’ data. Such techniques accurately capture the daily routines and documented processes of the teams. High-performing teams typically followed fewer rules, had less job rotation, involved fewer individuals, and engaged in a more limited range of activities. Domain experts and team managers found these insights to be coherent and potentially valuable for enhancing the performance of software development processes.
Conclusions: Declarative modeling is particularly adept at revealing the patterns of flexible software development workflows, presenting initial support for teams, managers, and decision-makers through both descriptive and prescriptive analysis.
Citations: 0
Re-evaluating metamorphic testing of chess engines: A replication study
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107679. Pub Date: 2025-02-06. DOI: 10.1016/j.infsof.2025.107679
Axel Martin, Djamel Eddine Khelladi, Théo Matricon, Mathieu Acher
Abstract
Context: This study aims to confirm, replicate and extend the findings of a previous article entitled “Metamorphic Testing of Chess Engines” that reported inconsistencies in the analyses provided by Stockfish, the most widely used chess engine, for transformed chess positions that are fundamentally identical. Initial findings, under conditions strictly identical to those of the original study, corroborate the reported inconsistencies.
Objective: However, the original article considers a specific dataset (including randomly generated chess positions, end-games, or checkmate problems) and very low analysis depth (10 plies, corresponding to 5 moves). These decisions pose threats that limit the generalizability of the results, but also their practical usefulness both for chess players and maintainers of Stockfish. Thus, we replicate the original study.
Methods: We consider this time (1) positions derived from actual chess games, (2) analyses at appropriate and larger depths, and (3) different versions of Stockfish. We conduct novel experiments on thousands of positions, employing significantly deeper searches.
Results: The replication results show that the Stockfish chess engine demonstrates significantly greater consistency in its evaluations. The metamorphic relations are not as effective as in the original article, especially on realistic chess positions. We also demonstrate that, for any given position, there exists a depth threshold beyond which further increases in depth do not result in any evaluation differences for the studied metamorphic relations. We perform an in-depth analysis to identify and clarify the implementation reasons behind Stockfish’s inconsistencies when dealing with transformed positions.
Conclusion: A first concrete result is thus that metamorphic testing of chess engines is not yet an effective technique for finding faults of Stockfish. Another result is the lessons learned through this replication effort: metamorphic relations must be verified in the context of the domain’s specificities; without such contextual validation, they may lead to misleading or irrelevant conclusions; changes in parameters and input dataset can drastically alter the effectiveness of a testing method.
Citations: 0
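To make the idea of a metamorphic relation in the abstract above concrete: a transformation is applied to an input, and the test checks that the output changes (or stays the same) in a predictable way. The sketch below is a toy illustration only. The material-counting evaluator `toy_evaluate`, the left-to-right mirroring transformation, and all names are assumptions made here; they stand in for a real engine and are not claimed to be the relations or tooling used in the study.

```python
# Conventional piece values; the king is given 0 because it is never captured.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def toy_evaluate(placement):
    """Material balance (White minus Black) for a FEN piece-placement field.

    Hypothetical stand-in for a chess engine's evaluation.
    """
    score = 0
    for ch in placement:
        if ch.isalpha():
            value = PIECE_VALUES[ch.lower()]
            score += value if ch.isupper() else -value
    return score


def mirror_files(placement):
    """Mirror the board left-to-right by reversing every rank string."""
    return "/".join(rank[::-1] for rank in placement.split("/"))


def relation_holds(evaluate, placement, tolerance=0):
    """Metamorphic check: the evaluation should not change under mirroring."""
    original = evaluate(placement)
    transformed = evaluate(mirror_files(placement))
    return abs(original - transformed) <= tolerance


# Hypothetical position: White has a rook and a pawn, Black has two rooks.
position = "r3k2r/8/8/8/8/8/4P3/4K2R"
consistent = relation_holds(toy_evaluate, position)
```

With a real engine, mirroring preserves equivalence only when side conditions such as castling rights are handled, which is the kind of domain-specific validation the abstract's conclusion calls for.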
Test smell: A parasitic energy consumer in software testing
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107671. Pub Date: 2025-02-03. DOI: 10.1016/j.infsof.2025.107671
Md Rakib Hossain Misu, Jiawei Li, Adithya Bhattiprolu, Yang Liu, Eduardo Santana de Almeida, Iftekhar Ahmed
Abstract
Context: Traditionally, energy efficiency research has focused on reducing energy consumption at the hardware level and, more recently, in the design and coding phases of the software development life cycle. However, software testing’s impact on energy consumption did not receive attention from the research community. Specifically, how test code design quality and test smell (e.g., sub-optimal design and bad practices in test code) impact energy consumption has not been investigated yet.
Objective: This study aims to examine open-source software projects to analyze the association between test smell and its effects on energy consumption in software testing.
Methods: We conducted a mixed-method empirical analysis from two perspectives: software (data mining in 12 Apache projects) and developers’ views (a survey of 62 software practitioners).
Results: Our findings show that: (1) test smell is associated with energy consumption in software testing. Specifically, the smelly part of a test case consumes more energy compared to the non-smelly part. (2) certain test smells are more energy-hungry than others, (3) refactored test cases tend to consume less energy than their smelly counterparts, and (4) most developers (45% of the survey respondents) lack knowledge about test smells’ impact on energy consumption.
Conclusion: Based on the results, we emphasize raising developers’ awareness regarding the impact of test smells on energy consumption. Additionally, we present several observations that can direct future research and developments.
Citations: 0
A more accurate bug localization technique for bugs with multiple buggy code files
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107675. Pub Date: 2025-01-31. DOI: 10.1016/j.infsof.2025.107675
Hui Xu, Zhaodan Wang, Weiqin Zou
Abstract
Context: Bug localization is a key step in bug fixing. Despite considerable progress, existing bug localization techniques still perform unsatisfactorily in situations where the complete fix to a bug involves touching multiple buggy code files. That is, for such bugs, those techniques tend to locate correctly only one or at least not all buggy code files, leaving other buggy code files undetected.
Objective: This study aims to improve bug localization in cases where resolving a bug requires modifications to multiple buggy code files by proposing HitMore to rank more truly buggy files higher in the recommendation list.
Method: The basic idea of HitMore is to attempt to retrieve a subset of truly buggy code files first, then use these files to retrieve other buggy code files based on code relation analysis. For the first part, we designed three kinds of domain-specific features to build a machine-learning model to identify the truly buggy code file subset. For the second part, we make use of three types of code relations between the code base and the buggy file subset to better retrieve the remaining truly buggy code files.
Results: The experiments on six widely used open-source projects show that our technique is effective in identifying the subset of truly buggy code files, with a weighted prediction F1-Score of 86.1%–92.1%. By leveraging the code relations to the retrieved subset and the code base, our HitMore could retrieve all truly buggy code files for 29.31%–69.56% of bugs across six projects. For multiple-buggy-code-file bugs, HitMore could completely localize such bugs by up to 15.38%, 19.36%, and 11.86% more than three representative IRBL baselines across six projects.
Conclusion: The experimental results demonstrate the potential of HitMore in reducing developers’ burden of locating and further fixing relatively complex bugs such as those with multiple buggy code files in practice.
Citations: 0
Beyond the lab: An in-depth analysis of real-world practices in government-to-citizen software user documentation
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 181, Article 107676. Pub Date: 2025-01-30. DOI: 10.1016/j.infsof.2025.107676
Francesco Sovrano, Sandro Vonlanthen, Alberto Bacchelli
Abstract
Context: Governments, including Switzerland through its Digital Switzerland Strategy, are using new technologies to improve public services. However, unclear user guides often lead people to prefer expensive help desk services. Current research on software documentation is limited by small-scale surveys that do not reflect real-world challenges. This paper addresses these gaps by examining the limitations of user guides in a more practical context.
Objective: Building on the identified need for a more comprehensive understanding of user documentation in real-world applications, this study aims to critically analyse user documentation in government-to-citizen (G2C) interactions within Switzerland. We intend to identify both common and critical issues in existing documentation to direct future research towards substantial improvements. By doing so, this research will contribute to the development of more effective user guides, ultimately improving the digital experience for citizens and reducing reliance on costly help desk support.
Methods: Our research methodology involved a thorough analysis of user documentation in German-speaking Swiss cantons. We began with around 5’000 links from official cantonal websites and narrowed it down to nearly 600 user guides relevant to G2C applications. The study progressed in phases: we first assessed the content to identify real-world documentation characteristics, then compared these with common issues from academic research to pinpoint frequent problems. Finally, we analysed the data to identify overarching trends in the documentation characteristics and issues.
Results: Our analyses, which linked guide features to documentation issues, uncovered prevalent real-world issue trends, characterized by significant statistical correlations (p < .05) with the socioeconomic status of the cantons, such as their wealth and population size.
Conclusions: Identifying these trends will help researchers and practitioners concentrate on the most common and critical issues encountered in practice. This, in turn, holds the potential to drive the development of more effective technology for documenting software.
Data and Materials: https://doi.org/10.5281/zenodo.10592871
Citations: 0
XL-HQL: A HQL query generation method via XLNet and column attention
IF 3.8, Zone 2, Computer Science
Information and Software Technology, Volume 180, Article 107674. Pub Date: 2025-01-27. DOI: 10.1016/j.infsof.2025.107674
Rongcun Wang, Yiqian Hou, Yuan Tian, Zhanqi Cui, Shujuan Jiang
Abstract
Context: Object-relational mapping (ORM) tools, like Hibernate, are widely used to facilitate the development of database applications by bridging the gap between object-oriented programming (OOP) and relational database management systems (DBMS). These ORM tools simplify the process of mapping OOP objects to relational tables, addressing issues of data inconsistency and performance. However, they also introduce the need to write queries in specific languages, such as Hibernate Query Language (HQL), to manage data interactions within the database.
Objective: These query languages can be difficult to write and error-prone due to the complexities of accurately mapping object models to relational schema with intricate relationships and inheritance hierarchies. To mitigate this issue, a recent study introduced the task of automated HQL query generation, i.e., automatically generating HQL from program context (target method’s signature, properties, and optional method comments and call context). However, the existing solution, HQLgen, has shown limited performance, with an accuracy of 34.52%.
Method: In this paper, we propose a novel HQL query generation approach named XL-HQL. XL-HQL aims to address two main challenges in HQL query generation: limited context information and large search space. Specifically, XL-HQL contains a pre-trained model-based encoder, rules defined to reduce search space, and a column-attention-enabled decoder, which is shown to be effective in SQL generation approaches.
Result: To evaluate the effectiveness of XL-HQL, we designed and conducted experiments on an existing HQL query generation benchmark, which contains 24,118 HQL queries extracted from 3,481 open-source projects. The experimental results show that our approach achieves 66.93% and 64.47% accuracy on mixed and cross-project datasets, respectively, nearly doubling the performance of the state-of-the-art (SOTA) baseline.
Conclusions: The application of pre-trained models that are suitable for handling long sequences for the HQL query generation task shows great potential. Moreover, the defined rules based on OOP knowledge are effective for reducing search space and improving the performance of the task.
Citations: 0