Boosting Identifier Renaming Opportunity Identification via Context-Based Deep Code Representation
Authors: Jingxuan Zhang; Zhuhang Li; Jiahui Liang; Zhiqiu Huang
Journal: IEEE Transactions on Reliability, vol. 74, no. 3, pp. 3296-3310 (Journal Article)
Published: 2025-02-19
DOI: 10.1109/TR.2025.3535736
URL: https://ieeexplore.ieee.org/document/10892346/
Impact factor: 5.7; JCR: Q1 (Computer Science, Hardware & Architecture)
Citation count: 0
Abstract
Source code refactoring brings many benefits to the software being developed, e.g., it reduces the likelihood of future development failures and simplifies the implementation of new features. Among the various code refactoring activities, identifier renaming is one of the most frequent, and it plays an important role in program analysis and understanding. However, manually detecting identifier renaming opportunities is time-consuming and labor-intensive. Researchers have recently proposed several approaches that identify renaming opportunities automatically, but these approaches focus on one or a few specific types of identifiers rather than covering all identifier types. To resolve this problem, we put forward a new approach that detects identifier renaming opportunities by fully exploiting the changes in the programming context and the related code entities. Specifically, we first utilize a siamese network, which employs different attention headers to incorporate the programming context and the related code entities, to derive semantically meaningful embeddings of identifiers. We then use these embeddings to train a classifier that predicts renaming opportunities for identifiers. Experimental results on 29 255 identifiers from ten Java projects in the Apache community demonstrate that our approach outperforms the state-of-the-art baseline approach by 11.97% in terms of average F-Measure when identifying renaming opportunities for all types of identifiers. We also verified the effectiveness of key components of our approach; for instance, incorporating the related code entities improves the average F-Measure by 6.60%.
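The results above are reported as an average F-Measure. As a reminder of how that metric is commonly computed, here is a minimal sketch. It assumes the standard definition (harmonic mean of precision and recall) and assumes the average is an unweighted mean over projects; the abstract does not state how the averaging was done. The function names and the counts are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (returns 0.0 when there are no true positives)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of predicted renamings that were correct
    recall = tp / (tp + fn)     # share of actual renamings that were found
    return 2 * precision * recall / (precision + recall)


def average_f_measure(per_project_counts: Iterable[Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-project F-Measures (an assumed averaging scheme)."""
    scores = [f_measure(tp, fp, fn) for tp, fp, fn in per_project_counts]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts for two projects, for illustration only.
counts = [(80, 20, 20), (50, 50, 0)]
print(f"average F-Measure: {average_f_measure(counts):.4f}")
```

An unweighted mean treats every project equally regardless of how many identifiers it contains; pooling the counts across projects before computing the metric would weight larger projects more heavily.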
About the journal:
IEEE Transactions on Reliability is a refereed journal for the reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.