{"title":"Using Metrics for Risk Prediction in Object-Oriented Software: A Cross-Version Validation","authors":"Salim Moudache, M. Badri","doi":"10.17706/jsw.17.1.1-20","DOIUrl":"https://doi.org/10.17706/jsw.17.1.1-20","url":null,"abstract":"This work aims to investigate the potential, from different perspectives, of a risk model to support Cross-Version Fault and Severity Prediction (CVFSP) in object-oriented software. The risk of a class is addressed from the perspective of two particular factors: the number of faults it can contain and their severity. We used various object-oriented metrics to capture the two risk factors. The risk of a class is modeled using the concept of Euclidean distance. We used a dataset collected from five successive versions of an open-source Java software system (ANT). We investigated different variants of the considered risk model, based on various combinations of object-oriented metric pairs. We used different machine learning algorithms for building the prediction models: Naive Bayes (NB), J48, Random Forest (RF), Support Vector Machines (SVM) and Multilayer Perceptron (ANN). We investigated the effectiveness of the prediction models for Cross-Version Fault and Severity Prediction (CVFSP), using data of prior versions of the considered system. We also investigated whether the considered risk model can give as output the Empirical Risk (ER) of a class, a continuous value considering both the number of faults and their different levels of severity. We used different techniques for building the prediction models: Linear Regression (LR), Gaussian Process (GP), Random Forest (RF) and M5P (two decision tree algorithms), SmoReg and Artificial Neural Network (ANN). The considered risk model achieves acceptable results for both cross-version binary fault prediction (a g-mean of 0.714, an AUC of 0.725) and cross-version multi-classification of levels of severity (a g-mean of 0.758, an AUC of 0.771). The model also achieves good results in the estimation of the empirical risk of a class by considering both the number of faults and their levels of severity (intra-version analysis with a correlation coefficient of 0.659, cross-version analysis with a correlation coefficient of 0.486).","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"1 1","pages":"1-20"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82981160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Systematic Review of Ensemble Techniques for Software Defect and Change Prediction","authors":"Megha Khanna","doi":"10.37190/e-inf220105","DOIUrl":"https://doi.org/10.37190/e-inf220105","url":null,"abstract":"Background: The use of ensemble techniques has steadily gained popularity in several software quality assurance activities. These aggregated classifiers have proven to be superior to their constituent base models. Though ensemble techniques have been widely used in key areas such as Software Defect Prediction (SDP) and Software Change Prediction (SCP), the current state-of-the-art concerning the use of these techniques needs scrutiny. Aim: The study aims to assess, evaluate and uncover possible research gaps with respect to the use of ensemble techniques in SDP and SCP. Method: This study conducts an extensive literature review of 77 primary studies on the basis of the category, application, rules of formulation, performance, and possible threats of the proposed/utilized ensemble techniques. Results: Ensemble techniques were primarily categorized on the basis of similarity, aggregation, relationship, diversity, and dependency of their base models. They were also found effective in several applications such as their use as a learning algorithm for developing SDP/SCP models and for addressing the class imbalance issue. Conclusion: The results of the review ascertain the need for more studies to propose, assess, validate, and compare various categories of ensemble techniques for diverse applications in SDP/SCP such as transfer learning and online learning.","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"98 1","pages":"220105"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86069614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How good are my search strings? Reflections on using an existing review as a quasi-gold standard","authors":"Huynh Khanh Vi Tran, J. Börstler, Nauman bin Ali, M. Unterkalmsteiner","doi":"10.37190/e-inf220103","DOIUrl":"https://doi.org/10.37190/e-inf220103","url":null,"abstract":"Background: Systematic literature studies (SLS) have become a core research methodology in Evidence-based Software Engineering (EBSE). Search completeness, i.e., finding all relevant papers on the topic of interest, has been recognized as one of the most commonly discussed validity issues of SLSs. Aim: This study aims at raising awareness on the issues related to search string construction and on search validation using a quasi-gold standard (QGS). Furthermore, we aim at providing guidelines for search string validation. Method: We use a recently completed tertiary study as a case and complement our findings with the observations from other researchers studying and advancing EBSE. Results: We found that the issue of assessing QGS quality has not seen much attention in the literature, and the validation of automated searches in SLSs could be improved. Hence, we propose to extend the current search validation approach by the additional analysis step of the automated search validation results and provide recommendations for the QGS construction. Conclusion: In this paper, we report on new issues which could affect search completeness in SLSs. Furthermore, the proposed guideline and recommendations could help researchers implement a more reliable search strategy in their SLSs.","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"79 9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87956101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Verifying UML Class Diagram and Formalizing Generalization/Specialization Relationship with Mathematical Set Theory","authors":"Kruti Shah, Emanuel S. Grant","doi":"10.17706/jsw.17.6.292-303","DOIUrl":"https://doi.org/10.17706/jsw.17.6.292-303","url":null,"abstract":"","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"10 1","pages":"292-303"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83632735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining the Predictive Capability of Advanced Software Fault Prediction Models - An Experimental Investigation Using Combination Metrics","authors":"Pooja Sharma, A. L. Sangal","doi":"10.37190/e-inf220104","DOIUrl":"https://doi.org/10.37190/e-inf220104","url":null,"abstract":"","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"77 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88979949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reporting Consent, Anonymity and Confidentiality Procedures Adopted in Empirical Studies Using Human Participants","authors":"Deepika Badampudi, Farnaz Fotrousi, Bruno Cartaxo, Muhammad Usman","doi":"10.37190/e-inf220109","DOIUrl":"https://doi.org/10.37190/e-inf220109","url":null,"abstract":"","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"32 1","pages":"220109"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75428244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reuse in Contemporary Software Engineering Practices - An Exploratory Case Study in A Medium-sized Company","authors":"Xingru Chen, Deepika Badampudi, Bruno Cartaxo, M. Usman","doi":"10.37190/e-inf220110","DOIUrl":"https://doi.org/10.37190/e-inf220110","url":null,"abstract":"Background: Software practice is evolving with changing technologies and practices such as InnerSource, DevOps, and microservices. It is important to investigate the impact of contemporary software engineering (SE) practices on software reuse. Aim: This study aims to characterize software reuse in contemporary SE practices and investigate its implications in terms of costs, benefits, challenges, and potential improvements in a medium-sized company. Method: We performed an exploratory case study by conducting interviews, group discussions, and reviewing company documentation to investigate software reuse in the context of contemporary SE practices in the case company. Results: The results indicate that the development for reuse in contemporary SE practices incurs additional coordination, among other costs. Development with reuse led to relatively fewer additional costs and resulted in several benefits such as better product quality and less development and delivery time. Ownership of reusable assets is challenging in contemporary SE practice. InnerSource practices may help mitigate the top perceived challenges: discoverability and ownership of the reusable assets, knowledge sharing and reuse measurement. Conclusion: Reuse in contemporary SE practices is not without additional costs and challenges. However, the practitioners perceive costs as investments that benefit the company in the long run.","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"34 1","pages":"220110"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81764048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparison of Citation Sources for Reference and Citation-Based Search in Systematic Literature Reviews","authors":"N. Ali, Binish Tanveer","doi":"10.37190/e-inf220106","DOIUrl":"https://doi.org/10.37190/e-inf220106","url":null,"abstract":"","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"55 1","pages":"220106"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74322351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intelligent Resume Retrieval Based on Lucence","authors":"J.B. Du, Dongping Ma","doi":"10.17706/jsw.17.1.29-35","DOIUrl":"https://doi.org/10.17706/jsw.17.1.29-35","url":null,"abstract":"With the development of the Internet, the electronic resume has gradually replaced the paper one. A basic requirement of enterprise recruitment is to retrieve talent information that fulfills the requirements quickly and without omission. Based on the SpringBoot framework and the Lucene full-text search engine, this paper implements an intelligent resume filtering algorithm, which improves the query speed of the system by establishing an index database. At the same time, the scoring function improves the accuracy of the filtering results, reduces the pressure of high concurrency on the database, improves the work efficiency of the Human Resources Department, and avoids talent loss.","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"5 1","pages":"29-35"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77704361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Framework of Intelligent System for Machine Learning Algorithm Selection in Social Sciences","authors":"D. Oreški","doi":"10.17706/jsw.17.1.21-28","DOIUrl":"https://doi.org/10.17706/jsw.17.1.21-28","url":null,"abstract":"The ability to generate data has never been as powerful as today, when three quintillion bytes of data are generated daily. In the field of machine learning, a large number of algorithms have been developed, which can be used for intelligent data analysis and to solve prediction and descriptive problems in different domains. Developed algorithms have different effects on different problems. If one algorithm works better on one dataset, the same algorithm may work worse on another dataset. The reason is that each dataset has different features in terms of local and global characteristics. It is therefore imperative to know the intrinsic behavior of algorithms on different types of datasets and choose the right algorithm for the problem at hand. To address this problem, this paper makes a scientific contribution to the meta-learning field by proposing a framework for identifying the specific characteristics of datasets in two domains of social sciences, education and business, and develops meta models based on: ranking algorithms, calculating correlation of ranks, developing a multi-criteria model, two-component index and prediction based on machine learning algorithms. Each of the meta models serves as the basis for the development of an intelligent system version. Application of such a framework should include a comparative analysis of a large number of machine learning algorithms on a large number of datasets from social sciences.","PeriodicalId":11452,"journal":{"name":"e Informatica Softw. Eng. J.","volume":"75 10 1","pages":"21-28"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91030935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}