Applying Cognitive and Neural Network Approach over Control Flow Graph for Software Defect Prediction

K. Rajnish, Vandana Bhattacharjee, Vishnu Chandrabanshi

Abstract

As with all other engineering products, the prediction of defects plays an important role for software and is an active research area of software engineering. A defect is an error, bug, flaw, fault, breakdown, or mistake in software that causes it to produce an inaccurate or unexpected outcome. Most faults originate in the source code or design; some arise from improper code generated by compilers. The software engineering community is striving for valid measurements to enhance software quality. As software ages, maintaining and comprehending it becomes complex and expensive. It has been estimated that 60% of software maintenance effort goes into comprehending the source code. Cognitive informatics plays an important role in quantifying the degree of difficulty, or the effort developers expend, in comprehending source code. In 2003, a cognitive weight was assigned to each possible basic control structure of software on the basis of several empirical studies. Several researchers have since used these cognitive weights to evaluate the cognitive complexity of software systems.

In this paper, we attempt to classify the nodes of Control Flow Graphs (CFGs) according to their node features, assigning an integer encoding to each unique feature value. We derive the appropriate parameters (or features) of each source code file through cognitive complexity measures, incorporate the outcomes of those measures as nodes in the CFG, and generate the graph from its node connectivity. A vector matrix of the graph is then created, and a Graph Convolutional Network (GCN) is applied to obtain the feature representation of the graph. Finally, we developed a deep neural network, the Keras Model (KM), to predict software defects. The implementation uses the Python programming language with Keras and TensorFlow. The analysis is based on data collected from postgraduate (PG) students of our institute.

The approaches are evaluated on Accuracy, the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC), F-Measure, and Precision. The experimental results indicate that the KM classifier outperformed state-of-the-art methods (the Naïve Bayes (NB), Support Vector Machine (SVM), and Random Forest (RF) classifiers) on all evaluation criteria.

IC3 '21, August 05–07, 2021, Noida, India. © 2021 Association for Computing Machinery. ACM ISBN 978-1-4503-8920-4/21/08. https://doi.org/10.1145/3474124.3474127

CCS Concepts: Machine Learning; Software and its Engineering; General and Reference
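The pipeline described in the abstract (integer-encoding CFG node features, building a matrix from node connectivity, and applying a graph convolution to obtain a graph-level representation) can be illustrated with a minimal sketch. This is not the authors' implementation: the node types, the weight values in `NODE_WEIGHTS`, the example CFG, the fixed layer weights `W`, the symmetric treatment of edges, and the mean pooling are all assumptions made for illustration. The paper reports using Keras and TensorFlow; plain Python is used here only to keep the example dependency-free.

```python
import math

# Hypothetical node types and cognitive weights, assumed for this sketch only.
NODE_WEIGHTS = {"sequence": 1, "branch": 2, "loop": 3, "call": 2}


def encode_node_types(node_types):
    """Map each unique node type to an integer code, in order of first appearance."""
    codes = {}
    for t in node_types:
        codes.setdefault(t, len(codes))
    return [codes[t] for t in node_types], codes


def adjacency_matrix(num_nodes, edges):
    """Build A + I: a symmetric adjacency matrix with self-loops.

    CFG edges are directed; treating them as undirected is a simplification.
    """
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0
    for src, dst in edges:
        a[src][dst] = 1.0
        a[dst][src] = 1.0
    return a


def normalize(a):
    """Symmetric normalization D^(-1/2) (A + I) D^(-1/2)."""
    deg = [sum(row) for row in a]
    n = len(a)
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


def graph_embedding(node_repr):
    """Mean-pool node representations into a single vector for the graph."""
    n = len(node_repr)
    return [sum(row[j] for row in node_repr) / n
            for j in range(len(node_repr[0]))]


# Hypothetical CFG: entry -> branch -> (loop | call) -> exit.
node_types = ["sequence", "branch", "loop", "call", "sequence"]
edges = [(0, 1), (1, 2), (1, 3), (2, 4), (3, 4)]

codes, mapping = encode_node_types(node_types)
# Each node's feature vector: [integer type code, cognitive weight].
features = [[float(c), float(NODE_WEIGHTS[t])] for c, t in zip(codes, node_types)]

a_hat = normalize(adjacency_matrix(len(node_types), edges))
W = [[0.5, -0.2], [0.1, 0.3]]  # fixed weights; a trained model would learn these
embedding = graph_embedding(gcn_layer(a_hat, features, W))
print(embedding)
```

In a full system, a vector such as `embedding` would be computed per source file and passed, together with a defective or non-defective label, to a downstream classifier such as the deep neural network the abstract describes.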