AI federated learning based improvised random forest classifier with error reduction mechanism for skewed data sets

Impact factor: 0.8; JCR quartile: Q4; category: Computer Science, Interdisciplinary Applications

International Journal of Pervasive Computing and Communications. Pub Date: 2022-08-19. DOI: 10.1108/ijpcc-02-2022-0034

A. More, Dipti P Rana


Citations: 0

Abstract


AI federated learning based improvised random Forest classifier with error reduction mechanism for skewed data sets

Purpose: The referenced data set provides reliable information about network flows and common attacks that meets real-world criteria. Accordingly, this study focusses on the imbalanced intrusion detection benchmark, the knowledge discovery in databases (KDD) data set, which many researchers prefer for experimentation and analysis. The proposed algorithm, improvised random forest classification with error tuning factors (IRFCETF), is evaluated on the KDD data set using the complete set of network traffic features.

Design/methodology/approach: Researchers are increasingly drawn to the wide range of existing applications that deal with imbalanced data classification (ImDC). Real-time application areas such as artificial intelligence (AI) and the Industrial Internet of Things (IIoT) that involve ImDC suffer degraded classification performance because of skewed data distribution (SkDD), and many data applications in AI and IIoT face this problem. The volume of computer network data and the number of related applications have recently grown exponentially, and intrusion detection is one of the most demanding ImDC applications. The study applies the proposed IRFCETF approach to the imbalanced KDD intrusion benchmark data set and to other benchmark data sets. IRFCETF is reported to improve classification performance on imbalanced data sets over the existing approach. The work reviews imbalanced data applications in numerous areas, including AI and IIoT, and tunes performance with respect to principal component analysis. The study also focusses on the out-of-bag error as a performance-tuning factor.
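The out-of-bag (OOB) error mentioned above can be estimated without a separate validation set: each member of a bagged ensemble is trained on a bootstrap sample, and every training example is scored only by the members whose bootstrap sample left it out. The following is a minimal standard-library Python sketch of that general idea. It is not the authors' IRFCETF algorithm, whose details the abstract does not give. The decision stumps standing in for full trees, the synthetic 190-to-10 skewed data, and the names `fit_stump` and `oob_error` are all assumptions made for illustration.

```python
import random


def fit_stump(points):
    """Pick the threshold on a 1-D feature that minimises training errors.

    points: list of (x, label) pairs with label in {0, 1}.
    The stump predicts class 1 when x > threshold.
    """
    best_threshold, best_errors = None, len(points) + 1
    for threshold in sorted({x for x, _ in points}):
        errors = sum((x > threshold) != bool(label) for x, label in points)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def oob_error(data, n_trees, seed=0):
    """Bagged stumps; each sample is scored only by stumps that never saw it."""
    rng = random.Random(seed)
    n = len(data)
    votes = [[0, 0] for _ in range(n)]  # per-sample OOB votes for class 0 / class 1
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample, with replacement
        threshold = fit_stump([data[i] for i in in_bag])
        for i in set(range(n)) - set(in_bag):  # samples left out of this bootstrap
            votes[i][int(data[i][0] > threshold)] += 1
    # Only samples that were out of bag at least once can be scored.
    scored = [(v, data[i][1]) for i, v in enumerate(votes) if sum(v) > 0]
    if not scored:
        return float("nan")
    wrong = sum((v[1] > v[0]) != bool(label) for v, label in scored)
    return wrong / len(scored)


# Hypothetical skewed data: 190 majority samples (class 0) and 10 minority samples (class 1).
data_rng = random.Random(42)
data = [(data_rng.gauss(0.0, 1.0), 0) for _ in range(190)]
data += [(data_rng.gauss(3.0, 1.0), 1) for _ in range(10)]

for n_trees in (5, 25, 50):
    print(n_trees, round(oob_error(data, n_trees), 4))
```

Comparing the printed OOB error across ensemble sizes is one simple way to use it as a tuning signal; how IRFCETF itself uses the OOB error is not specified in the abstract.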
Findings: Experimental results on the KDD data set show that the proposed algorithm improves performance. On the referenced intrusion detection data set, IRFCETF achieves a classification accuracy of 99.57% and an error rate of 0.43%.

Research limitations/implications: This work can be extended to further improve classification techniques with multiple correspondence analysis (MCA); hierarchical MCA could be explored together with classification models for a wide range of skewed data sets.

Practical implications: The improvement in metrics is measurable and helps with imbalanced, intrusion-detection-related applications in current domains such as security, AI and IIoT digitization. Analytical results show improved metrics for the proposed approach compared with other traditional machine learning algorithms. IRFCETF thus supports the claim that the error-tuning parameter has a measurable impact on classification accuracy.

Social implications: The proposed algorithm is useful in numerous IIoT applications, such as health care and machinery automation.

Originality/value: This work presents IRFCETF, an approach for improving classification metrics. The proposed method yields a test-set classification for each case together with an error reduction mechanism.
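The reported error rate is the complement of the reported accuracy (100% - 99.57% = 0.43%). On skewed data, accuracy is usually read alongside per-class measures, because a classifier that ignores the minority class can still score highly. The sketch below illustrates this with hypothetical confusion-matrix counts (10,000 flows, 100 of them attacks). The counts are invented for the example and are not taken from the paper, and the helper name `summarize` is likewise an assumption.

```python
def summarize(tp, fn, fp, tn):
    """Summary metrics from binary confusion-matrix counts.

    The positive class is the minority class (for example, attacks).
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    minority_recall = tp / (tp + fn)   # share of minority samples detected
    majority_recall = tn / (tn + fp)   # share of majority samples kept as majority
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "minority_recall": minority_recall,
        "balanced_accuracy": (minority_recall + majority_recall) / 2,
    }


# Error rate implied by the accuracy reported in the abstract (in percent).
reported_error = round(100.0 - 99.57, 2)
print("reported error rate (%):", reported_error)

# Hypothetical classifier that labels every flow as normal:
# 100 attacks are all missed, 9,900 normal flows are all correct.
always_normal = summarize(tp=0, fn=100, fp=0, tn=9900)
for name, value in always_normal.items():
    print(f"{name}: {value:.4f}")
```

In this made-up case the accuracy is 99% even though no attack is detected, which is why minority-class recall or balanced accuracy is a useful companion to the accuracy and error-rate figures.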


Source journal

International Journal of Pervasive Computing and Communications (Computer Science, Interdisciplinary Applications)

CiteScore

6.60

Self-citation rate

0.00%

Articles published