Robust password security: a genetic programming approach with imbalanced dataset handling
Authors: Nikola Andelić, Sandi Baressi Šegota, Zlatan Car
Journal: International Journal of Information Security (impact factor 2.4; JCR Q3, Computer Science, Information Systems)
Publication date: 2024-02-07
DOI: 10.1007/s10207-024-00814-2 (https://doi.org/10.1007/s10207-024-00814-2)
Citations: 0
Abstract
Determining password strength with artificial intelligence (AI) can enhance cybersecurity by providing a more robust defense against unauthorized access. AI can analyze complex patterns and trends, identifying weak passwords and potential vulnerabilities more effectively than traditional methods. This proactive approach helps users and organizations strengthen their security posture, reducing the risk of data breaches and unauthorized intrusions. In this paper, the genetic programming symbolic classifier (GPSC) was applied to a publicly available dataset to obtain a set of symbolic expressions (SEs) that classify password strength with high accuracy. Because the dataset was imbalanced between classes, various oversampling and undersampling techniques were applied. The optimal GPSC hyperparameter values were found using a random hyperparameter value search, and the algorithm was trained using fivefold cross-validation (5FCV). The obtained SEs were evaluated using accuracy, area under the receiver operating characteristic curve (AUC), precision, recall, and F1-score. The SEs obtained on the balanced dataset variations achieved high classification accuracy (0.99), and applying all SEs to the entire original imbalanced dataset achieved the same accuracy.
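The class-balancing and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain random oversampling and macro-averaged metrics; the abstract does not say which oversampling/undersampling techniques or which averaging the authors used, and the function names `random_oversample` and `macro_scores`, the toy passwords, and the three strength labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until every
    class is as large as the largest one (an assumed balancing technique)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy data: 0 = weak, 1 = medium, 2 = strong.
passwords = ["abc", "12345", "qwerty", "Pass1234", "x9$Lq!7vRz#2"]
strength = [0, 0, 0, 1, 2]
balanced_x, balanced_y = random_oversample(passwords, strength)
class_counts = Counter(balanced_y)  # every class now has three samples

scores = macro_scores([0, 0, 1, 1], [0, 1, 1, 1])
```

Oversampling should be applied only to the training folds of a cross-validation split; duplicating samples before splitting leaks copies of the same password into the validation fold and inflates the scores.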
About the journal:
The International Journal of Information Security is an English-language periodical on research in information security which offers prompt publication of important technical work, whether theoretical, applicable, or related to implementation.
Coverage includes system security: intrusion detection, secure end systems, secure operating systems, database security, security infrastructures, security evaluation; network security: Internet security, firewalls, mobile security, security agents, protocols, anti-virus and anti-hacker measures; content protection: watermarking, software protection, tamper resistant software; applications: electronic commerce, government, health, telecommunications, mobility.