Classifying user-created passwords using machine learning and natural language processing techniques

IF 7.6 | Zone 3, Computer Science | JCR Q1, Computer Science, Information Systems
Internet of Things Pub Date : 2026-03-01 Epub Date: 2025-12-13 DOI:10.1016/j.iot.2025.101854
Binh Le Thanh Thai, Tsubasa Takii, Hidema Tanaka
Internet of Things, Volume 36, Article 101854
https://www.sciencedirect.com/science/article/pii/S2542660525003683
Citations: 0

Abstract

Passwords are the dominant authentication method. However, evaluating the strength of user-created passwords remains a significant challenge because of external factors such as language, culture, and keyboard layout. In this paper, we address the problem of classifying user-created passwords into predefined groups rather than directly evaluating their strength. First, we assess classifiers built from eight machine learning (ML) algorithms and four natural language processing (NLP) techniques to identify the best combination of ML algorithm and feature extraction method. This experiment shows that the classifier combining Bag-of-Words and Logistic Regression is the most effective for classifying user-created passwords. We then propose a hierarchical classification model to improve the performance of this classifier. Experimental results demonstrate that the proposed model achieves an accuracy of 97.81% and a recall of 99.66% for weak passwords.
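As a rough illustration of the Bag-of-Words plus Logistic Regression combination named in the abstract, the following is a minimal sketch in plain Python. The abstract does not say how passwords are tokenized, which groups are used, or how the model is trained, so everything here is an assumption made for illustration: the character-bigram tokens, the two groups (weak = 1, other = 0), the helper names, the learning rate, and the tiny hand-written `TOY_DATA` list. The hierarchical model is not sketched because the abstract gives no details of it.

```python
import math
from collections import Counter

# Assumed padding markers (control characters unlikely to occur in real passwords).
START, END = "\x02", "\x03"

def char_ngrams(password, n=2):
    """Split a password into overlapping character n-grams (the 'words' of the bag)."""
    padded = START + password + END
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def build_vocab(passwords, n=2):
    """Map every n-gram seen in the training passwords to a feature index."""
    vocab = {}
    for pw in passwords:
        for gram in char_ngrams(pw, n):
            vocab.setdefault(gram, len(vocab))
    return vocab

def vectorize(password, vocab, n=2):
    """Sparse bag-of-words vector {feature index: count}; unseen n-grams are dropped."""
    counts = Counter(g for g in char_ngrams(password, n) if g in vocab)
    return {vocab[g]: c for g, c in counts.items()}

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(x, w, b):
    """Estimated probability that sparse vector x belongs to group 1."""
    return sigmoid(b + sum(w[i] * v for i, v in x.items()))

def train_logreg(rows, labels, dim, lr=0.1, epochs=200):
    """Binary logistic regression fitted with plain stochastic gradient descent."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = score(x, w, b) - y  # gradient of the log loss w.r.t. the logit
            for i, v in x.items():
                w[i] -= lr * err * v
            b -= lr * err
    return w, b

def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive items that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if tp + fn else 0.0

# Hypothetical toy data, invented for this sketch: 1 = weak group, 0 = other.
TOY_DATA = [
    ("123456", 1), ("password", 1), ("qwerty", 1),
    ("abc123", 1), ("111111", 1), ("iloveyou", 1),
    ("T7#kLq9!vZ", 0), ("x9$Gm2@pRw", 0),
    ("Vb4&nH8*sJ", 0), ("e5!Yz3^Kd0", 0),
]

passwords = [pw for pw, _ in TOY_DATA]
labels = [y for _, y in TOY_DATA]
vocab = build_vocab(passwords)
rows = [vectorize(pw, vocab) for pw in passwords]
w, b = train_logreg(rows, labels, len(vocab))

# Predictions on the training passwords only; this says nothing about generalization.
preds = [1 if score(x, w, b) >= 0.5 else 0 for x in rows]
print("training recall for the weak group:", recall(labels, preds))
```

The `recall` helper mirrors the metric the abstract reports for weak passwords: of all passwords that truly belong to the weak group, the share the classifier flags as weak. Figures measured on a toy training set like this one are not comparable to the paper's reported results.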
Source journal: Internet of Things
CiteScore: 3.60
Self-citation rate: 5.10%
Articles published: 115
Review time: 37 days
Journal description: Internet of Things; Engineering Cyber Physical Human Systems is a comprehensive journal encouraging cross-collaboration between researchers, engineers, and practitioners in the field of IoT and Cyber Physical Human Systems. The journal offers a unique platform to exchange scientific information on the entire breadth of technology, science, and societal applications of the IoT. The journal will place a high priority on timely publication, and provide a home for high quality. Furthermore, IOT is interested in publishing topical Special Issues on any aspect of IOT.