A predictive contrivance for recognising traits in keystroke dynamics

Advances in Computational Intelligence. Pub Date: 2025-05-29. DOI: 10.1007/s43674-025-00081-1

Soumen Roy, Utpal Roy, Devadatta Sinha, Rajat Kumar Pal

Volume 5, Issue 2. Journal Article. https://link.springer.com/article/10.1007/s43674-025-00081-1


Abstract

Predicting personal traits of smartphone users, particularly age group, gender, handedness, and hand(s) used, as a form of digital identity by analysing keystroke dynamics (KD) attributes is a challenging area. It nevertheless has a variety of applications in e-commerce, e-banking, e-teaching/learning, e-exams, forensics, and social networking. The main bottleneck is addressing the imbalanced nature of KD datasets with conventional machine learning (ML) approaches. KD datasets are often imbalanced from several perspectives because user traits are non-uniformly distributed and usage patterns vary. This study proposes a predictive model for both fixed-text and free-text modes, considering the effect of attached smartphone sensors. We adopt a score-level fusion of eXtreme Gradient Boosting (XGBoost) models trained on several balanced bootstrapped training samples to address the limitations of conventional approaches. In this ensemble, the class distribution in each bootstrapped training set is equally balanced, which gives more accurate and robust performance. Furthermore, we observe a positive impact from combining these prediction scores and labels with primary biometric attributes in KD-based user authentication and identification, in both static/entry-point and continuous/active security designs, a previously unanswered challenge. The predictive mechanism and its adaptation to KD-based designs, based on datasets collected through a smartphone in a web environment from a considerable number of volunteers of diverse age groups, genders, professions, and education levels, demonstrate the novelty of our approach.
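
The balanced-bootstrap and score-level fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the base learner here is a simple nearest-centroid scorer standing in for XGBoost so that the example needs only the Python standard library, the two features (mean hold time and mean flight time, in milliseconds) and all data values are made up, and the function names are hypothetical. It assumes a binary trait with labels 0 and 1, and it balances each bootstrap sample by drawing the minority-class count from every class, which is one possible balancing rule among several.

```python
import random
from collections import defaultdict
from math import dist
from statistics import mean


def balanced_bootstrap(X, y, rng):
    """Draw one bootstrap sample in which every class has the same count
    (assumption: the minority-class size is used for each class)."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    n_per_class = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for _ in range(n_per_class):
            Xb.append(rng.choice(rows))  # sampling with replacement
            yb.append(label)
    return Xb, yb


class CentroidScorer:
    """Placeholder base learner (NOT XGBoost): scores class 1 by the
    relative distance of a sample to the two class centroids."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def score(self, x):
        d0 = dist(x, self.centroids[0])
        d1 = dist(x, self.centroids[1])
        total = d0 + d1
        # Closer to the class-1 centroid gives a score nearer to 1.
        return d0 / total if total > 0 else 0.5


def fit_fused_ensemble(X, y, n_models=10, seed=0):
    """Train one base learner per balanced bootstrap sample."""
    rng = random.Random(seed)
    return [CentroidScorer().fit(*balanced_bootstrap(X, y, rng))
            for _ in range(n_models)]


def fused_score(models, x):
    """Score-level fusion: average the per-model scores."""
    return mean(m.score(x) for m in models)


# Toy, made-up data: [mean hold time (ms), mean flight time (ms)].
# Class 0 is the majority class, class 1 the minority class.
X = [[100, 200], [104, 195], [98, 207], [101, 199],
     [96, 203], [103, 204], [150, 300], [146, 305]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

models = fit_fused_ensemble(X, y, n_models=5)
print(fused_score(models, [148, 298]))
```

Averaging the scores is only one fusion rule; a weighted sum or a product rule would fit the same structure. In practice the placeholder scorer would be replaced by a gradient-boosted classifier that outputs class probabilities.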
