Multi-datasets for different keyboard key sound recognition

Impact factor: 1.0 | JCR quartile: Q3 | Category: Multidisciplinary Sciences

Data in Brief | Publication date: 2024-09-19 | DOI: 10.1016/j.dib.2024.110949

Karwan M. Hama Rawf, Ayub O. Abdulrahman, Hana O. Kamel, Lawen M. Hassan, Ahmad O. Ali

Data in Brief, volume 57, Article 110949. Full text: https://www.sciencedirect.com/science/article/pii/S2352340924009090

Citations: 0

Abstract

Keyboard acoustic recognition is a pivotal area within cybersecurity and human-computer interaction, where the identification and analysis of keyboard sounds are used to enhance security measures. The performance of acoustic-based security systems can be influenced by factors such as the platform used, typing style, and environmental noise. To address these variations and provide a comprehensive resource, we present the Multi-Keyboard Acoustic (MKA) Datasets. These extensive datasets, meticulously gathered by a team in the Computer Science Department at the University of Halabja, include recordings from six widely-used platforms: HP, Lenovo, MSI, Mac, Messenger, and Zoom. The MKA datasets have structured data for each platform, including raw recordings, segmented sound files, and matrices derived from these sounds. They can be used by researchers in keylogging detection, cybersecurity, and other fields related to acoustic emanation attacks on keyboards. Moreover, the datasets capture the intricacies of typing behaviour with both hands and all ten fingers by carefully segmenting and pre-processing the data using the Praat tool, thus ensuring high-quality and dependable data. This comprehensive approach allows researchers to explore various aspects of keyboard sound recognition, contributing to the development of robust recognition algorithms and enhanced security measures. The MKA Datasets stand as one of the largest and most detailed datasets in this domain, offering significant potential for advancing research and improving defences against acoustic-based threats.
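The abstract describes three layers of data per platform: raw recordings, segmented sound files, and matrices derived from those sounds. As a rough illustration of how one layer might lead to the next, the sketch below cuts a recording into one fixed-length segment per key press using a simple short-time energy threshold, then treats the equal-length segments as rows of a matrix. This is not the authors' Praat workflow. The function names, the synthetic test signal, and the values of `frame_ms`, `seg_ms`, and `ratio` are all assumptions chosen for illustration, and the code uses only the Python standard library.

```python
import math
import random


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def segment_keystrokes(samples, sample_rate, frame_ms=10, seg_ms=100, ratio=0.2):
    """Cut one fixed-length segment per detected burst of energy.

    frame_ms, seg_ms and ratio are illustrative defaults, not values
    taken from the MKA paper.
    """
    frame_len = sample_rate * frame_ms // 1000
    seg_len = sample_rate * seg_ms // 1000
    energies = frame_energies(samples, frame_len)
    threshold = ratio * max(energies)
    segments = []
    next_free = 0  # first sample index not covered by the previous segment
    for i, energy in enumerate(energies):
        start = i * frame_len
        if energy >= threshold and start >= next_free:
            segments.append(samples[start:start + seg_len])
            next_free = start + seg_len
    return segments


def synthetic_recording(sample_rate=8000, press_ms=(200, 600, 1000), duration_ms=1400):
    """Low-level noise plus decaying tone bursts standing in for key presses."""
    rng = random.Random(0)
    n_samples = sample_rate * duration_ms // 1000
    samples = [rng.uniform(-0.01, 0.01) for _ in range(n_samples)]
    burst_len = sample_rate * 50 // 1000  # each burst lasts 50 ms
    for ms in press_ms:
        start = sample_rate * ms // 1000
        for k in range(burst_len):
            tone = math.sin(2 * math.pi * 1000 * k / sample_rate)
            samples[start + k] += math.exp(-k / 100) * tone
    return samples


recording = synthetic_recording()
segments = segment_keystrokes(recording, sample_rate=8000)
# Equal-length segments form a (keystrokes x samples) matrix, one row per press.
matrix = segments
print(len(matrix), "segments of", len(matrix[0]), "samples each")
```

A fixed relative threshold like this is fragile with the environmental noise and platform differences the abstract mentions, so a real pipeline would likely need a more careful onset detector or manual correction.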




Source journal

Data in Brief (Multidisciplinary Sciences)

CiteScore: 3.10

Self-citation rate: 0.00%

Articles published: 996

Review time: 70 days

About the journal: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.