Amjad Hussain, Ayesha Saadia, Faeiz M. Alserhani
Egyptian Informatics Journal, Volume 30, Article 100645. Published 2025-05-08. DOI: 10.1016/j.eij.2025.100645. Source: https://www.sciencedirect.com/science/article/pii/S1110866525000386
Ransomware detection and family classification using fine-tuned BERT and RoBERTa models
Integrating Internet of Things (IoT) technologies into healthcare has transformed patient care by enabling real-time monitoring, predictive analytics, and personalized treatment. It also raises significant challenges that must be addressed to ensure secure and reliable deployment. Healthcare IoT devices, such as remote patient monitors, are often constrained by limited computational power, which leaves them vulnerable to sophisticated cyberattacks, including ransomware. In 2017, the WannaCry ransomware attack disrupted many National Health Service facilities in the United Kingdom and underscored the need for robust cybersecurity measures. The lack of standardization across IoT devices creates interoperability issues and complicates data transfer between medical devices and healthcare systems. This research explores these challenges and proposes a novel approach based on two hyperparameter-optimized transfer learning models, Bidirectional Encoder Representations from Transformers (BERT) and the Robustly Optimized BERT Approach (RoBERTa), that both detects ransomware targeting IoT devices and classifies it by family, using API call sequences captured during dynamic execution in a sandbox environment. A total of 3300 samples, covering 10 ransomware families and including 300 benign cases, were analyzed dynamically in the sandbox. The resulting dataset was then preprocessed and used to train the BERT and RoBERTa models. In ransomware family classification, BERT achieved 95.60% accuracy with a loss of 0.1650, while RoBERTa achieved 94.39% accuracy with a loss of 0.1948. These results indicate that the proposed approach can classify previously unidentified behavioral patterns in ransomware and improves the ability to respond to newly emerging threats. By combining dynamic analysis of suitably formatted API call sequences with hyperparameter-optimized, transfer learning-based transformer models, the methodology captures behavioral patterns unique to ransomware. The research provides a scalable framework for integrating advanced detection mechanisms into real-world healthcare IoT systems, enhancing their resilience against cyber threats.
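The abstract does not spell out how the API call sequences are formatted before training, so the following is only a minimal sketch of one plausible preprocessing step, not the authors' pipeline. It assumes each sandbox trace is an ordered list of API call names, collapses consecutive repeats, truncates long traces, and joins the calls into a whitespace-separated string that a text tokenizer could consume. The sample traces, the labels `family_a` and `family_b`, the helper names, and the `max_calls` limit are all hypothetical.

```python
# Hypothetical sandbox traces: (class label, ordered list of API call names).
# The labels and sequences are illustrative placeholders, not data from the paper.
TRACES = [
    ("benign", ["CreateFileW", "ReadFile", "ReadFile", "CloseHandle"]),
    ("family_a", ["CryptAcquireContextW", "CryptGenKey", "CreateFileW",
                  "WriteFile", "WriteFile", "CloseHandle"]),
    ("family_b", ["FindFirstFileW", "CreateFileW", "CryptEncrypt",
                  "WriteFile", "DeleteFileW"]),
]


def to_text(api_calls, max_calls=128):
    """Turn an API call sequence into a single whitespace-separated string.

    Assumed preprocessing choices (not confirmed by the source):
    consecutive duplicate calls are collapsed, then the sequence is
    truncated to at most `max_calls` entries.
    """
    deduped = []
    for call in api_calls:
        if not deduped or deduped[-1] != call:
            deduped.append(call)
    return " ".join(deduped[:max_calls])


def build_label_map(labels):
    """Map each distinct class name to an integer id, in sorted order."""
    return {name: idx for idx, name in enumerate(sorted(set(labels)))}


def prepare(traces, max_calls=128):
    """Return (texts, label_ids, label_map) ready for a text classifier."""
    label_map = build_label_map(label for label, _ in traces)
    texts = [to_text(calls, max_calls) for _, calls in traces]
    label_ids = [label_map[label] for label, _ in traces]
    return texts, label_ids, label_map


if __name__ == "__main__":
    texts, label_ids, label_map = prepare(TRACES)
    for text, label_id in zip(texts, label_ids):
        print(label_id, text)
```

Under these assumptions, the `texts` and `label_ids` lists could then be passed to a BERT or RoBERTa tokenizer and a sequence classification head for fine-tuning. The paper's dataset would yield 11 classes (10 families plus benign) instead of the 3 used here.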
About the journal:
The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The journal provides a forum for state-of-the-art research and development in the fields of computing, including computer science, information technology, information systems, operations research, and decision support. It encourages the submission of innovative, previously unpublished work in the subjects it covers, whether from academic, research, or commercial sources.