{"title":"From Data to Deployment: A Comprehensive Analysis of Risks in Large Language Model Research and Development","authors":"Tianshu Zhang, Ruidan Su, Anli Zhong, Minwei Fang, Yu-dong Zhang","doi":"10.1049/ise2/7358963","DOIUrl":"https://doi.org/10.1049/ise2/7358963","url":null,"abstract":"<div><p>Large language models (LLMs) have evolved significantly, achieving unprecedented linguistic capabilities that underpin a wide range of AI applications. However, they also pose risks and challenges such as ethical concerns, bias and computational sustainability. Balancing their high performance in revolutionising information processing against the risks they pose is critical to their future development. An LLM is a type of NLP model, and many LLM risks are ones that NLP has encountered in the past. We therefore summarise these risks, focusing on the underlying understanding of the risks and the associated technical tools rather than simply describing their occurrence in LLMs. In this paper, we first discuss and compare the current state of research on risks in the four main phases of developing LLMs (data, system, pretraining and inference) and then summarise the rationale, complexity, prospects and challenges of the key issues in each phase. Finally, this review concludes with a discussion of the fundamental issues that are of most concern and that should be addressed in the early stages of modelling research, including the correlated issues of privacy preservation, countering attacks and model robustness. 
Taking the perspective of the LLM research and development (R&D) process, this review summarises the actual risks and provides guidance for research directions, with the aim of helping researchers to identify the risk points and technology directions worth investigating and to establish a safe and efficient R&D process.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/7358963","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144367215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generic Construction of Dual-Server Public Key Authenticated Encryption With Keyword Search","authors":"Keita Emura","doi":"10.1049/ise2/6610587","DOIUrl":"https://doi.org/10.1049/ise2/6610587","url":null,"abstract":"<div><p>In this paper, we propose a generic construction of dual-server public key authenticated encryption with keyword search (DS-PAEKS) from PAEKS, public key encryption, and signatures. We also show that a previous DS-PAEKS scheme is vulnerable by providing a concrete attack. That is, the proposed generic construction yields the first DS-PAEKS schemes. With a slight modification, our attack also works against previous dual-server public key encryption with keyword search (DS-PEKS) schemes.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/6610587","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144281579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Graph Representation Learning-Based Method for Event Prediction","authors":"Xi Zeng, Guangchun Luo, Ke Qin, Pengyi Zheng","doi":"10.1049/ise2/9706647","DOIUrl":"https://doi.org/10.1049/ise2/9706647","url":null,"abstract":"<div><p>With the continuous advancement of big data and artificial intelligence technologies, event prediction is increasingly being utilized across a multitude of domains. Predicting events allows for the exploration of the developmental trajectories and summarization of patterns associated with these events. However, events typically encompass a myriad of elements and intricate relationships, so the precision of event prediction needs to be improved. Moreover, existing methods suffer from poor data quality, insufficient feature information, limited generalization capability of the models, and difficulties in evaluating prediction errors. This paper proposes a novel event prediction method based on graph representation learning, aiming to improve the accuracy of event prediction while reducing the time cost. By constructing causal graphs and introducing the script event simulation method, the architecture combines graph neural networks (GNNs) with BERT to simplify the event prediction process. Additionally, by combining GNNs with pretrained language models, a dynamic graph representation learning method is proposed. This means that a unified graph representation learning model can be built by following specific rules, thus predicting the development trajectory of events more accurately. The study evaluates the effectiveness of dynamic graph representation learning technology in a specific scenario, namely employee career choices. By converting the career graph of employees into low-dimensional representations, the effectiveness of the dynamic graph representation learning method in predicting employee career decisions is validated. 
This innovation not only improves the accuracy of event prediction but also helps better understand and respond to complex event relationships in practical applications, providing decision-makers with more powerful information support. Therefore, this research has important theoretical and practical significance, providing valuable references for future studies in related fields.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/9706647","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144237315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature Graph Construction With Static Features for Malware Detection","authors":"Binghui Zou, Chunjie Cao, Longjuan Wang, Yinan Cheng, Chenxi Dang, Ying Liu, Jingzhang Sun","doi":"10.1049/ise2/6687383","DOIUrl":"https://doi.org/10.1049/ise2/6687383","url":null,"abstract":"<div><p>Malware can greatly compromise the integrity and trustworthiness of information and is in a constant state of evolution. Existing feature fusion-based detection methods generally overlook the correlation between features. Moreover, mere concatenation of features reduces the model’s characterization ability, leading to low detection accuracy. These methods are also susceptible to concept drift and significant degradation of the model. To address these challenges, we introduce a feature graph-based malware detection method, malware feature graph (MFGraph), which characterizes applications by learning feature-to-feature relationships to achieve improved detection accuracy while mitigating the impact of concept drift. In MFGraph, we construct a feature graph using static features extracted from binary PE files, then apply a deep graph convolutional network to learn the representation of the feature graph. Finally, we employ the representation vectors obtained from the output of a three-layer perceptron to differentiate between benign and malicious software. We evaluated our method on the EMBER dataset, and the experimental results demonstrate that it achieves an AUC score of 0.98756 on the malware detection task, outperforming other baseline models. 
Furthermore, the AUC score of MFGraph decreases by only 5.884% over 1 year, indicating that it is the least affected by concept drift.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/6687383","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144171378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Method for Constructing Integral-Resistance Matrix for 5-Round AES","authors":"Fanyang Zeng, Tian Tian","doi":"10.1049/ise2/3447652","DOIUrl":"https://doi.org/10.1049/ise2/3447652","url":null,"abstract":"<div><p>A powerful theory for evaluating block ciphers against integral distinguishers was introduced by Hebborn et al. at ASIACRYPT 2021. To show the integral-resistance property for a block cipher, their core idea is to construct a full-rank integral-resistance matrix. However, their method does not work practically for 5-round AES due to the large S-box and complex linear layer. In this paper, we are concerned with the integral-resistance property of 5-round AES. By carefully investigating the S-box and the linear layer of AES, some significant properties about the propagation of the division property on the round function of AES are derived. In particular, with these properties, it is easy to determine the appearance of all maximum-degree monomials after 5-round AES encryption on a properly chosen set of key-patterns. Consequently, a full-rank integral-resistance matrix is formed to show that there is no integral distinguisher for five or more rounds of AES under the assumption of independent round keys. Since it is well known that there is a 4-round integral distinguisher for AES, our result is tight for AES. 
As far as we know, this is the first proof of the integral-resistance property of 5-round AES.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/3447652","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144085419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Topic Words-Based Multilingual Hateful Linguistic Resources Construction for Developing Multilingual Hateful Content Detection Model Using Deep Learning Technique","authors":"Naol Bakala Defersha, Kula Kekeba Tune, Solomon Teferra Abate","doi":"10.1049/ise2/6068177","DOIUrl":"https://doi.org/10.1049/ise2/6068177","url":null,"abstract":"<div><p>Nowadays, social media platforms provide space that allows communication and sharing of various resources using a variety of natural languages in different cultural and multilingual contexts. Although this interconnectedness offers numerous benefits, it also exposes users to the risk of encountering offensive (OFFN) and harmful content, including hateful speech. Lexicons, word embeddings, topic modeling, and transformer language models have been applied to build models for detecting hateful content in resource-rich languages. Low-resource languages, including Ethiopian languages, suffer from a lack of such linguistic resources. Multilingual hateful content detection poses complex challenges due to cultural and linguistic variety. The paper proposes a multilingual hateful content identification model using a transformer language model and hybrid lexicon techniques to enhance hateful content recognition in low-resource Ethiopian languages. First, hateful content disseminated on Facebook in the target Ethiopian languages was categorized as insult, identity hate, antagonistic, or threat using topic modeling techniques. Then, we compiled different hateful terms from sources such as guidelines and proclamations related to the Ethiopian context. We created Ethiopian context-based transformer language models. We utilized topic words-based datasets to construct pretrained transformer language models and multilingual lexicons of major Ethiopian languages. Finally, their performance was compared by integrating them into a deep learning-based hateful content detection framework for low-resource Ethiopian languages. 
Among the deep learning algorithms applied with Ethiopian linguistic resources, word2vec-based multilingual lexicons combined with a convolutional neural network (CNN) outperformed the others. The results indicate that topic words-based multilingual word2vec lexicons outperformed transformer language models built on topic modeling for low-resource Ethiopian languages, yielding a promising hate speech (HATE) detection approach for these languages.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/6068177","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143818410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Review on Integrating IoT, IIoT, and Industry 4.0: A Pathway to Smart Manufacturing and Digital Transformation","authors":"Fujun Qiu, Ashwini Kumar, Jiang Hu, Poorva Sharma, Yu Bing Tang, Yang Xu Xiang, Jie Hong","doi":"10.1049/ise2/9275962","DOIUrl":"https://doi.org/10.1049/ise2/9275962","url":null,"abstract":"<div><p>The industrial Internet of Things (IIoT) has become an innovative technology that has brought many benefits to industries and organizations. This review presents a comprehensive analysis of IIoT’s applications, highlighting its ability to optimize industrial operations through advanced connectivity, real-time data exchange, and automation, as well as its importance in the context of Industry 4.0. Emphasizing the distinction between IIoT and traditional IoT, the paper explores how IIoT focuses on enhancing industrial ecosystems and integrating cyber-physical systems (CPSs). This article explains how to establish a highly linked infrastructure to support cutting-edge services and ensure greater flexibility and efficiency. It emphasizes the role of the CPS and industrial automation and control systems (IACSs) in realizing the potential of IIoT. Security concerns, an important part of IIoT, are addressed through discussions of protecting networked systems, assuring operational reliability, and the need for strong security measures to prevent potential threats and vulnerabilities. Furthermore, critical technologies such as machine learning (ML), artificial intelligence (AI), and various communication protocols, including fifth generation (5G) and message queuing telemetry transport (MQTT), are investigated for their potential to improve system performance and decision-making processes. The article also discusses the safety precautions and challenges of using IIoT. 
Finally, the article emphasizes the importance of addressing security issues in promoting the successful adoption of the IIoT and achieving its expected benefits. This study offers valuable resources for researchers, academics, and decision-makers to implement IIoT in industrial environments.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/9275962","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143707395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Pattern Matching on Encrypted Data With Forward and Backward Security","authors":"Xiaolu Chu, Ke Cheng, Anxiao Song, Jiaxuan Fu","doi":"10.1049/ise2/5523834","DOIUrl":"https://doi.org/10.1049/ise2/5523834","url":null,"abstract":"<div><p>Pattern matching is widely used in applications such as genomic data query analysis, network intrusion detection, and deep packet inspection (DPI). Performing pattern matching on plaintext data is straightforward, but the need to protect the security of analyzed data and analyzed patterns can significantly complicate the process. Because of these privacy and security concerns for data and patterns, researchers have begun to explore pattern matching on encrypted data. However, existing solutions are typically built on static pattern matching methods and lack dynamism, namely, the ability to perform addition or deletion operations on the analyzed data. This lack of flexibility might hinder the adaptability and effectiveness of pattern matching on encrypted data in real-world scenarios. In this paper, we design a dynamic pattern matching scheme on encrypted data with forward and backward security, which introduces much-needed dynamism. Our scheme is able to implement the addition operation and the deletion operation on the encrypted data without affecting the security of the original pattern matching scheme. Specifically, we design secure addition and deletion algorithms based on fragmentation data structures, which are compatible with the static pattern matching scheme. Moreover, we make significant improvements to the key generation algorithm, the encryption algorithm, and the match algorithm of the static scheme to ensure forward and backward security. Theoretical analysis proves that our scheme satisfies forward and backward security while ensuring the nonfalsifiability of encrypted data. 
The experimental results show that our scheme incurs a slight increase in time cost compared to the static pattern matching scheme, demonstrating its practicality and effectiveness in dynamic scenarios.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/5523834","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143594923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BF-ACS—Intelligent and Immutable Face Recognition Access Control System","authors":"Wen-Bin Hsieh","doi":"10.1049/ise2/6755170","DOIUrl":"https://doi.org/10.1049/ise2/6755170","url":null,"abstract":"<div><p>Biometric authentication has been adopted in many access control scenarios in recent years. It is very convenient and secure since it compares the user’s own biometrics with those stored in the database to confirm their identity. With the vigorous development of machine learning, the performance and accuracy of biometric authentication have been greatly improved. Face recognition technology combined with a convolutional neural network (CNN) is extremely efficient and has become the mainstream of access control systems (ACSs). However, identity information and access logs stored in traditional databases can be tampered with by malicious insiders. Therefore, we propose a face recognition ACS that is resistant to data forgery. In this paper, a deep convolutional network is utilized to learn a Euclidean embedding (based on FaceNet) of each image and achieve face recognition and verification. Quorum, which is built on the Ethereum blockchain, is used to store facial feature vectors and login information. Smart contracts automatically put data into blocks on the chain: one stores feature vectors, and the other records the arrival and departure times of employees. By combining these cutting-edge technologies, an intelligent and immutable ACS that can withstand distributed denial-of-service (DDoS) and other internal and external attacks is created. 
Finally, an experiment is conducted to assess the effectiveness of the proposed system and demonstrate its practicality.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/6755170","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143533566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Homomorphic Proxy Re-Encryption to Enhance Security and Privacy of Federated Learning-Based Intelligent Connected Vehicles","authors":"Yang Bai, Yutang Rao, Hongyan Wu, Juan Wang, Wentao Yang, Gaojie Xing, Jiawei Yang, Xiaoshu Yuan","doi":"10.1049/ise2/4632786","DOIUrl":"https://doi.org/10.1049/ise2/4632786","url":null,"abstract":"<div><p>Intelligent connected vehicles (ICVs) are a fast-growing direction that plays a significant role in the area of autonomous driving. To realize collaborative computation among ICVs, federated learning (FL) and federated-based large language models (FedLLMs) have been used as promising distributed approaches to support various collaborative application computations in ICV scenarios, for example, analyzing vehicle driving information to realize trajectory prediction, voice-activated controls, and conversational AI assistants. Unfortunately, recent research reveals that FL systems still face privacy challenges from an honest-but-curious server, honest-but-curious distributed participants, or collusion between participants and the server. These threats can lead to the leakage of sensitive private data, such as location information and driving conditions. Homomorphic encryption (HE) is a typical mitigation that has little effect on model accuracy and has been studied before. However, single-key HE cannot resist collusion between participants and the server, and multikey HE is not suitable for ICV scenarios. In this work, we propose a novel approach that combines FL with homomorphic proxy re-encryption (PRE) based on participants’ ID information. By doing so, FL-based ICVs can successfully defend against privacy threats. In addition, we analyze the security and performance of our method, and the theoretical analysis and the experimental results show that our defense framework with ID-based homomorphic PRE can achieve a high security level and efficient computation. 
We anticipate that our approach can serve as a foundation for further research on privacy-preserving FedLLMs.</p></div>","PeriodicalId":50380,"journal":{"name":"IET Information Security","volume":"2025 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ise2/4632786","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143533565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}