{"title":"Bayesian network structure learning by dynamic programming algorithm based on node block sequence constraints","authors":"Chuchao He, Ruohai Di, Bo Li, Evgeny Neretin","doi":"10.1049/cit2.12363","DOIUrl":"https://doi.org/10.1049/cit2.12363","url":null,"abstract":"<p>The use of dynamic programming (DP) algorithms to learn Bayesian network structures is limited by their high space complexity and the difficulty of learning the structure of large-scale networks. Therefore, this study proposes a DP algorithm based on node block sequence constraints. The proposed algorithm constrains the traversal of the parent graph using the M-sequence matrix and prunes the traversal of the order graph using the node block sequence, considerably reducing time consumption and space complexity. Experimental results show that, compared with existing DP algorithms, the proposed algorithm obtains learning results more efficiently with less than 1% loss of accuracy and can be used to learn larger-scale networks.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 6","pages":"1605-1622"},"PeriodicalIF":8.4,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12363","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143252980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BAHGRF3: Human gait recognition in the indoor environment using deep learning features fusion assisted framework and posterior probability moth flame optimisation","authors":"Muhammad Abrar Ahmad Khan, Muhammad Attique Khan, Ateeq Ur Rehman, Ahmed Ibrahim Alzahrani, Nasser Alalwan, Deepak Gupta, Saima Ahmed Rahin, Yudong Zhang","doi":"10.1049/cit2.12368","DOIUrl":"https://doi.org/10.1049/cit2.12368","url":null,"abstract":"<p>Biometric characteristics have played a vital role in security over the last few years. Human gait classification in video sequences is an important biometric attribute and is used for security purposes. A new framework for human gait classification in video sequences is proposed, using deep learning (DL) feature fusion and posterior probability-based moth flame optimisation (MFO). In the first step, the video frames are resized and two pre-trained lightweight DL models, EfficientNetB0 and MobileNetV2, are fine-tuned. Both models are selected for their top-5 accuracy and small number of parameters. Later, both models are trained through deep transfer learning and the extracted deep features are fused using a voting scheme. In the last step, the authors develop a posterior probability-based MFO feature selection algorithm to select the best features. The selected features are classified using several supervised learning methods. The publicly available CASIA-B dataset has been employed for the experimental process. On this dataset, the authors selected six angles (0°, 18°, 90°, 108°, 162°, and 180°) and obtained average accuracies of 96.9%, 95.7%, 86.8%, 90.0%, 95.1%, and 99.7%, respectively. The results show an improvement in accuracy and a significant reduction in computational time compared with recent state-of-the-art techniques.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 2","pages":"387-401"},"PeriodicalIF":8.4,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12368","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143856862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyperspectral image restoration using noise gradient and dual priors under mixed noise conditions","authors":"Hazique Aetesam, Suman Kumar Maji, V. B. Surya Prasath","doi":"10.1049/cit2.12355","DOIUrl":"https://doi.org/10.1049/cit2.12355","url":null,"abstract":"<p>Images obtained from hyperspectral sensors provide information about the target area that extends beyond the visible portions of the electromagnetic spectrum. However, due to sensor limitations and imperfections during the image acquisition and transmission phases, noise is introduced into the acquired image, which can have a negative impact on downstream analyses such as classification, target tracking, and spectral unmixing. Noise in hyperspectral images (HSI) is modelled as a combination of several sources, including Gaussian/impulse noise, stripes, and deadlines. An HSI restoration method for such a mixed noise model is proposed. <i>First</i>, a joint optimisation framework is proposed for recovering hyperspectral data corrupted by mixed Gaussian-impulse noise by estimating both the clean data and the sparse/impulse noise levels. <i>Second</i>, a hyper-Laplacian prior is used along both the spatial and spectral dimensions to express sparsity in clean image gradients. <i>Third</i>, to model the sparse nature of impulse noise, an <i>ℓ</i><sub>1</sub>-norm over the impulse noise gradient is used. Because the proposed methodology employs two distinct priors, the authors refer to it as the hyperspectral dual prior <i>(HySpDualP)</i> denoiser. To the best of the authors' knowledge, this joint optimisation framework is the first attempt in this direction. To handle the non-smooth and non-convex nature of the general <i>ℓ</i><sub><i>p</i></sub>-norm-based regularisation term, a generalised shrinkage/thresholding (GST) solver is employed. <i>Finally</i>, an efficient split-Bregman approach is used to solve the resulting optimisation problem. Experimental results on synthetic data and a real HSI datacube obtained from hyperspectral sensors demonstrate that the authors’ proposed model outperforms state-of-the-art methods, both visually and in terms of various image quality assessment metrics.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 1","pages":"72-93"},"PeriodicalIF":8.4,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12355","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explore human parsing modality for action recognition","authors":"Jinfu Liu, Runwei Ding, Yuhang Wen, Nan Dai, Fanyang Meng, Fang-Lue Zhang, Shen Zhao, Mengyuan Liu","doi":"10.1049/cit2.12366","DOIUrl":"https://doi.org/10.1049/cit2.12366","url":null,"abstract":"<p>Multimodal action recognition methods have achieved great success using the pose and RGB modalities. However, skeleton sequences lack appearance depiction and RGB images suffer from irrelevant noise due to modality limitations. To address this, the authors introduce the human parsing feature map as a novel modality, since it can selectively retain effective semantic features of the body parts while filtering out most irrelevant noise. The authors propose a new dual-branch framework called ensemble human parsing and pose network (EPP-Net), which is the first to leverage both skeleton and human parsing modalities for action recognition. The first branch, the human pose branch, feeds robust skeletons into a graph convolutional network to model pose features, while the second branch, the human parsing branch, leverages depictive parsing feature maps to model parsing features via convolutional backbones. The two high-level features are combined through a late fusion strategy for better action recognition. Extensive experiments on the NTU RGB+D and NTU RGB+D 120 benchmarks consistently verify the effectiveness of our proposed EPP-Net, which outperforms the existing action recognition methods. Our code is available at https://github.com/liujf69/EPP-Net-Action.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 6","pages":"1623-1633"},"PeriodicalIF":8.4,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12366","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143252664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guest Editorial: Knowledge-based deep learning system in bio-medicine","authors":"Yu-Dong Zhang, Juan Manuel Górriz","doi":"10.1049/cit2.12364","DOIUrl":"https://doi.org/10.1049/cit2.12364","url":null,"abstract":"<p>Numerous healthcare procedures can be viewed as medical sector decisions. In the modern era, computers have become indispensable in the realm of medical decision-making. However, the common view of computers in the medical field typically extends only to applications that support doctors in diagnosing diseases. To more tightly intertwine computers with the biomedical sciences, professionals are now more frequently utilising knowledge-driven deep learning systems (KDLS) and their foundational technologies, especially in the domain of neuroimaging (NI).</p><p>Data for medical purposes can be sourced from a variety of imaging techniques, including but not limited to Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Ultrasound, Single Photon Emission Computed Tomography (SPECT), Positron Emission Tomography (PET), Magnetic Particle Imaging (MPI), Electroencephalography (EEG), Magnetoencephalography (MEG), Optical Microscopy and Tomography, Photoacoustic Tomography, Electron Tomography, and Atomic Force Microscopy.</p><p>Historically, these imaging techniques have been analysed using traditional statistical methods, such as hypothesis testing or Bayesian inference, which often presuppose certain conditions that are not always met. 
An emerging solution is the implementation of machine learning (ML) within the context of KDLS, allowing for the empirical mapping of complex, multi-dimensional relationships within data sets.</p><p>The objective of this special issue is to showcase the latest advancements in the methodology of KDLS for evaluating functional connectivity, neurological disorders, and clinical neuroscience, such as conditions like Alzheimer's, Parkinson's, cerebrovascular accidents, brain tumours, epilepsy, multiple sclerosis, ALS, Autism Spectrum Disorder, and more. Additionally, the special issue seeks to elucidate the mechanisms behind the predictive capabilities of ML methods within KDLS for brain-related diseases and disorders.</p><p>We received an abundance of submissions, totalling more than 40, from over 10 countries. After a meticulous and rigorous peer review process, which employed a double-blind methodology, we ultimately selected eight outstanding papers for publication. This process ensured the highest standards of quality and impartiality in the selection.</p><p>In the article ‘A deep learning fusion model for accurate classification of brain tumours in Magnetic Resonance images’, Zebari et al. created a robust deep learning (DL) fusion model for accurate brain tumour classification. To enhance performance, they employed data augmentation to expand the training dataset. The model leveraged VGG16, ResNet50, and convolutional deep belief networks to extract features from MRI images using a softmax classifier. By fusing features from two DL models, the fusion model notably boosted classification precision. 
Tested with a publicly available dataset, it achieved a remarkable 98.98% accuracy rate, outperforming existing me","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 4","pages":"787-789"},"PeriodicalIF":8.4,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12364","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142007148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DRRN: Differential rectification & refinement network for ischemic infarct segmentation","authors":"Wenxue Zhou, Wenming Yang, Qingmin Liao","doi":"10.1049/cit2.12350","DOIUrl":"https://doi.org/10.1049/cit2.12350","url":null,"abstract":"<p>Accurate segmentation of infarct tissue in ischemic stroke is essential to determine the extent of injury, assess the risk, and choose the optimal treatment for this life-threatening disease. With the prior knowledge that asymmetric analysis of anatomical structures can provide discriminative information, plenty of symmetry-based approaches have emerged to detect abnormalities in brain images. However, the inevitable non-pathological noise has not been fully alleviated, leading to unsatisfactory results. A novel differential rectification and refinement network (DRRN) for the automatic segmentation of ischemic strokes is proposed. Specifically, a differential feature perception encoder (DFPE) is developed to fully exploit and propagate the bilateral quasi-symmetry of healthy brains. In DFPE, an erasure-rectification (ER) module is devised to rectify pseudo-lesion features caused by non-pathological noise by utilising discriminant features within the symmetric neighbourhood of the original image. A differential-attention (DA) mechanism is also integrated to fully perceive the differences in cross-axial features and estimate the similarity of long-range spatial context information. In addition, a crisscross differential feature reinforce module embedded with multiple boundary enhancement attention modules is designed to effectively integrate multi-scale features and refine textural details and margins of the infarct area. Experimental results on the public ATLAS and Kaggle datasets demonstrate the effectiveness of DRRN over state-of-the-art methods.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 6","pages":"1534-1547"},"PeriodicalIF":8.4,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12350","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141807294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A dual benchmarking study of facial forgery and facial forensics","authors":"Minh Tam Pham, Thanh Trung Huynh, Thanh Tam Nguyen, Thanh Toan Nguyen, Thanh Thi Nguyen, Jun Jo, Hongzhi Yin, Quoc Viet Hung Nguyen","doi":"10.1049/cit2.12362","DOIUrl":"https://doi.org/10.1049/cit2.12362","url":null,"abstract":"<p>In recent years, visual facial forgery has reached such a level of sophistication that humans cannot identify the fraud, which poses a significant threat to information security. A wide range of malicious applications have emerged, such as deepfakes, fake news, defamation or blackmailing of celebrities, impersonation of politicians in political warfare, and the spreading of rumours to attract views. As a result, a rich body of visual forensic techniques has been proposed in an attempt to stop this dangerous trend. However, there is no comprehensive, fair, and unified performance evaluation to enlighten the community on the best-performing methods. The authors present a systematic benchmark beyond traditional surveys that provides in-depth insights into facial forgery and facial forensics, grounded in robustness tests such as contrast, brightness, noise, resolution, missing information, and compression. The authors also provide a practical guideline based on the benchmarking results, to determine the characteristics of the methods that serve as a comparative reference in this never-ending war between measures and countermeasures. The authors’ source code is open to the public.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 6","pages":"1377-1397"},"PeriodicalIF":8.4,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12362","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141674328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Norm-based zeroing neural dynamics for time-variant non-linear equations","authors":"Linyan Dai, Hanyi Xu, Yinyan Zhang, Bolin Liao","doi":"10.1049/cit2.12360","DOIUrl":"https://doi.org/10.1049/cit2.12360","url":null,"abstract":"<p>The zeroing neural dynamics (ZND) model is widely deployed for time-variant non-linear equations (TVNE). Various element-wise non-linear activation functions and integration operations are investigated to enhance the convergence performance and robustness in most proposed ZND models for solving TVNE, leading to high hardware implementation cost and model complexity. To overcome these problems, the authors develop a new norm-based ZND (NBZND) model with strong robustness for solving TVNE, not applying element-wise non-linear activation functions but introducing a two-norm operation to achieve finite-time convergence. Moreover, the authors develop a discrete-time NBZND model for the potential deployment of the model on digital computers. Rigorous theoretical analysis for the NBZND is provided. Simulation results substantiate the advantages of the NBZND model for solving TVNE.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 6","pages":"1561-1571"},"PeriodicalIF":8.4,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12360","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141682684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of anomalous behaviour in network systems using deep reinforcement learning with convolutional neural network architecture","authors":"Mohammad Hossein Modirrousta, Parisa Forghani Arani, Reza Kazemi, Mahdi Aliyari-Shoorehdeli","doi":"10.1049/cit2.12359","DOIUrl":"https://doi.org/10.1049/cit2.12359","url":null,"abstract":"<p>To gain access to networks, various intrusion attack types have been developed and enhanced. The increasing importance of computer networks in daily life is a result of our growing dependence on them. Given this, it is clear that algorithmic tools with strong detection performance and dependability are required for a variety of attack types. The objective is to develop a system for intrusion detection based on deep reinforcement learning. On the basis of the Markov decision process, the developed system can construct patterns appropriate for classification purposes from extensive amounts of informative records. Deep Q-Learning (DQL), Soft DQL, Double DQL, and Soft double DQL are examined from two perspectives. An evaluation of the authors’ methods using UNSW-NB15 data demonstrates their superiority regarding accuracy, precision, recall, and F1 score. The validity of the model trained on the UNSW-NB15 dataset was also checked using the BoT-IoT and ToN-IoT datasets, yielding competitive results.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 6","pages":"1467-1484"},"PeriodicalIF":8.4,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12359","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143253352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integer wavelet transform-based secret image sharing using rook polynomial and hamming code with authentication","authors":"Sara Charoghchi, Zahra Saeidi, Samaneh Mashhadi","doi":"10.1049/cit2.12357","DOIUrl":"https://doi.org/10.1049/cit2.12357","url":null,"abstract":"<p>As an effective way to securely transfer secret images, secret image sharing (SIS) has been a noteworthy area of research. In an SIS scheme, a secret image is shared via shadows and can be reconstructed from the required number of them. A major downside of this method is its noise-like shadows, which draw the malicious users' attention. In order to overcome this problem, SIS schemes with meaningful shadows are introduced in which the shadows are first hidden in innocent-looking cover images and then shared. In most of these schemes, the cover image cannot be recovered without distortion, which makes them unsuitable when critical cover images such as military or medical images are used. Also, embedding the secret data in the least significant bits of the cover image, as many of these schemes do, makes them very fragile to steganalysis. A reversible integer wavelet transform (IWT)-based SIS scheme using rook polynomial and Hamming code with authentication is proposed. In order to make the scheme robust to steganalysis, the shadow image is embedded in the IWT coefficients of the cover image. Using the rook polynomial makes the scheme more secure and, moreover, makes authentication very easy, with no need to share a private key with participants. Also, utilising Hamming code lets us embed data with far fewer modifications to the cover image, which results in high-quality stego images.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"9 6","pages":"1435-1450"},"PeriodicalIF":8.4,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12357","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143253346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}