{"title":"A Novel Hybrid Attention-Based Dilated Network for Depression Classification Model from Multimodal Data Using Improved Heuristic Approach","authors":"B. Manjulatha, Suresh Pabboju","doi":"10.1142/s0219467826500105","DOIUrl":"https://doi.org/10.1142/s0219467826500105","url":null,"abstract":"Automatic depression classification from multimodal input data is a challenging task. Modern methods use paralinguistic information such as audio and video signals. Using linguistic information such as speech signals and text data for depression classification is a complicated task in deep learning models. Best audio and video features are built to produce a dependable depression classification system. Textual signals related to depression classification are analyzed using text-based content data. Moreover, to increase the achievements of the depression classification system, audio, visual, and text descriptors are used. So, a deep learning-based depression classification model is developed to detect the person with depression from multimodal data. The EEG signal, Speech signal, video, and text are gathered from standard databases. Four stages of feature extraction take place. In the first stage, the features from the decomposed EEG signals are attained by the empirical mode decomposition (EMD) method, and features are extracted by means of linear and nonlinear feature extraction. In the second stage, the spectral features of the speech signals from the Mel-frequency cepstral coefficients (MFCC) are extracted. In the third stage, the facial texture features from the input video are extracted. In the fourth stage of feature extraction, the input text data are pre-processed, and from the pre-processed data, the textual features are extracted by using the Transformer Net. All four sets of features are optimally selected and combined with the optimal weights to get the weighted fused features using the enhanced mountaineering team-based optimization algorithm (EMTOA). 
The optimal weighted fused features are finally given to the hybrid attention-based dilated network (HADN). The HADN is developed by combining a temporal convolutional network (TCN) with bidirectional long short-term memory (Bi-LSTM). The parameters in the HADN are optimized with the assistance of the developed EMTOA. Finally, the depression classification output is obtained from the HADN. The efficiency of the developed deep learning HADN is validated by comparing it with various traditional classification models.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141662764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modified Whale Algorithm and Morley PSO-ML-Based Hyperparameter Optimization for Intrusion Detection","authors":"H. H. Razzaq, Laith F. M. H. Al-Rammahi, Ahmed Mounaf Mahdi","doi":"10.1142/s0219467826500099","DOIUrl":"https://doi.org/10.1142/s0219467826500099","url":null,"abstract":"Intrusion detection protects a network from probable intrusions by inspecting network traffic to ensure its integrity, availability, and confidentiality. Although an intrusion detection system (IDS) aims to eliminate malicious traffic, intruders keep adopting different approaches to carry out attacks. Hence, effective intrusion detection is vital to detect attacks. With the evolution of machine learning (ML), attacks can be identified by evaluating traffic patterns and learning from them. Considering this, conventional works have attempted to perform intrusion detection. Nevertheless, they suffered from a high false alarm rate (FAR) and low accuracy due to inefficient feature selection. To resolve these pitfalls, this research proposes a modified whale algorithm (MWA) based on nonlinear information gain to select significant and relevant features. This algorithm assures huge initialization to improve local search ability as the agent’s positions are usually near the optimal solution. It is also utilized for an adaptive search for an optimal combination of features. Following this, the research proposes Morlet particle swarm optimization hyperparameter optimization (MPSO-HO) to improve the convergence rate of the algorithm by enabling it to move out of local optima. Standard metrics are used to assess the proposed system and confirm its performance. 
The outcomes demonstrate the effectiveness of the proposed system in intrusion detection.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141661134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Extensive Review on Lung Cancer Detection Models","authors":"Rajesh Singh","doi":"10.1142/s0219467825500317","DOIUrl":"https://doi.org/10.1142/s0219467825500317","url":null,"abstract":"Recent advances in deep learning (DL) have made it easier to categorize and identify lung disorders in medical images. As a result, various studies using DL to identify lung illnesses have been developed. This study analyzes publications that have contributed to the recognition of lung cancer. This literature review examines the many methods for detecting lung cancer. It analyzes several segmentation models that have been used and reviews different research papers. It examines several feature extraction methods, such as those using texture-based and other features. The investigation then concentrates on several cancer detection strategies, including DL models and machine learning (ML) models. The performance metrics are also examined and analyzed. Finally, research gaps are presented to encourage additional investigation of lung cancer detection models.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":0.8,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141664379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CMVT: ConVit Transformer Network Recombined with Convolutional Layer","authors":"Chunxia Mao, Jun Li, Tao Hu, Xu Zhao","doi":"10.1142/s0219467824500608","DOIUrl":"https://doi.org/10.1142/s0219467824500608","url":null,"abstract":"Vision transformers are deep neural networks that apply a self-attention mechanism to image classification and can process data in parallel. To address the structural loss of vision transformers, this paper combines ConViT with a convolutional neural network (CNN) and proposes a new model, Convolution Meet Vision Transformers (CMVT). This model adds a convolution module to the ConViT network to solve the structural loss of the transformer. By adding hierarchical data representation, the ability to gradually extract more image classification features is improved. We have conducted comparative experiments on multiple datasets, and the results on all of them show improved efficiency and performance of the model.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141006138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two-Phase Speckle Noise Removal in US Images: Speckle Reducing Improved Anisotropic Diffusion and Optimal Bayes Threshold","authors":"S. L. Shabana Sulthana, M. Sucharitha","doi":"10.1142/s0219467825500718","DOIUrl":"https://doi.org/10.1142/s0219467825500718","url":null,"abstract":"Medical images are contaminated by multiplicative speckle noise, which dramatically reduces the quality of ultrasound images and has a detrimental impact on a variety of image interpretation tasks. Hence, to overcome this issue, this paper presents a Two-Phase Speckle Reduction approach with Improved Anisotropic Diffusion and Optimal Bayes Threshold, termed TPSR-IADOT, which includes two phases: image enhancement and two-level decomposition. Initially, the noisy image is subjected to an image enhancement process in which Speckle Reducing Improved Anisotropic Diffusion (SRAID) filtering is carried out for speckle removal. Afterwards, two-level decomposition takes place, which utilizes the Discrete Wavelet Transform (DWT) to remove the residual noise. As the speckle noise is mostly present in the high-frequency band, the Improved Bayes Threshold is applied to the high-frequency subbands. Finally, to provide the best outcomes, an optimization algorithm termed Self Improved Pelican Optimization Algorithm (SI-POA) is used in this work to choose the optimal threshold value. The efficiency of the proposed method has been validated on an ultrasound image database using Simulink in terms of PSNR, SSIM, SDME and MAPE. 
Accordingly, the analysis shows that the proposed TPSR-IADOT attains a PSNR of 40.074, whereas POA attains 38.572, COOT 38.572, BES 37.003, PRO 30.419, WOA 33.218, RFU-LA 29.935 and SSI-COA 39.256, for a noise variance of 0.1.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140664241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Double attention Res-U-Net-based Deep Neural Network Model for Automatic Detection of Tuberculosis in Human Lungs","authors":"M. Balamurugan, R. Balamurugan","doi":"10.1142/s0219467825500731","DOIUrl":"https://doi.org/10.1142/s0219467825500731","url":null,"abstract":"Tuberculosis (TB) stands as a leading cause of death and a significant threat to humanity in the contemporary world. Early detection of TB is crucial for precise identification and treatment, and Chest X-Rays (CXR) serve as a valuable tool in this regard. Computer-Aided Diagnosis (CAD) systems play a vital role in easing the classification process of active and latent TB. This paper uses an approach called the Double Attention Res-U-Net-based Deep Neural Network (DARUNDNN) to enhance TB detection in the lungs. The detection process involves pre-processing, noise removal, image level balancing, the application of the DARUNDNN model and the use of the Whale Optimization Algorithm (WOA) for improved accuracy. Experimental validation using the Montgomery County (MC), Shenzhen China (SC), and NIH CXR datasets compares the results with U-Net, AlexNet, GoogleNet, and convolutional neural network (CNN) models. The findings, particularly from the SC dataset, demonstrate the efficiency of the proposed DARUNDNN model with an accuracy of 98.6%, specificity of 96.24%, and sensitivity of 97.66%, outperforming benchmarked deep learning models. 
Additionally, validation with the MC dataset reveals an excellent accuracy of 98%, specificity of 97.56%, and sensitivity of 98.52%.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140686229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Method for Analyzing the Operating Data of Electric Energy Meters Based on Data Mining Analysis","authors":"Chencheng Wang, Lijuan Pu, Zhihui Zhao, Zhang Jiefu","doi":"10.1142/s0219467826500014","DOIUrl":"https://doi.org/10.1142/s0219467826500014","url":null,"abstract":"Aiming at the problem of error estimation of smart meters in distribution network, a method of error estimation of smart meters based on particle swarm optimization convolutional neural network is proposed. This method establishes an intelligent energy meter error estimation model through data collection, data prediction, and preprocessing. To address the convergence issue in training, the interlayer distribution of weights is adjusted to improve training quality. This method fully utilizes template calibration information to transform indicator detection under complex conditions into simple and effective isometric segmentation, transforming label recognition from complex text detection and recognition tasks to simple and efficient binary detection tasks, with better robustness. The effectiveness and high robustness of the proposed method have been demonstrated through experimental verification.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140709856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PECT Composite Defect Detection Algorithm Based on DualGAN","authors":"Ming Gao, Zhiyan Zhou, Jinjie Huang, Kewei Ding","doi":"10.1142/s0219467825500706","DOIUrl":"https://doi.org/10.1142/s0219467825500706","url":null,"abstract":"To address the problems of insufficient accuracy and slow reconstruction speed of Planar Electrical Capacitance Tomography (PECT) detection of damaged specimens, a Dual Generative Adversarial Networks (DualGAN)-based PECT image defect detection method is proposed in this paper. The improved particle swarm algorithm with adaptive particle number and L2-norm is used to optimize the sensitivity field, combined with the parallel Landweber algorithm to solve the PECT inverse problem to obtain the dielectric constant distribution map. In the DualGAN network, the Unet generator utilizes an Adam-based local attention mechanism to adjust module weights, facilitating feature extraction and the generation of high-quality transformation images of the Landweber dielectric constant distribution. A PatchGAN discriminator is employed to distinguish between transformation images and real images, using the generated transformation images as target images. Experimental results demonstrate that the sensitivity field, enhanced by the improved particle swarm algorithm and L2-norm normalization, achieves better balance. Furthermore, the addition of a network transformation using the Adam-based local attention weight mechanism on the DualGAN network reduces artifacts in the reconstructed images, resulting in more accurate PECT reconstructions. The PECT image defect detection method, integrating DualGAN, an improved particle swarm optimization algorithm, and a local attention mechanism, has made significant strides in addressing challenges related to image reconstruction accuracy and speed. 
This technological advancement has enhanced the precision and efficiency of defect detection in carbon fiber composite materials, thereby fostering the broader utilization of planar capacitance tomography technology in industrial damage detection and material defect analysis.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140726200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Taylor Shepherd Golden Optimization-Enabled ResUNet for Forest Change Detection Using Satellite Images","authors":"K. R. Gite, Praveen Gupta","doi":"10.1142/s0219467825500688","DOIUrl":"https://doi.org/10.1142/s0219467825500688","url":null,"abstract":"Change detection (CD), a pivotal task in remote sensing image (RSI) processing, aims to accurately detect changes in land cover based on multi-temporal images. With the advent of deep learning, the technology has delivered remarkable results in recent years in detecting variations in forest land cover data. Some conventional CD techniques are weak, highly susceptible to errors, and can even produce inaccurate outcomes. Thus, such techniques are not desirable for real-time CD applications. To bridge this gap, this research introduces an approach for forest CD utilizing the proposed Taylor Shepherd Golden Optimization_ResUNet (TSGO_ResUNet) and Fuzzy Neural network (Fuzzy NN) for segment mapping. Here, the segmentation process is accomplished using ResUNet to determine the exact boundary or shape of each object for every pixel in the image. Furthermore, TSGO is obtained by combining Taylor Shuffled Shepherd Optimization (TSSO) with Golden Search Optimization (GSO). 
In addition, the devised TSGO_ResUNet + Fuzzy NN achieved a maximum accuracy of 0.952, a kappa coefficient of 0.785, and a minimum error rate of 0.051.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140722280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stacked U-Net with Time–Frequency Attention and Deep Connection Net for Single Channel Speech Enhancement","authors":"Veeraswamy Parisae, S. Nagakishore Bhavanam","doi":"10.1142/s0219467825500676","DOIUrl":"https://doi.org/10.1142/s0219467825500676","url":null,"abstract":"Deep neural networks have significantly promoted the progress of speech enhancement technology. However, a great number of speech enhancement approaches are unable to fully utilize context information from various scales, hindering performance enhancement. To tackle this issue, we introduce a method called TFADCSU-Net (Stacked U-Net with Time-Frequency Attention (TFA) and Deep Connection Layer (DCL)) for enhancing noisy speech in the time–frequency domain. TFADCSU-Net adopts an encoder-decoder structure with skip links. Within TFADCSU-Net, a multiscale feature extraction layer (MSFEL) is proposed to effectively capture contextual data from various scales. This allows us to leverage both global and local speech features to enhance the reconstruction of speech signals. Moreover, we incorporate deep connection layer and TFA mechanisms into the network to further improve feature extraction and aggregate utterance level context. The deep connection layer effectively captures rich and precise features by establishing direct connections starting from the initial layer to all subsequent layers, rather than relying on connections from earlier layers to subsequent layers. This approach not only enhances the information flow within the network but also avoids a significant rise in computational complexity as the number of network layers increases. The TFA module consists of two attention branches operating concurrently: one directed towards the temporal dimension and the other towards the frequency dimension. These branches generate distinct forms of attention — one for identifying relevant time frames and another for selecting frequency wise channels. 
These attention mechanisms assist the models in discerning “where” and “what” to prioritize. Subsequently, the TA and FA branches are combined to produce a comprehensive attention map in two dimensions. This map assigns specific attention weights to individual spectral components in the time–frequency representation, enabling the networks to proficiently capture the speech characteristics in the T-F representation. The results confirm that the proposed method outperforms other models in terms of objective speech quality as well as intelligibility.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140726805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}