Multimedia Tools and Applications: Latest Articles

CSDNet: cross-sketch with dual gated attention for fine-grained image captioning network
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-16 DOI: 10.1007/s11042-024-20220-z
Md. Shamim Hossain, Shamima Aktar, Md. Bipul Hossen, Mohammad Alamgir Hossain, Naijie Gu, Zhangjin Huang
Abstract: When extracting inter- and intra-modal interactions, contemporary models often suffer from reduced computational efficiency, particularly when dealing with lengthy visual sequences. To address this, the study introduces the Cross-Sketch with Dual Gated Attention Network (CSDNet), which handles second-order intra- and inter-modal interactions by integrating two attention modules. Capturing these second-order interactions with bilinear pooling typically requires substantial computational resources because large-dimensional tensors must be processed. The first module, Cross-Sketch Attention (CSA), therefore applies Cross-Tensor Sketch Pooling to attention features, reducing dimensionality while preserving crucial information without sacrificing caption quality. The second module, Dual Gated Attention (DGA), contributes additional spatial and channel-wise attention distributions to improve caption generation. The method reduces computation time per epoch by an average of 13.54% compared to the base model, which leads to faster convergence and improved performance metrics, and it improves the METEOR score by 0.07% over the base model. With reinforcement learning optimization, the model achieves a CIDEr-D score of 132.2% on the MS-COCO dataset and consistently outperforms the baseline across a comprehensive range of evaluation metrics.
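The abstract's point about bilinear pooling cost can be illustrated with the generic Tensor Sketch trick: the outer product of two feature vectors is compressed into a short vector without ever being materialized. The following is a minimal NumPy sketch of that general technique only. It is not the paper's Cross-Tensor Sketch Pooling, and the function names, dimensions, and random hashes are illustrative assumptions.

```python
import numpy as np

def make_hash(n, d, rng):
    """Draw a random bucket index in [0, d) and a random sign for each of n inputs."""
    return rng.integers(0, d, size=n), rng.choice([-1.0, 1.0], size=n)

def count_sketch(x, h, s, d):
    """Project x into d buckets: bucket h[i] accumulates s[i] * x[i]."""
    out = np.zeros(d)
    np.add.at(out, h, s * x)
    return out

def tensor_sketch(x, y, hx, sx, hy, sy, d):
    """Sketch outer(x, y) without forming it.

    The circular convolution of the two count sketches (computed via FFT)
    equals the count sketch of outer(x, y) under the combined hash
    (hx[i] + hy[j]) % d and the combined sign sx[i] * sy[j].
    """
    fx = np.fft.fft(count_sketch(x, hx, sx, d))
    fy = np.fft.fft(count_sketch(y, hy, sy, d))
    return np.real(np.fft.ifft(fx * fy))

# Hypothetical sizes: two feature vectors of length 64 and 48, sketched to 32 dims.
rng = np.random.default_rng(0)
n, m, d = 64, 48, 32
x, y = rng.normal(size=n), rng.normal(size=m)
hx, sx = make_hash(n, d, rng)
hy, sy = make_hash(m, d, rng)
sketch = tensor_sketch(x, y, hx, sx, hy, sy, d)
```

The full outer product here would have 64 x 48 = 3072 entries, while the sketch has 32. The cost is dominated by the FFTs of length `d` rather than by the size of the bilinear tensor.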
Citations: 0
PVDM-YOLOv8l: a solution for reliable pedestrian and vehicle detection in autonomous vehicles under adverse weather conditions
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-16 DOI: 10.1007/s11042-024-20219-6
Noor Ul Ain Tahir, Zuping Zhang, Muhammad Asim, Sundas Iftikhar, Ahmed A. Abd El-Latif
Abstract: The safe navigation of autonomous vehicles in intelligent transportation systems depends on their ability to detect pedestrians and vehicles. While transformer-based object detection models have advanced remarkably, accurately identifying pedestrians and vehicles in adverse weather remains challenging. Adverse weather degrades image quality, causing low contrast, reduced visibility, blurred edges, false detections, missed detections of tiny objects, and other impediments that reduce detection accuracy. This paper introduces a novel Pedestrian and Vehicle Detection Model for adverse weather conditions, denoted PVDM-YOLOv8l. First, the Swin Transformer, which extracts global features of small objects so that they can be identified in poor visibility, is incorporated into the YOLOv8l backbone. Second, to enhance detection accuracy and limit the impact of inaccurate features on recognition performance, CBAM is integrated between the neck and head networks of YOLOv8l to gather crucial information. Finally, the Wise-IoU v3 loss function is adopted to mitigate the adverse effects of low-quality instances by minimizing negative gradients. The authors also enhanced and augmented the DAWN dataset and created a custom dataset, DAWN2024, for the study. PVDM-YOLOv8l was compared against several commonly used object detectors, including YOLOv3, YOLOv3-tiny, YOLOv3-spp, YOLOv5, YOLOv6, all versions of YOLOv8 (n, m, s, l, and x), and some traditional models. The experimental results show that the proposed model improves precision, recall, F1-score, and mean Average Precision (mAP) by 6.6%, 5.4%, 6%, and 5.1%, respectively, on the custom DAWN2024 dataset. This improvement indicates a significant gain in the model's ability to detect pedestrians and vehicles under adverse weather conditions, which is crucial for the safe navigation of autonomous vehicles.
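For readers unfamiliar with the quantities named in the abstract, the sketch below shows plain intersection-over-union for two boxes and the F1-score computed from precision and recall. It is background only: it does not implement the Wise-IoU v3 loss, and the `(x1, y1, x2, y2)` box convention and the function names are assumptions made for illustration.

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0
```

Two 2 x 2 boxes offset by one unit in each direction overlap in a 1 x 1 square, so `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 1/7.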
Citations: 0
Multi-dimensional convolution transformer for group activity recognition
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-16 DOI: 10.1007/s11042-024-19973-4
Dongli Wang, Xiaolin Zhu, Jinfu Liu, Zixin Zhang, Yan Zhou
Abstract: Group activity recognition (GAR), which aims to understand the activity performed by a group of people, has attracted growing attention in computer vision over the past decade. This paper proposes a novel multi-dimensional convolution Transformer network for group activity recognition that not only models spatial-temporal feature representations but also combines channel information to analyze the spatial-temporal dependencies of individual actors. In the feature extraction stage, a multi-scale feature extraction module exploits discriminative high-level and low-level feature representations; its multi-branch strategy, combined with dilated convolution, further captures multi-scale feature information in complex group scenarios. In the relational reasoning stage, a multi-dimensional convolution Transformer models the inter-dependence among the involved actors across different dimensions. It consists of three parts: a channel attention module, a spatial-temporal convolutional Transformer, and a spatial-temporal attention module. Finally, the activity recognition result is obtained with a softmax classifier. Extensive experiments on two public GAR datasets show that recognition accuracy reaches 92.8% on the Volleyball Dataset and 96.1% on the Collective Activity Dataset, a significant improvement over mainstream methods of recent years.
Citations: 0
Enhanced security in lossless audio encryption using zigzag scrambling, DNA coding, SHA-256, and hopfield networks: a practical vlc system implementation
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-16 DOI: 10.1007/s11042-024-20196-w
Sorel Bagio Nono Fotso, William Nodem Atchoffo, Armand C. Nzeukou, Jimmi Hervé Talla Mbé
Abstract: This paper presents a novel lossless audio encryption algorithm based on a modified zigzag scrambling technique, SHA-256, DNA coding, cipher block chaining (CBC) mode, and the delayed Hopfield neural network (HNN). The algorithm mainly includes the scrambling and diffusion stages. In the scrambling stage, the audio signal is converted into a square matrix to which the modified zigzag scrambling technique is applied. The confusion stage follows, in which bit-level permutation, DNA coding, and CBC mode are applied successively. In addition, the delayed HNN used in the encryption process is controlled by the plain audio signal through the SHA-256 hash function to resist differential attacks. The proposed algorithm was assessed on ten audio signals using more than fourteen performance measures. Compared to the state of the art, the results show better performance. Higher resistance to differential attacks is obtained, as seen in higher values of the number of sample change rate (NSCR) and the unified average changing intensity (UACI). More disorder is also detected in the encrypted audio signal, as seen in higher values of the information entropy. Furthermore, the proposed algorithm has a larger key space, arising from the large number of parameters of the delayed HNN, which results in higher resistance to brute-force attacks. A real-life implementation of the proposed encryption technique is achieved with a visible light communication (VLC) system, which highlights its feasibility and effectiveness in securing optical wireless communication systems.
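The scrambling stage described above reshapes the audio samples into a square matrix and permutes them along a zigzag path. The sketch below illustrates this with the standard JPEG-style zigzag scan and its inverse. The paper's modified zigzag variant is not specified in the abstract, so the scan order, function names, and toy sample values here are assumptions.

```python
import numpy as np

def zigzag_indices(n):
    """Return the (row, col) visiting order of a JPEG-style zigzag scan over an n x n grid."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diag = [(r, s - r) for r in rows]
        # Odd diagonals run top-right to bottom-left; even diagonals run the other way.
        order.extend(diag if s % 2 else diag[::-1])
    return order

def zigzag_scramble(matrix):
    """Read the matrix along the zigzag path and write the values back row by row."""
    n = matrix.shape[0]
    rows, cols = zip(*zigzag_indices(n))
    return matrix[list(rows), list(cols)].reshape(n, n)

def zigzag_unscramble(scrambled):
    """Invert zigzag_scramble by placing row-major values back on the zigzag path."""
    n = scrambled.shape[0]
    rows, cols = zip(*zigzag_indices(n))
    restored = np.empty_like(scrambled)
    restored[list(rows), list(cols)] = scrambled.ravel()
    return restored

# Toy "audio": 16 integer samples arranged as a 4 x 4 matrix.
samples = np.arange(16, dtype=np.int16)
matrix = samples.reshape(4, 4)
scrambled = zigzag_scramble(matrix)
restored = zigzag_unscramble(scrambled)
```

Because the scan is a pure permutation of sample positions, the inverse restores every sample exactly, which is the property a lossless scheme needs from this stage.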
Citations: 0
Voxel completion and 3D asymmetrical convolution networks for Lidar semantic segmentation
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-16 DOI: 10.1007/s11042-024-19975-2
Yan Zhou, Jingwei Liu, Jianxun Li, Haibin Zhou
Abstract: The point cloud data collected by LiDAR is large in scale and rich in spatial structure detail. By collecting and labeling LiDAR data, an autonomous driving system can obtain detailed information about the environment around the vehicle. Because laser points are sparse, some methods transform the point cloud into dense representations such as multi-view or voxelized grids for processing. These methods ignore the information loss caused by the LiDAR imaging characteristics and by the point cloud transformations, which degrades segmentation performance. This work investigates a 3D semantic segmentation scheme with only LiDAR inputs, called the voxel completion and 3D asymmetric convolution network. A voxel completion sub-network improves the feature extraction capability of the network by enlarging the receptive field and using multi-scale feature extraction to reduce the number of empty voxels and obtain more complete voxel features. In addition, because autonomous driving scenes contain a large number of cubic objects, the authors propose a 3D asymmetric convolution network with three components: a 3D residual block, an asymmetric convolution block, and a context module. These components are combined to explore 3D geometric patterns, which maintains their intrinsic properties and improves the performance of the network. Extensive experiments on the SemanticKITTI and nuScenes benchmark datasets demonstrate the superiority of the approach. For example, on the nuScenes validation set, the method outperforms the state-of-the-art method by 0.3% in mIoU.
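The "empty voxel" problem mentioned in the abstract is easy to see by voxelizing a sparse point cloud and measuring how much of the grid is actually occupied. The following is a minimal NumPy sketch of generic voxelization, assuming a regular grid. The function names, grid shape, and sample points are hypothetical, and it does not reproduce the paper's voxel completion sub-network.

```python
import numpy as np

def voxelize(points, voxel_size, origin):
    """Map each (x, y, z) point to the integer index of the voxel containing it."""
    return np.floor((points - origin) / voxel_size).astype(np.int64)

def occupancy_ratio(voxel_idx, grid_shape):
    """Fraction of voxels in the grid that contain at least one point."""
    shape = np.array(grid_shape)
    # Discard points that fall outside the grid bounds.
    inside = np.all((voxel_idx >= 0) & (voxel_idx < shape), axis=1)
    occupied = np.unique(voxel_idx[inside], axis=0)
    return len(occupied) / int(np.prod(shape))

# Three hypothetical LiDAR returns inside a 2 x 2 x 2 grid of 1 m voxels.
points = np.array([[0.1, 0.1, 0.1],
                   [0.2, 0.2, 0.2],
                   [1.5, 0.5, 0.5]])
idx = voxelize(points, voxel_size=1.0, origin=np.zeros(3))
ratio = occupancy_ratio(idx, grid_shape=(2, 2, 2))
```

Here two of the eight voxels receive points, so six remain empty. On real scans the empty fraction is typically far larger, which is the gap a completion step aims to narrow.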
Citations: 0
An effective binary dynamic grey wolf optimization algorithm for the 0-1 knapsack problem
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-16 DOI: 10.1007/s11042-024-20121-1
Feyza Erdoğan, Murat Karakoyun, Şaban Gülcü
Abstract: Metaheuristic algorithms are recommended and frequently used methods for solving optimization problems. They have been adapted to many challenging problems with demonstrated success. The grey wolf optimizer (GWO) is one of the most advanced metaheuristics and, because of its advantages, has been applied to many different problems. This study proposes a new variant of GWO, the Binary Dynamic Grey Wolf Optimizer (BDGWO), for solving binary optimization problems. Compared to other binary GWO variants, the main contributions of BDGWO are that it uses the bitwise XOR operation for binarization and that it is based on a dynamic coefficient method developed to determine the influence of the three dominant wolves (alpha, beta, and delta) in the algorithm. BDGWO is a simple, feasible, and successful method that balances local search and global search in solving binary optimization problems. To assess its success and accuracy, BDGWO was tested on the 0-1 knapsack problem (0-1 KP), which is classified as NP-hard. BDGWO was compared with 17 different binary methods on a total of 55 data sets from three different studies published in the last four years. The Friedman test was applied to make the experimental results easier to interpret and to evaluate them statistically. The experiments show that BDGWO is an effective and successful method for its purpose.
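To make the benchmark problem concrete, the sketch below shows how a binary solution vector can be scored on a 0-1 knapsack instance, together with an exact dynamic-programming solver that can serve as a reference optimum on small instances. This is not BDGWO. The zero-fitness penalty for overweight solutions, the function names, and the toy instance are illustrative assumptions.

```python
def evaluate(bits, weights, values, capacity):
    """Fitness of a binary solution: total value, or 0 if the capacity is exceeded."""
    total_weight = sum(w for b, w in zip(bits, weights) if b)
    total_value = sum(v for b, v in zip(bits, values) if b)
    return total_value if total_weight <= capacity else 0

def knapsack_optimum(weights, values, capacity):
    """Exact optimal value via dynamic programming over integer capacities."""
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

# Toy instance: five items, capacity 15.
weights = [12, 2, 1, 1, 4]
values = [4, 2, 1, 2, 10]
capacity = 15
candidate = [0, 1, 1, 1, 1]  # take every item except the heaviest
```

On this instance the candidate weighs 8 and is worth 15, which matches the value returned by `knapsack_optimum`. Taking all five items would weigh 20 and therefore score 0 under the penalty used here.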
Citations: 0
DE-DFKD: diversity enhancing data-free knowledge distillation
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-14 DOI: 10.1007/s11042-024-20193-z
Yanni Liu, Ayong Ye, Qiulin Chen, Yuexin Zhang, Jianwei Chen
Abstract: Data-Free Knowledge Distillation (DFKD) trains a student network on synthetic data when the original dataset of the teacher network is not accessible. However, existing studies mainly focus on how to use the prior knowledge of the teacher network to synthesize data and ignore the lack of diversity in the synthesized data. As a result, the student network cannot learn the real data distribution and has low robustness. This paper proposes a Diversity-Enhanced Data-Free Knowledge Distillation (DE-DFKD) method based on the idea of generative image modelling. It introduces conditional generative networks and metric learning to address class imbalance and uniform intra-class data distributions in synthetic datasets. The experimental results show that DE-DFKD synthesizes better-quality data on the MNIST, CIFAR-10, and CIFAR-100 datasets, with Frechet Inception Distance (FID) values of 51.79, 60.25, and 50.1, respectively, and yields higher student network accuracy than existing schemes.
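As background for the distillation setting, the sketch below computes the widely used temperature-softened distillation loss between teacher and student logits in NumPy. It shows only this generic objective, not the DE-DFKD generator or its metric-learning terms. The temperature value, function names, and example logits are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Mean KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    return float(kl.mean() * temperature ** 2)

# Hypothetical logits for a batch of two synthetic samples and three classes.
teacher = np.array([[5.0, 1.0, 0.0], [0.5, 3.0, 0.2]])
student = np.array([[2.0, 1.5, 0.5], [1.0, 1.0, 1.0]])
loss_same = distillation_loss(teacher, teacher)
loss_diff = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. In a data-free setup the inputs producing these logits would come from a generator rather than from the original dataset.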
Citations: 0
Adaptive Tasmanian Devil Optimization algorithm based efficient task scheduling for big data application in a cloud computing environment
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-14 DOI: 10.1007/s11042-024-19887-1
Ashis Kumar Mishra, Subasis Mohapatra, Pradip Kumar Sahu
Abstract: One of the most difficult issues in cloud computing is scheduling tasks on appropriate cloud resources. This matters because multiple tasks may need to be scheduled efficiently across different virtual machines to maximize resource utilization and minimize makespan. Various efforts have therefore been made to tackle the task scheduling problem with metaheuristic algorithms. However, these techniques may converge prematurely and become trapped in local optima. To address these issues, this research proposes multi-objective task scheduling in cloud computing for big data applications. To this end, the adaptive Tasmanian Devil Optimization (ATDO) method is developed, with a focus on solving challenging optimization problems. The opposition-based learning (OBL) technique is then combined with TDO to maintain population diversity and improve convergence to the optimal solution. Cost, makespan, and resource utilization are taken into account when designing the multi-objective function (MOF). The proposed strategy includes efficient solution representation, efficient fitness function derivation, and the TDO and OBL operators. The effectiveness of the strategy is examined using several evaluation metrics and compared with that of other approaches. The proposed method takes a minimum time of 2134 ms to schedule 1000 tasks and achieves a degree of imbalance of 20.97.
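Two ingredients named in the abstract have simple standard forms: the opposition-based learning operator, which mirrors each candidate within its bounds, and schedule metrics such as makespan and degree of imbalance. The sketch below uses the common definition DI = (T_max - T_min) / T_avg over per-VM completion times. The paper's exact formulas are not given in the abstract, so this definition, the function names, and the toy numbers are assumptions.

```python
import numpy as np

def opposite(population, lower, upper):
    """Opposition-based learning: mirror each candidate within [lower, upper]."""
    return lower + upper - population

def schedule_metrics(assignment, task_lengths, vm_speeds):
    """Return (makespan, degree of imbalance) for a task-to-VM assignment.

    assignment[i] is the index of the VM that runs task i.
    """
    load = np.zeros(len(vm_speeds))
    np.add.at(load, assignment, task_lengths)  # total work placed on each VM
    finish = load / vm_speeds                  # completion time of each VM
    makespan = finish.max()
    imbalance = (finish.max() - finish.min()) / finish.mean()
    return float(makespan), float(imbalance)

# Hypothetical instance: three tasks on two equally fast VMs.
assignment = np.array([0, 0, 1])
task_lengths = np.array([10.0, 20.0, 30.0])
vm_speeds = np.array([10.0, 10.0])
makespan, imbalance = schedule_metrics(assignment, task_lengths, vm_speeds)

mirrored = opposite(np.array([1.0, 4.0]), lower=0.0, upper=5.0)
```

In this toy schedule both VMs finish after 3 time units, so the makespan is 3 and the degree of imbalance is 0. An optimizer using OBL would typically evaluate both a candidate and its mirrored counterpart and keep the fitter one.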
Citations: 0
Parkinson's disease diagnosis by voice data using particle swarm optimization-extreme learning machine approach
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-14 DOI: 10.1007/s11042-024-20108-y
Musatafa Abbas Abbood Albadr, Masri Ayob, Sabrina Tiun, Raad Z. Homod, Fahad Taha AL-Dhief, Mohammed Hasan Mutar
Abstract: Various speech processing approaches (e.g., acoustic feature extraction techniques) and Machine Learning (ML) algorithms have been applied to diagnosing Parkinson's disease (PD). However, most of these studies used conventional techniques that achieve a low accuracy rate in diagnosing PD and still need improvement. Particle Swarm Optimization-Extreme Learning Machine (PSO-ELM), one of the most recent and effective ML techniques, could be an accurate classification strategy but has not been applied to PD diagnosis. To improve the precision of PD diagnosis, this study employs the PSO-ELM classifier and examines how well it performs with seven feature extraction techniques: basic features, WT (Wavelet Transform), MFCC (Mel Frequency Cepstral Coefficients), bandwidth + formant, intensity parameters, TQWT (Tunable Q-factor Wavelet Transform), and vocal fold features. The PSO-ELM approach can a) prevent overfitting, b) solve binary and multi-class classification problems, and c) perform like a kernel-based support vector machine with a neural network structure. Therefore, if combining the PSO-ELM classifier with an appropriate feature extraction technique improves learning performance, the combination can produce an effective method for identifying PD. The PD voice samples in this study were taken from the Parkinson's Disease Classification Benchmark Dataset. To find a useful feature extraction technique to pair with the PSO-ELM classifier, PSO-ELM was applied to each set of extracted features using both the unbalanced and the balanced dataset. According to the experimental results, the MFCC features help the PSO-ELM classifier attain its greatest accuracy: up to 97.35% on the unbalanced dataset and 100.00% on the balanced dataset. This shows that combining PSO-ELM with MFCC can improve learning performance and yields an effective method for identifying PD.
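The extreme learning machine at the core of PSO-ELM has a compact closed form: the hidden-layer weights are fixed at random values and only the output weights are solved for, using a pseudo-inverse. The following is a minimal NumPy sketch of a basic ELM under those assumptions. The PSO step, which in PSO-ELM would tune the hidden weights instead of leaving them random, is deliberately omitted, and the class name, hidden size, and toy data are hypothetical.

```python
import numpy as np

class ELM:
    """Basic extreme learning machine: random hidden layer, least-squares output layer."""

    def __init__(self, n_hidden, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden activations with fixed (untrained) weights and biases.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, T):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights are the minimum-norm least-squares solution of H @ beta = T.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Toy binary problem (XOR) standing in for extracted voice features.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([[0.0], [1.0], [1.0], [0.0]])
model = ELM(n_hidden=20).fit(X, T)
labels = (model.predict(X) > 0.5).astype(float)
```

Training involves no iterative gradient descent. The only learned parameters are `beta`, which is why a search procedure such as PSO is a natural fit for choosing `W` and `b`.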
Citations: 0
Principal component fusion based unexposed biological feature enhancement of fundus images
IF 3.6, Zone 4, Computer Science
Multimedia Tools and Applications Pub Date : 2024-09-14 DOI: 10.1007/s11042-024-20110-4
Neha Singh, Ashish Kumar Bhandari
Abstract: In ophthalmology, digital images play an important role in the automatic detection of various kinds of eye disease, and image enhancement is the first stage in assisting ophthalmologists with diagnosis. Various algorithms and methods for enhancing retinal images have therefore been developed, but they may face obstacles common to enhancement processes, such as false edges and weak illumination that obscure image details. To eliminate such issues, this paper proposes a novel framework for unexposed retinal images. The method uses a multiscale Gaussian function to estimate the illumination layer of an unexposed color retinal image and then corrects that layer with the gamma method. Principal component analysis (PCA) is then used to generate a fused, enhanced result for unexposed retinal images. Finally, a contrast-limited technique is employed to further improve edges and contextual details. Compared with several state-of-the-art enhancement procedures, experimental results show that the suggested method produces results with good contrast and brightness. The method may help ophthalmologists screen for retinal illnesses in unexposed images more efficiently and help build better automated image analysis for healthcare diagnosis.
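The gamma step mentioned in the abstract is a simple power-law mapping on a normalized illumination layer. The sketch below shows only that step in NumPy, assuming illumination values in [0, 1]. The gamma value, function name, and sample values are illustrative, and the multiscale Gaussian estimation and PCA fusion stages are not reproduced.

```python
import numpy as np

def gamma_correct(illumination, gamma=0.5):
    """Power-law correction on values in [0, 1]; gamma < 1 brightens dark regions."""
    return np.clip(illumination, 0.0, 1.0) ** gamma

# Hypothetical illumination values: black, a dark region, and full brightness.
layer = np.array([0.0, 0.25, 1.0])
corrected = gamma_correct(layer, gamma=0.5)
```

With gamma 0.5 the dark value 0.25 is lifted to 0.5 while 0 and 1 are unchanged, so dark regions gain brightness without clipping the highlights.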
Citations: 0