CAAI Transactions on Intelligence Technology: Latest Articles

Cross-Domain Graph Anomaly Detection via Graph Transfer and Graph Decouple
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-05-13 DOI: 10.1049/cit2.70014
Changqin Huang, Xinxing Shi, Chengling Gao, Qintai Hu, Xiaodi Huang, Qionghao Huang, Ali Anaissi
Abstract: Cross-domain graph anomaly detection (CD-GAD) is a promising task that leverages knowledge from a labelled source graph to guide anomaly detection on an unlabelled target graph. CD-GAD classifies anomalies as unique or common based on their presence in both the source and target graphs. However, existing models often fail to fully explore domain-unique knowledge of the target graph for detecting unique anomalies. Additionally, they tend to focus solely on node-level differences, overlooking structural-level differences that provide complementary information for common anomaly detection. To address these issues, we propose a novel method, Synthetic Graph Anomaly Detection via Graph Transfer and Graph Decouple (GTGD), which effectively detects common and unique anomalies in the target graph. Specifically, our approach ensures deeper learning of domain-unique knowledge by decoupling the reconstruction graphs of common and unique features. Moreover, we simultaneously consider node-level and structural-level differences by transferring node and edge information from the source graph to the target graph, enabling comprehensive domain-common knowledge representation. Anomalies are detected using both common and unique features, with their synthetic score serving as the final result. Extensive experiments demonstrate the effectiveness of our approach, improving average AUC-PR by 12.6% compared to state-of-the-art methods.
Vol. 10, No. 4, pp. 1089-1103. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70014
Citations: 0
Unified Neural Lexical Analysis Via Two-Stage Span Tagging
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-05-10 DOI: 10.1049/cit2.70015
Yantuan Xian, Yefen Zhu, Zhentao Yu, Yuxin Huang, Junjun Guo, Yan Xiang
Abstract: Lexical analysis is a fundamental task in natural language processing, which involves several subtasks, such as word segmentation (WS), part-of-speech (POS) tagging, and named entity recognition (NER). Recent works have shown that taking advantage of relatedness between these subtasks can be beneficial. This paper proposes a unified neural framework to address these subtasks simultaneously. Unlike the sequence tagging paradigm, the proposed method tackles the multitask lexical analysis via two-stage sequence span classification. Firstly, the model detects the word and named entity boundaries by multi-label classification over character spans in a sentence. Then, the authors assign POS labels and entity labels for words and named entities by multi-class classification, respectively. Furthermore, a Gated Task Transformation (GTT) is proposed to encourage the model to share valuable features between tasks. The performance of the proposed model was evaluated on Chinese and Thai public datasets, demonstrating state-of-the-art results.
Vol. 10, No. 4, pp. 1254-1267. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70015
Citations: 0
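The first stage described in this abstract classifies character spans. As a rough illustration only, the sketch below enumerates the candidate spans such a classifier would score; the function name `enumerate_spans` and the `max_len` cap are hypothetical and not taken from the paper.

```python
def enumerate_spans(sentence: str, max_len: int = 4):
    """Return (start, end, text) for every character span of length <= max_len.

    `end` is exclusive. A span classifier (not shown) would then label each
    candidate as a word boundary, an entity boundary, both, or neither.
    """
    spans = []
    for start in range(len(sentence)):
        last_end = min(start + max_len, len(sentence))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end, sentence[start:end]))
    return spans


if __name__ == "__main__":
    for span in enumerate_spans("abc", max_len=2):
        print(span)
```

Capping the span length keeps the number of candidates linear in sentence length rather than quadratic, which is a common design choice for span-based taggers.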
Extrapolation Reasoning on Temporal Knowledge Graphs via Temporal Dependencies Learning
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-05-06 DOI: 10.1049/cit2.70013
Ye Wang, Binxing Fang, Shuxian Huang, Kai Chen, Yan Jia, Aiping Li
Abstract: Extrapolation on Temporal Knowledge Graphs (TKGs) aims to predict future knowledge from a set of historical Knowledge Graphs in chronological order. The temporally adjacent facts in TKGs naturally form event sequences, called event evolution patterns, implying informative temporal dependencies between events. Recently, many extrapolation works on TKGs have been devoted to modelling these evolutional patterns, but the task is still far from resolved because most existing works simply rely on encoding these patterns into entity representations while overlooking the significant information implied by relations of evolutional patterns. However, the authors realise that the temporal dependencies inherent in the relations of these event evolution patterns may guide the follow-up event prediction to some extent. To this end, a Temporal Relational Context-based Temporal Dependencies Learning Network (TRenD) is proposed to explore the temporal context of relations for more comprehensive learning of event evolution patterns, especially those temporal dependencies caused by interactive patterns of relations. TRenD incorporates a semantic context unit to capture semantic correlations between relations, and a structural context unit to learn the interaction pattern of relations. By learning the temporal contexts of relations semantically and structurally, the authors gain insights into the underlying event evolution patterns, enabling better extraction of comprehensive historical information for future prediction. Experimental results on benchmark datasets demonstrate the superiority of the model.
Vol. 10, No. 3, pp. 815-826. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70013
Citations: 0
Energy Efficient VM Selection Using CSOA-VM Model in Cloud Data Centers
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-04-30 DOI: 10.1049/cit2.70018
Mandeep Singh Devgan, Tajinder Kumar, Purushottam Sharma, Xiaochun Cheng, Shashi Bhushan, Vishal Garg
Abstract: Cloud data centres have evolved with an energy management problem due to the constant increase in their size, complexity and enormous consumption of energy. Energy management is a challenging issue that is critical in cloud data centres and an important research concern for many researchers. In this paper, we propose a cuckoo search (CS)-based optimisation technique for virtual machine (VM) selection and a novel placement algorithm considering the different constraints. The energy consumption model and the simulation model have been implemented for the efficient selection of VMs. The proposed model, CSOA-VM, not only lessens service level agreement (SLA) violations but also minimises VM migrations. The proposed model also saves energy, and the performance analysis shows that the energy consumption obtained is 1.35 kWh, the SLA violation is 9.2 and the number of VM migrations is about 268. Thus, there is an improvement in energy consumption of about 1.8% and a 2.1% improvement (reduction) in SLA violations in comparison to existing techniques.
Vol. 10, No. 4, pp. 1217-1234. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70018
Citations: 0
Processing Water-Medium Spinal Endoscopic Images Based on Dual Transmittance
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-04-30 DOI: 10.1049/cit2.70016
Ning Hu, Qing Zhang
Abstract: Real-time water-medium endoscopic images can assist doctors in performing operations such as tissue cleaning and nucleus pulposus removal. During medical operating procedures, it is inevitable that tissue particles, debris and other contaminants will be suspended within the viewing area, resulting in blurred images and the loss of surface details in biological tissues. Currently, few studies have focused on enhancing such endoscopic images. This paper proposes a water-medium endoscopic image processing method based on dual transmittance in accordance with the imaging characteristics of spinal endoscopy. By establishing an underwater imaging model for spinal endoscopy, we estimate the transmittance of the endoscopic images based on the boundary constraints and local image contrast. The two transmittances are then fused and combined with transmittance maps and ambient light estimations to restore the images before attenuation, ultimately enhancing the details and texture of the images. Experiments comparing classical image enhancement algorithms demonstrate that the proposed algorithm could effectively improve the quality of spinal endoscopic images.
Vol. 10, No. 3, pp. 678-688. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70016
Citations: 0
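The abstract does not state its imaging model, so the sketch below assumes the widely used scattering model I = J·t + A·(1 − t), where I is the observed image, J the restored image, t the transmittance and A the ambient light. The weighted-average fusion, the function names and the `t_min` floor are illustrative assumptions, not the authors' method.

```python
import numpy as np


def fuse_transmittance(t_boundary: np.ndarray, t_contrast: np.ndarray,
                       weight: float = 0.5) -> np.ndarray:
    """Blend two transmittance maps (assumed fusion rule: weighted average)."""
    return weight * t_boundary + (1.0 - weight) * t_contrast


def restore(image: np.ndarray, transmittance: np.ndarray,
            ambient: np.ndarray, t_min: float = 0.1) -> np.ndarray:
    """Invert I = J*t + A*(1 - t) for J.

    image: (H, W, 3) in [0, 1]; transmittance: (H, W); ambient: (3,).
    t is floored at t_min so that noise is not amplified where t is tiny.
    """
    t = np.clip(transmittance, t_min, 1.0)[..., None]
    return np.clip((image - ambient) / t + ambient, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.uniform(0.0, 1.0, size=(4, 5, 3))
    t = rng.uniform(0.3, 0.9, size=(4, 5))
    ambient = np.array([0.8, 0.7, 0.6])
    hazy = clean * t[..., None] + ambient * (1.0 - t[..., None])
    print(np.abs(restore(hazy, t, ambient) - clean).max())
```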
Syn-Aug: An Effective and General Synchronous Data Augmentation Framework for 3D Object Detection
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-04-21 DOI: 10.1049/cit2.70001
Huaijin Liu, Jixiang Du, Yong Zhang, Hongbo Zhang, Jiandian Zeng
Abstract: Data augmentation plays an important role in boosting the performance of 3D models, while very few studies handle the 3D point cloud data with this technique. Global augmentation and cut-paste are commonly used augmentation techniques for point clouds, where global augmentation is applied to the entire point cloud of the scene, and cut-paste samples objects from other frames into the current frame. Both types of data augmentation can improve performance, but the cut-paste technique cannot effectively deal with the occlusion relationship between the foreground object and the background scene and the rationality of object sampling, which may be counterproductive and may hurt the overall performance. In addition, LiDAR is susceptible to signal loss, external occlusion, extreme weather and other factors, which can easily cause object shape changes, while global augmentation and cut-paste cannot effectively enhance the robustness of the model. To this end, we propose Syn-Aug, a synchronous data augmentation framework for LiDAR-based 3D object detection. Specifically, we first propose a novel rendering-based object augmentation technique (Ren-Aug) to enrich training data while enhancing scene realism. Second, we propose a local augmentation technique (Local-Aug) to generate local noise by rotating and scaling objects in the scene while avoiding collisions, which can improve generalisation performance. Finally, we make full use of the structural information of 3D labels to make the model more robust by randomly changing the geometry of objects in the training frames. We verify the proposed framework with four different types of 3D object detectors. Experimental results show that our proposed Syn-Aug significantly improves the performance of various 3D object detectors in the KITTI and nuScenes datasets, proving the effectiveness and generality of Syn-Aug. On KITTI, four different types of baseline models using Syn-Aug improved mAP by 0.89%, 1.35%, 1.61% and 1.14% respectively. On nuScenes, four different types of baseline models using Syn-Aug improved mAP by 14.93%, 10.42%, 8.47% and 6.81% respectively. The code is available at https://github.com/liuhuaijjin/Syn-Aug.
Vol. 10, No. 3, pp. 912-928. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70001
Citations: 0
WaveLiteDehaze-Network: A Low-Parameter Wavelet-Based Method for Real-Time Dehazing
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-04-19 DOI: 10.1049/cit2.70011
Ali Murtaza, Uswah Khairuddin, Ahmad ’Athif Mohd Faudzi, Kazuhiko Hamamoto, Yang Fang, Zaid Omar
Abstract: Although the image dehazing problem has received considerable attention over recent years, the existing models often prioritise performance at the expense of complexity, making them unsuitable for real-world applications, which require algorithms to be deployed on resource-constrained devices. To address this challenge, we propose WaveLiteDehaze-Network (WLD-Net), an end-to-end dehazing model that delivers performance comparable to complex models while operating in real time and using significantly fewer parameters. This approach capitalises on the insight that haze predominantly affects low-frequency information. By exclusively processing the image in the frequency domain using discrete wavelet transform (DWT), we segregate the image into high and low frequencies and process them separately. This allows us to preserve high-frequency details and recover low-frequency components affected by haze, distinguishing our method from existing approaches that use spatial domain processing as the backbone, with DWT serving as an auxiliary component. DWT is applied at multiple levels for better information retention while also accelerating computation by downsampling feature maps. Subsequently, a learning-based fusion mechanism reintegrates the processed frequencies to reconstruct the dehazed image. Experiments show that WLD-Net outperforms other low-parameter models on real-world hazy images and rivals much larger models, achieving the highest PSNR and SSIM scores on the O-Haze dataset. Qualitatively, the proposed method demonstrates its effectiveness in handling a diverse range of haze types, delivering visually pleasing results and robust performance, while also generalising well across different scenarios. With only 0.385 million parameters (more than 100 times smaller than comparable dehazing methods), WLD-Net processes 1024 × 1024 images in just 0.045 s, highlighting its applicability across various real-world scenarios. The code is available at https://github.com/AliMurtaza29/WLD-Net.
Vol. 10, No. 4, pp. 1033-1048. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70011
Citations: 0
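To make the frequency split in this abstract concrete, here is a minimal one-level 2D Haar DWT in NumPy. The choice of the Haar wavelet and the function names are assumptions for illustration; the abstract does not say which wavelet WLD-Net uses, and this is not the authors' code.

```python
import numpy as np


def haar_dwt2(img: np.ndarray):
    """One-level 2D Haar DWT of a 2D array with even height and width.

    Returns four half-resolution bands: one low-frequency approximation
    and three high-frequency detail bands.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0          # smooth content (where haze lives)
    row_detail = (a + b - c - d) / 2.0   # change between the two rows
    col_detail = (a - b + c - d) / 2.0   # change between the two columns
    diag_detail = (a - b - c + d) / 2.0  # diagonal change
    return low, row_detail, col_detail, diag_detail


def haar_idwt2(low, row_detail, col_detail, diag_detail):
    """Inverse of haar_dwt2: rebuild the full-resolution array."""
    h, w = low.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2] = (low + row_detail + col_detail + diag_detail) / 2.0
    out[0::2, 1::2] = (low + row_detail - col_detail - diag_detail) / 2.0
    out[1::2, 0::2] = (low - row_detail + col_detail - diag_detail) / 2.0
    out[1::2, 1::2] = (low - row_detail - col_detail + diag_detail) / 2.0
    return out


if __name__ == "__main__":
    image = np.random.default_rng(0).uniform(size=(8, 8))
    bands = haar_dwt2(image)
    print(bands[0].shape, np.allclose(haar_idwt2(*bands), image))
```

Each band has a quarter of the pixels of the input, which is the downsampling the abstract credits for faster computation; applying `haar_dwt2` again to the low band gives a multi-level decomposition.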
Two-Stage Early Exiting From Globality Towards Reliability
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-04-11 DOI: 10.1049/cit2.70010
Jianing He, Qi Zhang, Hongyun Zhang, Duoqian Miao
Abstract: Early exiting has shown significant potential in accelerating the inference of pre-trained language models (PLMs) by allowing easy samples to exit from shallow layers. However, existing early exiting methods primarily rely on local information from individual samples to estimate prediction uncertainty for making exiting decisions, overlooking the global information provided by the sample population. This impacts the estimation of prediction uncertainty, compromising the reliability of exiting decisions. To remedy this, inspired by principal component analysis (PCA), the authors define a residual score to capture the deviation of features from the principal space of the sample population, providing a global perspective for estimating prediction uncertainty. Building on this, a two-stage exiting strategy is proposed that integrates global information from residual scores with local information from energy scores at both the decision and feature levels. This strategy incorporates three-way decisions to enable more reliable exiting decisions for boundary region samples by delaying judgement. Extensive experiments on the GLUE benchmark validate that the method achieves an average speed-up ratio of 2.17× across all tasks with minimal performance degradation. Additionally, it surpasses the state-of-the-art E-LANG by 11% in model acceleration, along with a performance improvement of 0.6 points, demonstrating a better performance-efficiency trade-off.
Vol. 10, No. 4, pp. 1019-1032. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70010
Citations: 0
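The two scores named in this abstract can be sketched as follows. This is one plausible reading, not the paper's definition: the residual score is taken as the norm of the feature component outside the top-k PCA subspace, the energy score as the standard negative log-sum-exp of the logits, and the thresholds `low` and `high` in the three-way rule are hypothetical.

```python
import numpy as np


def fit_principal_space(features: np.ndarray, k: int):
    """Return the mean and top-k principal directions of population features."""
    mean = features.mean(axis=0)
    # Rows of vt are orthonormal directions sorted by explained variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:k]


def residual_score(x: np.ndarray, mean: np.ndarray, basis: np.ndarray) -> float:
    """Norm of the part of x lying outside the principal space (global view)."""
    centered = x - mean
    projected = basis.T @ (basis @ centered)
    return float(np.linalg.norm(centered - projected))


def energy_score(logits: np.ndarray) -> float:
    """Energy = -logsumexp(logits); computed stably (local view)."""
    m = logits.max()
    return float(-(m + np.log(np.exp(logits - m).sum())))


def three_way_decision(score: float, low: float, high: float) -> str:
    """Exit when clearly certain, continue when clearly uncertain, else delay."""
    if score <= low:
        return "exit"
    if score >= high:
        return "continue"
    return "delay"


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    population = rng.normal(size=(200, 16))
    mean, basis = fit_principal_space(population, k=4)
    sample = rng.normal(size=16)
    print(residual_score(sample, mean, basis), energy_score(rng.normal(size=3)))
```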
Sophisticated Ensemble Deep Learning Approaches for Multilabel Retinal Disease Classification in Medical Imaging
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-04-09 DOI: 10.1049/cit2.70012
Asghar Amir, Tariqullah Jan, Mohammad Haseeb Zafar, Shadan Khan Khattak
Abstract: This paper introduces a novel ensemble Deep learning (DL)-based Multi-Label Retinal Disease Classification (MLRDC) system, known for its high accuracy and efficiency. Utilising a stacking ensemble approach, and integrating DenseNet201, EfficientNetB4, EfficientNetB3 and EfficientNetV2S models, exceptional performance in retinal disease classification is achieved. The proposed MLRDC model, leveraging DL as the meta-model, outperforms individual base detectors, with DenseNet201 and EfficientNetV2S achieving an accuracy of 96.5%, precision of 98.6%, recall of 97.1%, and F1 score of 97.8%. Weighted multilabel classifiers in the ensemble exhibit an average accuracy of 90.6%, precision of 98.3%, recall of 91.2%, and F1 score of 94.6%, whereas unweighted models achieve an average accuracy of 90%, precision of 98.6%, recall of 93.1%, and F1 score of 95.7%. Employing Logistic Regression (LR) as the meta-model, the proposed MLRDC system achieves an accuracy of 93.5%, precision of 98.2%, recall of 93.9%, and F1 score of 96%, with a minimal loss of 0.029. These results highlight the superiority of the proposed model over benchmark state-of-the-art ensembles, emphasising its practical applicability in medical image classification.
Vol. 10, No. 4, pp. 1159-1173. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70012
Citations: 0
Geometry-Enhanced Implicit Function for Detailed Clothed Human Reconstruction With RGB-D Input
IF 7.3, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2025-04-03 DOI: 10.1049/cit2.70009
Pengpeng Liu, Zhi Zeng, Qisheng Wang, Min Chen, Guixuan Zhang
Abstract: Realistic human reconstruction embraces an extensive range of applications as depth sensors advance. However, current state-of-the-art methods with RGB-D input still suffer from artefacts, such as noisy surfaces, non-human shapes, and depth ambiguity, especially for the invisible parts. The authors observe that the main issue is the lack of geometric semantics when depth input priors are not fully used. This paper focuses on improving the representation ability of the implicit function, exploring a method to utilise depth-related semantics effectively and efficiently. The proposed geometry-enhanced implicit function enhances the geometric semantics with the extra voxel-aligned features from point clouds, promoting the completion of missing parts for unseen regions while preserving the local details on the input. For incorporating multi-scale pixel-aligned and voxel-aligned features, the authors use the Squeeze-and-Excitation attention to capture and fully use channel interdependencies. For the multi-view reconstruction, the proposed depth-enhanced attention explicitly excites the network to “sense” the geometric structure for a more reasonable feature aggregation. Experiments and results show that our method outperforms current RGB and depth-based SOTA methods on the challenging data from Twindom and Thuman3.0, and achieves a detailed and complete human reconstruction, balancing performance and efficiency well.
Vol. 10, No. 3, pp. 858-870. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70009
Citations: 0
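The Squeeze-and-Excitation attention mentioned in this abstract reweights feature channels by a learned gate. The sketch below shows the generic SE operation in NumPy on a single (C, H, W) feature map; the weight matrices `w1` and `w2` stand in for learned parameters and are hypothetical, and how the paper wires SE into its pixel-aligned and voxel-aligned features is not shown here.

```python
import numpy as np


def squeeze_excite(features: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Apply Squeeze-and-Excitation channel attention.

    features: (C, H, W); w1: (C // r, C) reduction weights;
    w2: (C, C // r) expansion weights, for some reduction ratio r.
    """
    squeezed = features.mean(axis=(1, 2))         # squeeze: global average pool, (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)       # excitation: bottleneck + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gate per channel, (C,)
    return features * gates[:, None, None]        # rescale each channel


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 4, 4))
    w1 = rng.normal(size=(2, 8))  # reduction ratio r = 4
    w2 = rng.normal(size=(8, 2))
    print(squeeze_excite(feats, w1, w2).shape)
```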