CAAI Transactions on Intelligence Technology: Latest Articles (Page 2)

Improving 3D Object Detection in Neural Radiance Fields With Channel Attention

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-07-10 DOI: 10.1049/cit2.70045

Minling Zhu, Yadong Gong, Dongbing Gu, Chunwei Tian

Citations: 0

Deep Learning Approach for Automated Estimation of 3D Vertebral Orientation of the Lumbar Spine

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-07-10 DOI: 10.1049/cit2.70033

Nanfang Xu, Shanshan Liu, Yuepeng Chen, Kailai Zhang, Chenyi Guo, Cheng Zhang, Fei Xu, Qifeng Lan, Wanyi Fu, Xingyu Zhou, Bo Zhao, Aodong He, Xiangling Fu, Ji Wu, Weishi Li

Abstract: Lumbar degenerative disc diseases are a major contributor to lower back pain. Understanding lumbar degenerative pathology and developing more effective treatments requires precise measurement of lumbar segment kinematics. This study aims to develop a novel automated lumbar spine orientation estimation method based on deep learning. The method enables automatic 2D–3D pre-registration of the lumbar spine during physiological movements, improving both the efficiency of image registration and the accuracy of spinal segment kinematic measurements. A total of 12 asymptomatic volunteers were enrolled and imaged in 2 oblique views with 7 different postures. The images were used to train and evaluate the deep learning model. The model consists of a segmentation module using Mask R-CNN and an estimation module using a ResNet50 architecture with a Squeeze-and-Excitation module. The cosine of the angle between the predicted vector and the ground-truth vector was used to quantify model performance. Data from two additional prospectively recruited asymptomatic volunteers were used to compare the time cost of model-assisted registration with that of manual registration without the model. The cosine values of the vector deviation angles along the three axes of the Cartesian coordinate system were 0.9667 ± 0.004, 0.9593 ± 0.0047 and 0.9828 ± 0.0025, respectively. The angular deviation between the intermediate vector obtained from the three direction vectors and the ground truth was 10.7103 ± 0.7466. The results show that the model's predictions are consistent and reliable across different experiments and axes, and that the approach significantly reduces registration time (3.47 ± 0.90 min vs. 8.10 ± 1.60 min, p < 0.001), improves efficiency, and supports broader use in clinical research on kinematic measurements.

Volume 10, Issue 5, pp. 1306-1319. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70033
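
The abstract above scores orientation predictions by the cosine of the angle between the predicted and ground-truth vectors. A minimal sketch of that metric follows; the function names are illustrative assumptions, not taken from the paper, and the conversion to degrees is added only for readability.

```python
import math

def cosine_between(pred, truth):
    """Cosine of the angle between two direction vectors of equal length."""
    dot = sum(p * t for p, t in zip(pred, truth))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_truth = math.sqrt(sum(t * t for t in truth))
    return dot / (norm_pred * norm_truth)

def deviation_angle_deg(pred, truth):
    """Angle between the two vectors, in degrees."""
    # Clamp to [-1, 1] so floating-point overshoot cannot break acos.
    c = max(-1.0, min(1.0, cosine_between(pred, truth)))
    return math.degrees(math.acos(c))
```

A cosine of 1 means the predicted axis is parallel to the ground truth, so values such as 0.96 to 0.98 correspond to small but non-zero angular errors.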

Citations: 0

$\text{VSMI}^{2}$-PANet: Versatile Scale-Malleable Image Integration and Patch Wise Attention Network With Transformer for Lung Tumour Segmentation Using Multi-Modal Imaging Techniques

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-07-09 DOI: 10.1049/cit2.70039

Nayef Alqahtani, Arfat Ahmad Khan, Rakesh Kumar Mahendran, Muhammad Faheem

Abstract: Lung cancer (LC) is a major cancer that accounts for high mortality rates worldwide. Doctors use many imaging modalities to identify lung tumours and their severity at early stages. Machine learning (ML) and deep learning (DL) methods are now used for robust detection and prediction of lung tumours. Recently, multi-modal imaging has emerged as a robust technique for lung tumour detection by combining various imaging features. To this end, we propose a novel multi-modal imaging technique named versatile scale malleable image integration and patch wise attention network ($\text{VSMI}^{2}$-PANet), which adopts three imaging modalities: computed tomography (CT), magnetic resonance imaging (MRI) and single photon emission computed tomography (SPECT). The designed model accepts CT and MRI images as input and passes them to the $\text{VSMI}^{2}$ module, which is composed of three sub-modules: an image cropping module, a scale malleable convolution layer (SMCL) and a PANet module. CT and MRI images are processed by the image cropping module in parallel to crop the meaningful image patches and provide them to the SMCL module. The SMCL module is composed of adaptive convolutional layers that examine those patches in parallel while preserving the spatial information. The output of the SMCL is then fused and provided to the PANet module. The PANet module examines the fused patches by analysing the height, width and channels of each image patch. As a result, it outputs high-resolution spatial attention maps indicating the location of suspicious tumours. The high-resolution spatial attention maps are then provided as input to the backbone module, which uses a light wave transformer (LWT) to segment the lung tumours into three classes: normal, benign and malignant. In addition, the LWT also accepts SPECT images as input to capture variations precisely when segmenting the lung tumours. The performance of the proposed model is validated using several metrics, such as accuracy, precision, recall, F1-score and AUC, and the results show that the proposed work outperforms the existing approaches.

Volume 10, Issue 5, pp. 1376-1393. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70039

Citations: 0

Exploring a Hybrid Convolutional Framework for Camouflage Target Classification in Land-Based Hyperspectral Images

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-07-09 DOI: 10.1049/cit2.70051

Jiale Zhao, Dan Fang, Jianghu Deng, Jiaju Ying, Yudan Chen, Guanglong Wang, Bing Zhou

Abstract: In recent years, camouflage technology has evolved from single-spectral-band applications to multifunctional and multispectral implementations. Hyperspectral imaging has emerged as a powerful technique for target identification due to its capacity to capture both spectral and spatial information. The advancement of imaging spectroscopy technology has significantly enhanced reconnaissance capabilities, offering substantial advantages in camouflaged target classification and detection. However, the increasing spectral similarity between camouflaged targets and their backgrounds has significantly compromised detection performance in specific scenarios. Conventional feature extraction methods are often limited to single, shallow spectral or spatial features, failing to extract deep features and consequently yielding suboptimal classification accuracy. To address these limitations, this study proposes a 3D-2D convolutional neural network architecture incorporating depthwise separable convolution (DSC) and attention mechanisms (AM). The framework first applies dimensionality reduction to hyperspectral images and extracts preliminary spectral-spatial features. It then employs an alternating combination of 3D and 2D convolutions for deep feature extraction. For target classification, the LogSoftmax function is used. The integration of depthwise separable convolution not only enhances classification accuracy but also substantially reduces the number of model parameters. Furthermore, the attention mechanisms significantly improve the network's ability to represent multidimensional features. Extensive experiments were conducted on a custom land-based hyperspectral image dataset. The results show classification accuracies of 98.74% for grassland camouflage, 99.13% for dead leaf camouflage and 98.94% for wild grass camouflage. Comparative analysis shows that the proposed framework performs strongly in terms of classification accuracy and robustness for camouflage target classification.

Volume 10, Issue 5, pp. 1559-1572. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70051
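
The abstract above credits depthwise separable convolution with a substantial reduction in model parameters. The sketch below shows where that saving comes from for a single 2D layer; the channel counts and kernel size in the usage note are arbitrary illustrative values rather than the paper's configuration, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

def reduction_ratio(c_in, c_out, k):
    """Separable / standard parameter ratio, equal to 1/c_out + 1/k**2."""
    return depthwise_separable_params(c_in, c_out, k) / standard_conv_params(c_in, c_out, k)
```

For example, with 64 input channels, 128 output channels and a 3 x 3 kernel, the standard layer has 73728 weights and the separable version has 8768, roughly 12% of the original.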

Citations: 0

Enhancing Brain MRI Super-Resolution Through Multi-Slice Aware Matching and Fusion

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-07-04 DOI: 10.1049/cit2.70032

Jie Xiang, Ang Zhao, Xia Li, Xubin Wu, Yanqing Dong, Yan Niu, Xin Wen, Yidi Li

Abstract: In clinical diagnosis, magnetic resonance imaging (MRI) allows images of different contrasts to be obtained. High-resolution (HR) MRI presents fine anatomical structures, which is important for improving the efficiency of expert diagnosis and realising smart healthcare. However, due to the cost of scanning equipment and the time required for scanning, obtaining an HR brain MRI is quite challenging. Reference-based super-resolution technology has therefore emerged to improve image quality. Nevertheless, the existing methods still have some drawbacks: (1) The advantages of different contrast images are not fully utilised. (2) The slice-by-slice scanning nature of magnetic resonance imaging is not considered. (3) The ability to capture contextual information and to match and fuse multi-scale, multi-contrast features is lacking. In this paper, we propose the multi-slice aware matching and fusion (MSAMF) network, which makes full use of multi-slice reference image information by introducing a multi-slice aware module and a multi-scale matching strategy to capture corresponding contextual information in reference features at other scales. To further integrate matching features, a multi-scale fusion mechanism is also designed to progressively fuse multi-scale matching features, thereby generating more detailed super-resolution images. The experimental results support the benefits of our network in enhancing the quality of brain MRI reconstruction.

Volume 10, Issue 5, pp. 1411-1421. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70032

Citations: 0

Contrastive Learning-Based Multi-Level Knowledge Distillation

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-07-04 DOI: 10.1049/cit2.70036

Lin Li, Jianping Gou, Weihua Ou, Wenbai Chen, Lan Du

Abstract: With the increasing constraints of hardware devices, there is a growing demand for compact models to be deployed on device endpoints. Knowledge distillation, a widely used technique for model compression and knowledge transfer, has gained significant attention in recent years. However, traditional distillation approaches compare the knowledge of individual samples indirectly through class prototypes, overlooking the structural relationships between samples. Although recent distillation methods based on contrastive learning can capture relational knowledge, their relational constraints often distort the positional information of the samples, leading to compromised performance in the distilled model. To address these challenges and further enhance the performance of compact models, we propose a novel approach, termed contrastive learning-based multi-level knowledge distillation (CLMKD). The CLMKD framework introduces three key modules: class-guided contrastive distillation, gradient relation contrastive distillation, and semantic similarity distillation. These modules are integrated into a unified framework to extract feature knowledge from multiple levels, capturing not only the representational consistency of individual samples but also their higher-order structure and semantic similarity. We evaluate the proposed CLMKD method on multiple image classification datasets, and the results demonstrate its superior performance compared to state-of-the-art knowledge distillation methods.

Volume 10, Issue 5, pp. 1478-1488. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70036
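
For context on the "traditional distillation approaches" that the abstract above contrasts with CLMKD, here is a minimal sketch of the classic temperature-scaled logit distillation loss. It is the standard baseline formulation, not the CLMKD modules, whose details the abstract does not give; the function names and the default temperature are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T squared."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return temperature ** 2 * kl
```

This loss looks at each sample's class probabilities in isolation, which is the limitation the abstract describes: it carries no information about how samples relate to one another.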

Citations: 0

Design of a Marker-Based Human–Robot Following Motion Control Strategy

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-07-01 DOI: 10.1049/cit2.70023

Zhigang Zhang, Yongsheng Guo, Xiaoxia Yu, Shuaishuai Ge

Abstract: To address the jerky movements and poor tracking performance of following-type mobile robots in outdoor environments, a novel marker-based human–machine following-motion control strategy is explored. This strategy decouples the control of linear velocity and angular velocity and handles them separately. First, for linear-velocity control, marker identification is used to determine the distance between the human and the robot, and an enhanced virtual spring model is developed. This involves designing a weighted dynamic damping coefficient so that the range and trend of the robot's following speed remain reasonable, thereby improving smoothness and reducing the risk of target loss. Second, for angular-velocity control, a new concept of an 'insensitive zone' based on the offset of the marker's centre point is proposed and combined with a fuzzy controller to address robot jitter and enhance resistance to interference. The experimental results indicate that the average variance in the human–robot distance is 1.037 m, whereas the average variance in the robot's linear velocity is 0.345 m/s. Owing to the insensitive zone in the parameter-adaptive fuzzy control, the average variance of angular velocity is only 0.031 rad/s. When the human–robot distance fluctuates significantly, the fluctuations in both linear and angular velocities are comparatively small, allowing for stable and smooth following movements. This demonstrates the effectiveness of the motion control strategy designed in this study.

Volume 10, Issue 5, pp. 1489-1500. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70023
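
The two decoupled control loops described in the abstract above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's controller: the damping coefficient is a constant rather than the weighted dynamic one, a proportional gain stands in for the fuzzy controller, the spring-damper output is mapped directly to a velocity command, and every parameter name and default value is hypothetical.

```python
import math

def linear_velocity(distance, distance_rate, rest_length=1.0,
                    stiffness=0.8, damping=0.3, v_max=1.5):
    """Virtual spring-damper between human and robot, clamped to [0, v_max]."""
    raw = stiffness * (distance - rest_length) + damping * distance_rate
    return max(0.0, min(v_max, raw))

def angular_velocity(offset_px, dead_zone_px=20.0, gain=0.005, w_max=1.0):
    """Turn command from the marker centre offset, with an insensitive zone."""
    if abs(offset_px) <= dead_zone_px:
        return 0.0  # small offsets are ignored to suppress jitter
    # Subtract the zone width so the command is continuous at the boundary.
    effective = offset_px - math.copysign(dead_zone_px, offset_px)
    return max(-w_max, min(w_max, gain * effective))
```

The insensitive zone trades a small steady-state heading error for a command that does not chatter when the detected marker centre wobbles by a few pixels.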

Citations: 0

A Deep Backtracking Bare-Bones Particle Swarm Optimisation Algorithm for High-Dimensional Nonlinear Functions

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-07-01 DOI: 10.1049/cit2.70028

Jia Guo, Guoyuan Zhou, Ke Yan, Yi Di, Yuji Sato, Zhou He, Binghua Shi

Abstract: The challenge of optimising multimodal functions within high-dimensional domains constitutes a notable difficulty in evolutionary computation research. Addressing this issue, this study introduces the Deep Backtracking Bare-Bones Particle Swarm Optimisation (DBPSO) algorithm, an innovative approach built upon the integration of the Deep Memory Storage Mechanism (DMSM) and the Dynamic Memory Activation Strategy (DMAS). The DMSM enhances the memory retention for the globally optimal particle, promoting interaction between standard particles and their historically optimal counterparts. In parallel, DMAS assures the updated position of the globally optimal particle is appropriately aligned with the deep memory repository. The efficacy of DBPSO was rigorously assessed through a series of simulations employing the CEC2017 benchmark suite. A comparative analysis juxtaposed DBPSO's performance against five contemporary evolutionary algorithms across two experimental conditions: Dimension-50 and Dimension-100. In the 50D trials, DBPSO attained an average ranking of 2.03, whereas in the 100D scenarios, it improved to an average ranking of 1.9. Further examination utilising the CEC2019 benchmark functions revealed DBPSO's robustness, securing four first-place finishes, three second-place standings, and three third-place positions, culminating in an unmatched average ranking of 1.9 across all algorithms. These empirical results corroborate DBPSO's proficiency in delivering precise solutions for complex, high-dimensional optimisation challenges.

Volume 10, Issue 5, pp. 1501-1520. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70028
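
DBPSO builds on bare-bones particle swarm optimisation. As background, the sketch below implements the standard bare-bones update, in which each coordinate is resampled from a Gaussian centred midway between a particle's personal best and the global best. It does not include the DMSM or DMAS components, which the abstract names but does not specify; the sphere objective and all default parameters are illustrative assumptions.

```python
import random

def sphere(x):
    """Simple convex test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

def bare_bones_pso(objective, dim, n_particles=20, iterations=200,
                   bound=5.0, seed=0):
    """Standard bare-bones PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    pbest = [[rng.uniform(-bound, bound) for _ in range(dim)]
             for _ in range(n_particles)]
    pbest_val = [objective(p) for p in pbest]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = list(pbest[g]), pbest_val[g]
    for _ in range(iterations):
        for i in range(n_particles):
            # Mean is the midpoint of pbest and gbest; the standard deviation
            # is their per-coordinate distance, so there is no velocity term.
            candidate = [rng.gauss((p + gb) / 2.0, abs(p - gb))
                         for p, gb in zip(pbest[i], gbest)]
            val = objective(candidate)
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = candidate, val
                if val < gbest_val:
                    gbest, gbest_val = list(candidate), val
    return gbest, gbest_val
```

Because the sampling spread shrinks as personal bests approach the global best, the plain algorithm can stall on multimodal functions, which is the kind of weakness that memory mechanisms for the global best are meant to address.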

Citations: 0

Guest Editorial: Special Issue on AI Technologies and Applications in Medical Robots

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-06-28 DOI: 10.1049/cit2.70019

Xiaozhi Qi, Zhongliang Jiang, Ying Hu, Jianwei Zhang

Abstract: The integration of artificial intelligence (AI) into medical robotics has emerged as a cornerstone of modern healthcare, driving transformative advancements in precision, adaptability and patient outcomes. Although computational tools have long supported diagnostic processes, their role is evolving beyond passive assistance to become active collaborators in therapeutic decision-making. In this paradigm, knowledge-driven deep learning systems are redefining possibilities, enabling robots to interpret complex data, adapt to dynamic clinical environments and execute tasks with human-like contextual awareness.

The purpose of this special issue is to showcase the latest developments in the application of AI technology in medical robots. The main content includes but is not limited to passive data adaptation, force feedback tracking, image processing and diagnosis, surgical navigation and exoskeleton systems. These studies cover various application scenarios of medical robots, with the ultimate goal of maximising AI autonomy.

We received 31 paper submissions from around the world and, after a rigorous peer review process, selected 9 papers for publication. The selected papers cover a range of research topics, all of which have achieved key breakthroughs in their respective fields. We believe that these accepted papers offer guidance for their research fields and can help researchers enhance their understanding of current trends. Sincere thanks to the authors who chose our platform and to all the staff who provided assistance for the publication of these papers.

In the article 'Model adaptation via credible local context representation', Tang et al. pointed out that conventional model transfer techniques require labelled source data, which makes them inapplicable in privacy-sensitive medical domains. To address these critical problems of source-free domain adaptation (SFDA), they proposed a credible local context representation (CLCR) method that significantly enhances model generalisation through geometric structure mining in feature space. The method constructs a two-stage learning framework. In the pretraining stage of the source model, it introduces a data-enhanced mutual information regularisation term to strengthen the model's learning of discriminative sample features. In the target domain adaptation phase, it uses a deep space fixed step walking strategy that dynamically captures the local credible contextual features of each target sample and uses them as pseudo-labels for semantic fusion. Experiments on the three benchmark datasets Office-31, Office-Home and VisDA show that CLCR achieves an average accuracy of 89.2% in 12 cross-domain tasks, which is 3.1% higher than the existing optimal SFDA method and even surpasses some domain adaptation methods that require the participation of source data. This work provides a new approach to address the privacy performance c

Volume 10, Issue 3, pp. 635-637. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70019

Citations: 0

A Lightweight YOLOv5 Target Detection Model and Its Application to the Measurement of 100-Kernel Weight of Corn Seeds

IF 7.3, Zone 2 (Computer Science)

CAAI Transactions on Intelligence Technology Pub Date : 2025-06-28 DOI: 10.1049/cit2.70031

Helong Yu, Jiayao Zhao, Chun Guang Bi, Lei Shi, Huiling Chen

Abstract: The 100-kernel weight of corn seed is a crucial metric for assessing corn quality. Current measurement mostly involves manually counting kernels and then weighing them on a balance, which is labour-intensive and time-consuming. To address the low efficiency of measuring the 100-kernel weight of corn seeds, this study proposes a measurement method based on deep learning and machine vision. High-contrast camera technology was used to capture image data of corn seeds. The feature extraction network of the YOLOv5 model was improved by incorporating the MobileNetV3 network structure. The resulting model employs depthwise separable convolution to reduce the number of parameters and the computational load, incorporates a linear bottleneck and inverted residual structure to enhance efficiency, introduces an SE attention mechanism to learn channel features directly, and updates the activation function. Algorithms and experiments were subsequently designed to calculate the 100-kernel weight from the output of the model. The results show that the enhanced model achieved an accuracy of 90.1%, a recall rate of 91.3%, and a mAP (mean average precision) value of 92.2%. While meeting production requirements, the model significantly reduces the number of parameters compared to alternative models, to 50% of the original model. In an applied study focused on measuring the 100-kernel weight of corn seeds, the counting accuracy reached 97.18%, while the accuracy of the weight measurement results reached 94.2%. This study achieves both efficient and precise measurement of the 100-kernel weight of maize seeds, presenting a novel perspective in the exploration of maize seed weight.

Volume 10, Issue 5, pp. 1521-1534. PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70031
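
The abstract above does not spell out how the detector output is turned into a 100-kernel weight. One plausible reading, offered only as an assumption, is that the detections are counted, the whole sample is weighed once, and the result is scaled to 100 kernels. The function names and the confidence threshold below are hypothetical.

```python
def count_kernels(confidences, threshold=0.5):
    """Count detections whose confidence score meets the threshold."""
    return sum(1 for c in confidences if c >= threshold)

def hundred_kernel_weight(total_weight_g, kernel_count):
    """Scale the mean kernel weight of a weighed sample to 100 kernels."""
    if kernel_count <= 0:
        raise ValueError("kernel_count must be positive")
    return total_weight_g / kernel_count * 100.0
```

Under this reading, a miscount propagates directly into the weight estimate, since the result is inversely proportional to the detected count.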

Citations: 0