Complex & Intelligent Systems: Latest Articles

Molecular subgraph representation learning based on spatial structure transformer
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-14 DOI: 10.1007/s40747-024-01602-0
Shaoguang Zhang, Jianguang Lu, Xianghong Tang
{"title":"Molecular subgraph representation learning based on spatial structure transformer","authors":"Shaoguang Zhang, Jianguang Lu, Xianghong Tang","doi":"10.1007/s40747-024-01602-0","DOIUrl":"https://doi.org/10.1007/s40747-024-01602-0","url":null,"abstract":"<p>In the field of molecular biology, graph representation learning is crucial for molecular structure analysis. However, challenges arise in recognising functional groups and distinguishing isomers due to a lack of spatial structure information. To address these problems, we design a novel graph representation learning method based on a spatial structure information extraction Transformer (SSET). The SSET model comprises the Edge Feature Fusion Subgraph Spatial Structure Extractor (ETSE) module and the Positional Information Encoding Graph Transformer (PEGT) module. The ETSE module extracts spatial structural information by fusing edge features and generating the most-value subgraph (Mv-subgraph). The PEGT module encodes positional information based on the graph transformer, addressing the indistinguishability problem among nodes with identical features. In addition, the SSET model alleviates the burden of high computational complexity by using subgraph. Experiments on real datasets show that the SSET model, built on the graph transformer, considerably improves graph representation learning.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"29 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A multi-level collaborative self-distillation learning for improving adaptive inference efficiency
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-14 DOI: 10.1007/s40747-024-01572-3
Likun Zhang, Jinbao Li, Benqian Zhang, Yahong Guo
{"title":"A multi-level collaborative self-distillation learning for improving adaptive inference efficiency","authors":"Likun Zhang, Jinbao Li, Benqian Zhang, Yahong Guo","doi":"10.1007/s40747-024-01572-3","DOIUrl":"https://doi.org/10.1007/s40747-024-01572-3","url":null,"abstract":"<p>A multi-exit network is an important technique for achieving adaptive inference by dynamically allocating computational resources based on different input samples. The existing works mainly treat the final classifier as the teacher, enhancing the classification accuracy by transferring knowledge to the intermediate classifiers. However, this traditional self-distillation training strategy only utilizes the knowledge contained in the final classifier, neglecting potentially distinctive knowledge in the other classifiers. To address this limitation, we propose a novel multi-level collaborative self-distillation learning strategy (MLCSD) that extracts knowledge from all the classifiers. MLCSD dynamically determines the weight coefficients for each classifier’s contribution through a learning process, thus constructing more comprehensive and effective teachers tailored to each classifier. These new teachers transfer the knowledge back to each classifier through a distillation technique, thereby further improving the network’s inference efficiency. We conduct experiments on three datasets, CIFAR10, CIFAR100, and Tiny-ImageNet. Compared with the baseline network that employs traditional self-distillation, our MLCSD-Net based on ResNet18 enhances the average classification accuracy by 1.18%. The experimental results demonstrate that MLCSD-Net improves the inference efficiency of adaptive inference applications, such as anytime prediction and budgeted batch classification. 
Code is available at https://github.com/deepzlk/MLCSD-Net.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"23 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141986604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
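The abstract of the entry above describes building, for each classifier in a multi-exit network, a teacher from a learned weighting of all classifiers and distilling it back into that classifier. A minimal sketch of that idea follows. It is not the authors' implementation (their code is at the GitHub URL given in the abstract); the function names, the softmax-normalised weights, the temperature parameter, and the KL-divergence loss are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_teacher(exit_logits, weight_logits, temperature=1.0):
    """Mix the class distributions of all exits into one teacher distribution.

    weight_logits holds one (hypothetical) learnable score per exit; the
    scores are softmax-normalised so the mixture is a valid distribution.
    """
    weights = softmax(weight_logits)
    probs = [softmax(logits, temperature) for logits in exit_logits]
    n_classes = len(probs[0])
    return [sum(w * p[c] for w, p in zip(weights, probs)) for c in range(n_classes)]


def kl_divergence(teacher, student, eps=1e-12):
    """KL(teacher || student) for two discrete distributions."""
    return sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(teacher, student))


def distillation_losses(exit_logits, weight_logits, temperature=1.0):
    """One distillation loss per exit, each against its own tailored teacher.

    weight_logits[i] is the assumed weight vector used to build the
    teacher for exit i.
    """
    losses = []
    for i, student_logits in enumerate(exit_logits):
        teacher = ensemble_teacher(exit_logits, weight_logits[i], temperature)
        student = softmax(student_logits, temperature)
        losses.append(kl_divergence(teacher, student))
    return losses
```

In a real training loop the weight scores would be trainable parameters updated alongside the network, and these losses would be added to the usual classification loss of each exit.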
Swarm mutual learning
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-14 DOI: 10.1007/s40747-024-01573-2
Kang Haiyan, Wang Jiakang
{"title":"Swarm mutual learning","authors":"Kang Haiyan, Wang Jiakang","doi":"10.1007/s40747-024-01573-2","DOIUrl":"https://doi.org/10.1007/s40747-024-01573-2","url":null,"abstract":"<p>With the rapid growth of big data, extracting meaningful knowledge from data is crucial for machine learning. The existing Swarm Learning data collaboration models face challenges such as data security, model security, high communication overhead, and model performance optimization. To address this, we propose the Swarm Mutual Learning (SML). Firstly, we introduce an Adaptive Mutual Distillation Algorithm that dynamically controls the learning intensity based on distillation weights and strength, enhancing the efficiency of knowledge extraction and transfer during mutual distillation. Secondly, we design a Global Parameter Aggregation Algorithm based on homomorphic encryption, coupled with a Dynamic Gradient Decomposition Algorithm using singular value decomposition. This allows the model to aggregate parameters in ciphertext, significantly reducing communication overhead during uploads and downloads. Finally, we validate the proposed methods on real datasets, demonstrating their effectiveness and efficiency in model updates. On the MNIST dataset and CIFAR-10 dataset, the local model accuracies reached 95.02% and 55.26%, respectively, surpassing those of the comparative models. 
Furthermore, while ensuring the security of the aggregation process, we significantly reduced the communication overhead for uploading and downloading.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"79 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
TARGCN: temporal attention recurrent graph convolutional neural network for traffic prediction
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-14 DOI: 10.1007/s40747-024-01601-1
He Yang, Cong Jiang, Yun Song, Wendong Fan, Zelin Deng, Xinke Bai
{"title":"TARGCN: temporal attention recurrent graph convolutional neural network for traffic prediction","authors":"He Yang, Cong Jiang, Yun Song, Wendong Fan, Zelin Deng, Xinke Bai","doi":"10.1007/s40747-024-01601-1","DOIUrl":"https://doi.org/10.1007/s40747-024-01601-1","url":null,"abstract":"<p>Traffic prediction is crucial to the intelligent transportation system. However, accurate traffic prediction still faces challenges. It is difficult to extract dynamic spatial–temporal correlations of traffic flow and capture the specific traffic pattern for each sub-region. In this paper, a temporal attention recurrent graph convolutional neural network (TARGCN) is proposed to address these issues. The proposed TARGCN model fuses a node-embedded graph convolutional (Emb-GCN) layer, a gated recurrent unit (GRU) layer, and a temporal attention (TA) layer into a framework to exploit both dynamic spatial correlations between traffic nodes and temporal dependencies between time slices. In the Emb-GCN layer, node embedding matrix and node parameter learning techniques are employed to extract spatial correlations between traffic nodes at a fine-grained level and learn the specific traffic pattern for each node. Following this, a series of gated recurrent units are stacked as a GRU layer to capture spatial and temporal features from the traffic flow of adjacent nodes in the past few time slices simultaneously. Furthermore, an attention layer is applied in the temporal dimension to extend the receptive field of GRU. The combination of the Emb-GCN, GRU, and the TA layer facilitates the proposed framework exploiting not only the spatial–temporal dependencies but also the degree of interconnectedness between traffic nodes, which benefits the prediction a lot. Experiments on public traffic datasets PEMSD4 and PEMSD8 demonstrate the effectiveness of the proposed method. 
Compared with state-of-the-art baselines, it achieves 4.62% and 5.78% on PEMS03, 3.08% and 0.37% on PEMSD4, and 5.08% and 0.28% on PEMSD8 superiority on average. Especially for long-term prediction, prediction results for the 60-min interval show the proposed method presents a more notable advantage over compared benchmarks. The implementation on Pytorch is publicly available at https://github.com/csust-sonie/TARGCN.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"141 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MFPIDet: improved YOLOV7 architecture based on multi-scale feature fusion for prohibited item detection in complex environment
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-14 DOI: 10.1007/s40747-024-01580-3
Lang Zhang, Zhan Ao Huang, Canghong Shi, Hongjiang Ma, Xiaojie Li, Xi Wu
{"title":"MFPIDet: improved YOLOV7 architecture based on multi-scale feature fusion for prohibited item detection in complex environment","authors":"Lang Zhang, Zhan Ao Huang, Canghong Shi, Hongjiang Ma, Xiaojie Li, Xi Wu","doi":"10.1007/s40747-024-01580-3","DOIUrl":"https://doi.org/10.1007/s40747-024-01580-3","url":null,"abstract":"<p>Prohibited item detection is crucial for the safety of public places. Deep learning, one of the mainstream methods in prohibited item detection tasks, has shown superior performance far beyond traditional prohibited item detection methods. However, most neural network architectures in deep learning still lack sufficient local feature representation ability for overlapping and small targets, and ignore the problem of semantic conflicts caused by direct feature fusion. In this paper, we propose MFPIDet, a novel prohibited item detection neural network architecture based on improved YOLOV7 to achieve reliable prohibited item detection in complex environments. Specifically, a multi-scale attention module (MAM) backbone is proposed to filter the redundant information of target regions and further applied to enhance the local feature representation ability of overlapping objects. Here, to reduce the redundant information of target regions, a squeeze-excitation (SE) block is used to filter the background. Then, aiming at enhancing the feature expression ability of overlapping objects, a multi-scale feature extraction module (MFEM) is designed for local feature representation. In addition, to obtain richer context information, We design an adaptive fusion feature pyramid network (AF-FPN) to combine the adaptive context information fusion module (ACIFM) with the feature fusion module (FFM) to improve the neck structure of YOLOV7. The proposed method is validated on the PIDray dataset, and the tested results showed that our method obtained the highest <i>mAP</i> (68.7%), which is improved by 3.5% than YOLOV7 methods. 
Our approach provides a new design pattern for prohibited item detection in complex environments and shows the development potential of deep learning in related fields.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"16 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A teacher-guided early-learning method for medical image segmentation from noisy labels
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-13 DOI: 10.1007/s40747-024-01574-1
Shangkun Liu, Minghao Zou, Ning Liu, Yanxin Li, Weimin Zheng
{"title":"A teacher-guided early-learning method for medical image segmentation from noisy labels","authors":"Shangkun Liu, Minghao Zou, Ning Liu, Yanxin Li, Weimin Zheng","doi":"10.1007/s40747-024-01574-1","DOIUrl":"https://doi.org/10.1007/s40747-024-01574-1","url":null,"abstract":"<p>The success of current deep learning models depends on a large number of precise labels. However, in the field of medical image segmentation, acquiring precise labels is labor-intensive and time-consuming. Hence, the challenge of achieving a high-performance model via datasets containing noisy labels has attracted significant research interest. Some existing methods are unable to exclude samples containing noisy labels and some methods still have high requirements on datasets. To solve this problem, we propose a noisy label learning method for medical image segmentation using a mixture of high and low quality labels based on the architecture of mean teacher. Firstly, considering the teacher model’s capacity to aggregate all previously learned information following each training step, we propose to leverage a teacher model to correct noisy label adaptively during the training phase. Secondly, to enhance the model’s robustness, we propose to infuse feature perturbations into the student model. This strategy aims to bolster the model’s ability to handle variations in input data and improve its resilience to noisy labels. Finally, we simulate noisy labels by destroying labels in two medical image datasets: the Automated Cardiac Diagnosis Challenge (ACDC) dataset and the 3D Left Atrium (LA) dataset. Experiments show that the proposed method demonstrates considerable effectiveness. 
With a noisy ratio of 0.8, compared with other methods, the mean Dice score of our proposed method is improved by 2.58% and 0.31% on ACDC and LA datasets, respectively.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"17 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141973843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
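The entry above builds on the mean teacher architecture and reports results as mean Dice scores. As a rough illustration of those two ingredients, the sketch below shows the usual exponential-moving-average (EMA) teacher update and a Dice score for binary masks. It is a generic sketch, not the paper's code: the flat parameter lists, the `alpha` value, and the `eps` smoothing term are assumptions.

```python
def ema_update(teacher_params, student_params, alpha=0.99):
    """Mean-teacher style update: the teacher tracks an exponential moving
    average of the student's parameters after each training step."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_params, student_params)]


def dice_score(pred, target, eps=1e-8):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for flat binary masks.

    eps (an assumed smoothing constant) avoids division by zero when
    both masks are empty.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)
```

Because the teacher averages over many past student states, its predictions tend to change more smoothly than the student's, which is the property the abstract relies on when it uses the teacher to correct noisy labels during training.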
Online optimal tracking control of unknown nonlinear singularly perturbed systems using single network adaptive critic with improved learning
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-13 DOI: 10.1007/s40747-024-01598-7
Zhijun Fu, Bao Ma, Dengfeng Zhao, Yuming Yin
{"title":"Online optimal tracking control of unknown nonlinear singularly perturbed systems using single network adaptive critic with improved learning","authors":"Zhijun Fu, Bao Ma, Dengfeng Zhao, Yuming Yin","doi":"10.1007/s40747-024-01598-7","DOIUrl":"https://doi.org/10.1007/s40747-024-01598-7","url":null,"abstract":"<p>This study is the first time devoted to seek an online optimal tracking solution for unknown nonlinear singularly perturbed systems based on single network adaptive critic (SNAC) design. Firstly, a novel identifier with more efficient parametric multi-time scales differential neural network (PMTSDNN) is developed to obtain the unknown system dynamics. Then, based on the identification results, the online optimal tracking controller consists of an adaptive steady control term and an optimal feedback control term is developed by using SNAC to solve the Hamilton–Jacobi–Bellman (HJB) equation online. New learning law considering filtered parameter identification error is developed for the PMTSDNN identifier and the SNAC, which can realize online synchronous learning and fast convergence. The Lyapunov approach is synthesized to ensure the convergence characteristics of the overall closed loop system consisting of the PMTSDNN identifier, the SNAC and the optimal tracking control policy. 
Three examples are provided to illustrate the effectiveness of the investigated method.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"7 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141973842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Segmentation-aware relational graph convolutional network with multi-layer CRF for nested named entity recognition
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-10 DOI: 10.1007/s40747-024-01551-8
Daojun Han, Zemin Wang, Yunsong Li, Xiangbo ma, Juntao Zhang
{"title":"Segmentation-aware relational graph convolutional network with multi-layer CRF for nested named entity recognition","authors":"Daojun Han, Zemin Wang, Yunsong Li, Xiangbo ma, Juntao Zhang","doi":"10.1007/s40747-024-01551-8","DOIUrl":"https://doi.org/10.1007/s40747-024-01551-8","url":null,"abstract":"<p>Named Entity Recognition (NER) is fundamental in natural language processing, involving identifying entity spans and types within a sentence. Nested NER contains other entities, which pose a significant challenge, especially pronounced in the domain of medical-named entities due to intricate nesting patterns inherent in medical terminology. Existing studies can not capture interdependencies among different entity categories, resulting in inadequate performance in nested NER tasks. To address this problem, we propose a novel <b>L</b>ayer-based architecture with <b>S</b>egmentation-aware <b>R</b>elational <b>G</b>raph <b>C</b>onvolutional <b>N</b>etwork (LSRGCN) for Nested NER in the medical domain. LSRGCN comprises two key modules: a shared segmentation-aware encoder and a multi-layer conditional random field decoder. The former part provides token representation including boundary information from sentence segmentation. The latter part can learn the connections between different entity classes and improve recognition accuracy through secondary decoding. We conduct experiments on four datasets. Experimental results demonstrate the effectiveness of our model. 
Additionally, extensive studies are conducted to enhance our understanding of the model and its capabilities.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"36 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141910469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Generalized spatial–temporal regression graph convolutional transformer for traffic forecasting
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-10 DOI: 10.1007/s40747-024-01578-x
Lang Xiong, Liyun Su, Shiyi Zeng, Xiangjing Li, Tong Wang, Feng Zhao
{"title":"Generalized spatial–temporal regression graph convolutional transformer for traffic forecasting","authors":"Lang Xiong, Liyun Su, Shiyi Zeng, Xiangjing Li, Tong Wang, Feng Zhao","doi":"10.1007/s40747-024-01578-x","DOIUrl":"https://doi.org/10.1007/s40747-024-01578-x","url":null,"abstract":"<p>Spatial–temporal data is widely available in intelligent transportation systems, and accurately solving non-stationary of spatial–temporal regression is critical. In most traffic flow prediction research, the non-stationary solution of deep spatial–temporal regression tasks is typically formulated as a spatial–temporal graph modeling problem. However, there are several issues: (1) the coupled spatial–temporal regression approach renders it unfeasible to accurately learn the dependencies of diverse modalities; (2) the intricate stacking design of deep spatial–temporal network modules limits the interpretation and migration capability; (3) the ability to model dynamic spatial–temporal relationships is inadequate. To tackle the challenges mentioned above, we propose a novel unified spatial–temporal regression framework named Generalized Spatial–Temporal Regression Graph Convolutional Transformer (GSTRGCT) that extends panel model in spatial econometrics and combines it with deep neural networks to effectively model non-stationary relationships of spatial–temporal regression. Considering the coupling of existing deep spatial–temporal networks, we introduce the tensor decomposition to explicitly decompose the panel model into a tensor product of spatial regression on the spatial hyper-plane and temporal regression on the temporal hyper-plane. On the spatial hyper-plane, we present dynamic adaptive spatial weight network (DASWNN) to capture the global and local spatial correlations. 
Specifically, DASWNN adopts spatial weight neural network (SWNN) to learn the semantic global spatial correlation and dynamically adjusts the local changing spatial correlation by multiplying between spatial nodes embedding. On the temporal hyper-plane, we introduce the Auto-Correlation attention mechanism to capture the period-based temporal dependence. Extensive experiments on the two real-world traffic datasets show that GSTRGCT consistently outperforms other competitive methods with an average of 62% and 59% on predictive performance.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"103 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141910465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Repmono: a lightweight self-supervised monocular depth estimation architecture for high-speed inference
IF 5.8, Zone 2, Computer Science
Complex & Intelligent Systems Pub Date : 2024-08-10 DOI: 10.1007/s40747-024-01575-0
Guowei Zhang, Xincheng Tang, Li Wang, Huankang Cui, Teng Fei, Hulin Tang, Shangfeng Jiang
{"title":"Repmono: a lightweight self-supervised monocular depth estimation architecture for high-speed inference","authors":"Guowei Zhang, Xincheng Tang, Li Wang, Huankang Cui, Teng Fei, Hulin Tang, Shangfeng Jiang","doi":"10.1007/s40747-024-01575-0","DOIUrl":"https://doi.org/10.1007/s40747-024-01575-0","url":null,"abstract":"<p>Self-supervised monocular depth estimation has always attracted attention because it does not require ground truth data. Designing a lightweight architecture capable of fast inference is crucial for deployment on mobile devices. The current network effectively integrates Convolutional Neural Networks (CNN) with Transformers, achieving significant improvements in accuracy. However, this advantage comes at the cost of an increase in model size and a significant reduction in inference speed. In this study, we propose a network named Repmono, which includes LCKT module with a large convolutional kernel and RepTM module based on the structural reparameterisation technique. With the combination of these two modules, our network achieves both local and global feature extraction with a smaller number of parameters and significantly enhances inference speed. Our network, with 2.31MB parameters, shows significant accuracy improvements over Monodepth2 in experiments on the KITTI dataset. With uniform input dimensions, our network’s inference speed is 53.7% faster than R-MSFM6, 60.1% faster than Monodepth2, and 81.1% faster than MonoVIT-small. 
Our code is available at https://github.com/txc320382/Repmono.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"12 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141910475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0