CAAI Transactions on Intelligence Technology: Latest Articles

Prediction and optimisation of gasoline quality in petroleum refining: The use of machine learning model as a surrogate in optimisation framework
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-05-13 DOI: 10.1049/cit2.12324
Husnain Saghir, Iftikhar Ahmad, Manabu Kano, Hakan Caliskan, Hiki Hong
Abstract: Hardware-based sensing frameworks such as cooperative fuel research engines are conventionally used to monitor research octane number (RON) in the petroleum refining industry. Machine learning techniques are employed to predict the RON of integrated naphtha reforming and isomerisation processes. A dynamic Aspen HYSYS model was used to generate data by introducing artificial uncertainties in the range of ±5% in process conditions, such as temperature and flow rates. The generated data was used to train support vector machines (SVM), Gaussian process regression (GPR), artificial neural networks (ANN), regression trees (RT), and ensemble trees (ET). Hyperparameter tuning was performed to enhance the prediction capabilities of the GPR, ANN, SVM, ET and RT models. Performance analysis of the models indicates that GPR, ANN, and SVM, with R² values of 0.99, 0.978, and 0.979 and RMSE values of 0.108, 0.262, and 0.258, respectively, performed better than the remaining models and were able to capture the dependence of RON on the predictor variables. ET and RT had R² values of 0.94 and 0.89, respectively. The GPR model was used as a surrogate model for fitness function evaluations in two optimisation frameworks based on a genetic algorithm and the particle swarm method. Optimal parameter values found by the optimisation methodology increased the RON value by 3.52%. The proposed methodology of surrogate-based optimisation will provide a platform for plant-level implementation to realise the concept of Industry 4.0 in the refinery.
Volume 9, Issue 5, pp. 1185-1198
Citations: 0
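The surrogate-based optimisation idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `surrogate_ron` is a hypothetical toy function standing in for a trained GPR predictor, the two decision variables and their bounds are invented, and the swarm settings (`w`, `c1`, `c2`) are common textbook defaults rather than values from the paper.

```python
import random

def surrogate_ron(x):
    """Hypothetical stand-in for a trained surrogate (e.g. a GPR predictor).

    Toy function with a known maximum of 95.0 at x = (0.3, -0.2).
    """
    return 95.0 - (x[0] - 0.3) ** 2 - (x[1] + 0.2) ** 2

def pso_maximise(f, bounds, n_particles=20, n_iters=100, seed=0):
    """Basic global-best particle swarm optimisation (maximisation)."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social weights
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # personal best positions
    best_val = [f(p) for p in pos]          # personal best values
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the allowed range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = f(pos[i])  # cheap surrogate call instead of a plant simulation
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val > g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

best_x, best_ron = pso_maximise(surrogate_ron, bounds=[(-1.0, 1.0), (-1.0, 1.0)])
```

The point of the design is that every fitness evaluation hits the inexpensive surrogate, so the optimiser can afford thousands of evaluations that would be impractical against a full process simulation.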
Inferring causal protein signalling networks from single-cell data based on parallel discrete artificial bee colony algorithm
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-05-11 DOI: 10.1049/cit2.12344
Jinduo Liu, Jihao Zhai, Junzhong Ji
Abstract: Inferring causal protein signalling networks from human immune system cell data is a promising approach to unravel the underlying tissue signalling biology and dysfunction in diseased cells, which has attracted considerable attention within the bioinformatics field. Recently, Bayesian network (BN) techniques have gained significant popularity in inferring causal protein signalling networks from multiparameter single-cell data. However, current BN methods may exhibit high computational complexity and ignore interactions among protein signalling molecules from different single cells. A novel BN method, named PDABC, is presented for learning causal protein signalling networks based on a parallel discrete artificial bee colony. Specifically, PDABC is a score-based BN method that utilises the parallel artificial bee colony to search for the globally optimal causal protein signalling networks with the highest discrete K2 metric. The experimental results on several simulated datasets, as well as a previously published multi-parameter fluorescence-activated cell sorter dataset, indicate that PDABC surpasses the existing state-of-the-art methods in terms of performance and computational efficiency.
Volume 9, Issue 6, pp. 1587-1604
Citations: 0
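For readers unfamiliar with the K2 metric mentioned above, the sketch below computes the standard log K2 score of a single node given a candidate parent set, which is the kind of quantity a score-based search would maximise. It is only an illustration of the scoring function: the function name `log_k2_node`, the toy samples, and the data layout are assumptions, and the bee colony search itself is not shown.

```python
from collections import Counter
from math import lgamma

def log_k2_node(data, child, parents, arity):
    """Log K2 score of one node given a candidate parent set.

    data   : list of tuples of discrete states (one tuple per sample)
    child  : column index of the node being scored
    parents: list of column indices of its candidate parents
    arity  : number of states r the child can take
    """
    n_ij = Counter()   # counts per parent configuration j
    n_ijk = Counter()  # counts per (parent configuration j, child state k)
    for row in data:
        cfg = tuple(row[p] for p in parents)
        n_ij[cfg] += 1
        n_ijk[(cfg, row[child])] += 1
    # log prod_j (r - 1)! / (N_ij + r - 1)!
    score = sum(lgamma(arity) - lgamma(n + arity) for n in n_ij.values())
    # + log prod_jk N_ijk!  (unseen states contribute log 0! = 0)
    score += sum(lgamma(n + 1) for n in n_ijk.values())
    return score

# Toy data: column 1 always copies column 0.
samples = [(0, 0)] * 4 + [(1, 1)] * 4
with_parent = log_k2_node(samples, child=1, parents=[0], arity=2)
without_parent = log_k2_node(samples, child=1, parents=[], arity=2)
```

On this toy data the score with column 0 as a parent is higher than the score with no parents, which is the behaviour a structure search relies on when deciding whether to add an edge.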
Residual multimodal Transformer for expression-EEG fusion continuous emotion recognition
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-05-08 DOI: 10.1049/cit2.12346
Xiaofang Jin, Jieyu Xiao, Libiao Jin, Xinruo Zhang
Abstract: Continuous emotion recognition aims to predict emotional states from affective information, with a focus on the continuous variation of emotion. Fusion of electroencephalography (EEG) and facial expression videos has been used in this field, but current research has limitations, such as hand-engineered features and simple approaches to integration. Hence, a new continuous emotion recognition model, named residual multimodal Transformer (RMMT), is proposed based on the fusion of EEG and facial expression videos. Firstly, ResNet50 and a temporal convolutional network (TCN) are utilised to extract spatiotemporal features from videos, and the TCN is also applied to process the computed EEG frequency power to acquire spatiotemporal features of EEG. Then, a multimodal Transformer is used to fuse the spatiotemporal features from the two modalities. Furthermore, a residual connection is introduced to fuse shallow features with deep features, which is verified through experiments to be effective for continuous emotion recognition. Inspired by knowledge distillation, the authors incorporate a feature-level loss into the loss function to further enhance the network performance. Experimental results show that the RMMT achieves superior performance over other methods on the MAHNOB-HCI dataset. Ablation studies on the residual connection and loss function in the RMMT demonstrate that both are effective.
Volume 9, Issue 5, pp. 1290-1304
Citations: 0
Join multiple Riemannian manifold representation and multi-kernel non-redundancy for image clustering
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-05-08 DOI: 10.1049/cit2.12347
Mengyuan Zhang, Jinglei Liu
Abstract: Image clustering has received significant attention due to the growing importance of image recognition. Researchers have explored Riemannian manifold clustering, which is capable of capturing the non-linear shapes found in real-world datasets. However, the complexity of image data poses substantial challenges for modelling and feature extraction. Traditional methods such as covariance matrices and linear subspace have shown promise in image modelling, but they are still in their early stages and suffer from certain limitations. These include the uncertainty of representing data using only one Riemannian manifold, the limited feature extraction capacity of single kernel functions, and the resulting incomplete data representation and redundancy. To overcome these limitations, the authors propose a novel approach called join multiple Riemannian manifold representation and multi-kernel non-redundancy for image clustering (MRMNR-MKC). It combines covariance matrices with linear subspace to represent data and applies multiple kernel functions to map the non-linear structural data into a reproducing kernel Hilbert space, enabling linear model analysis for image clustering. Additionally, the authors use matrix-induced regularisation to improve the clustering kernel selection process by reducing redundancy and assigning lower weights to identical kernels. Finally, the authors conducted numerous experiments to evaluate the performance of their approach, confirming its superiority to state-of-the-art methods on three benchmark datasets.
Volume 9, Issue 5, pp. 1305-1319
Citations: 0
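The intuition behind penalising redundant kernels can be illustrated with a small sketch. Everything here is an assumption made for illustration: the kernels are computed on plain Euclidean vectors rather than on Riemannian manifold representations, the penalty is the common quadratic form w^T M w with M[p][q] equal to the Frobenius inner product of kernels p and q, and the helper names are hypothetical. It is not the MRMNR-MKC objective itself.

```python
from math import exp, sqrt

def gaussian_kernel(xs, gamma):
    """Gaussian (RBF) kernel matrix on Euclidean points."""
    return [[exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y))) for y in xs]
            for x in xs]

def polynomial_kernel(xs, degree):
    """Inhomogeneous polynomial kernel matrix on Euclidean points."""
    return [[(1.0 + sum(a * b for a, b in zip(x, y))) ** degree for y in xs]
            for x in xs]

def frobenius_inner(k1, k2):
    """Sum of element-wise products of two kernel matrices."""
    return sum(a * b for r1, r2 in zip(k1, k2) for a, b in zip(r1, r2))

def normalise(k):
    """Scale a kernel matrix to unit Frobenius norm so kernels are comparable."""
    norm = sqrt(frobenius_inner(k, k))
    return [[v / norm for v in row] for row in k]

def redundancy_penalty(kernels, weights):
    """Quadratic penalty w^T M w with M[p][q] = <K_p, K_q>_F."""
    return sum(wp * wq * frobenius_inner(kp, kq)
               for wp, kp in zip(weights, kernels)
               for wq, kq in zip(weights, kernels))

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
kernels = [
    normalise(gaussian_kernel(points, gamma=0.5)),
    normalise(gaussian_kernel(points, gamma=0.5)),  # deliberate duplicate
    normalise(polynomial_kernel(points, degree=2)),
]
penalty_duplicates = redundancy_penalty(kernels, [0.5, 0.5, 0.0])
penalty_distinct = redundancy_penalty(kernels, [0.5, 0.0, 0.5])
```

Putting all the weight on the two identical kernels yields the largest possible penalty, while splitting it across two different kernels yields a smaller one, so minimising such a term pushes weight away from duplicated kernels.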
DeepGCN based on variable multi-graph and multimodal data for ASD diagnosis
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-05-03 DOI: 10.1049/cit2.12340
Shuaiqi Liu, Siqi Wang, Chaolei Sun, Bing Li, Shuihua Wang, Fei Li
Abstract: Accurately diagnosing individuals with autism spectrum disorder (ASD) remains a great challenge in clinical practice, primarily due to the data's high heterogeneity and limited sample size. To tackle this issue, the authors constructed a deep graph convolutional network (GCN) based on variable multi-graph and multimodal data (VMM-DGCN) for ASD diagnosis. Firstly, the functional connectivity matrix was constructed to extract primary features. Then, the authors designed a variable multi-graph construction strategy to capture the multi-scale feature representations of each subject by utilising convolutional filters with varying kernel sizes. Furthermore, the authors brought the non-imaging information into the feature representation at each scale and constructed multiple population graphs based on multimodal data by fully considering the correlation between subjects. After extracting the deeper features of population graphs using the deep GCN (DeepGCN), the authors fused the node features of multiple subgraphs to perform node classification tasks for typical control and ASD patients. The proposed algorithm was evaluated on the Autism Brain Imaging Data Exchange I (ABIDE I) dataset, achieving an accuracy of 91.62% and an area under the curve value of 95.74%. These results demonstrated its outstanding performance compared to other ASD diagnostic algorithms.
Volume 9, Issue 4, pp. 879-893
Citations: 0
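The first step in the abstract above, building a functional connectivity matrix, is commonly done by correlating the time series of brain regions pairwise. The sketch below assumes Pearson correlation, which the abstract does not state explicitly, and uses made-up toy series for three hypothetical regions of interest.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def connectivity_matrix(series):
    """Symmetric matrix of pairwise correlations, one row per region."""
    return [[pearson(a, b) for b in series] for a in series]

# Toy time series for three hypothetical regions of interest.
roi_series = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # rises with region 0
    [4.0, 3.0, 2.0, 1.0],   # falls as region 0 rises
]
fc = connectivity_matrix(roi_series)
```

Each subject then contributes one such matrix, whose entries can serve as the primary features that later stages refine.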
MSO-DETR: Metric space optimization for few-shot object detection
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-05-02 DOI: 10.1049/cit2.12342
Haifeng Sima, Manyang Wang, Lanlan Liu, Yudong Zhang, Junding Sun
Abstract: In the metric-based meta-learning detection model, the distribution of training samples in the metric space has a great influence on the detection performance, and this influence is usually ignored by traditional meta-detectors. In addition, the design of the metric space might be interfered with by the background noise of training samples. To tackle these issues, the authors propose a metric space optimisation method based on hyperbolic geometry attention and class-agnostic activation maps. First, the geometric properties of hyperbolic spaces are used to establish a structured metric space. A variety of feature samples of different classes are embedded into the hyperbolic space with extremely low distortion. This metric space is more suitable for representing tree-like structures between categories for image scene analysis. Meanwhile, a novel similarity measure function based on Poincaré distance is proposed to evaluate the distance of various types of objects in the feature space. In addition, the class-agnostic activation maps (CCAMs) are employed to re-calibrate the weight of foreground feature information and suppress background information. Finally, the decoder processes the high-level feature information as the decoding of the query object and detects objects by predicting their locations and corresponding task encodings. Experimental evaluation is conducted on the Pascal VOC and MS COCO datasets. The experimental results show that the authors' method outperforms strong few-shot detection baselines.
Volume 9, Issue 6, pp. 1515-1533
Citations: 0
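The Poincaré distance that the similarity measure above builds on has a closed form in the Poincaré ball model. The sketch below computes that standard distance for points inside the unit ball; the function name and the example points are illustrative, and the paper's own similarity function (which is only described as being based on this distance) is not reproduced.

```python
from math import acosh

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_norm_u = sum(a * a for a in u)
    sq_norm_v = sum(b * b for b in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))

origin = [0.0, 0.0]
near = [0.5, 0.0]
far = [0.9, 0.0]
d_near = poincare_distance(origin, near)
d_far = poincare_distance(origin, far)
```

Distances grow rapidly as points approach the boundary of the ball: the point at Euclidean radius 0.9 is far more than 1.8 times as distant from the origin as the point at radius 0.5. That expansion is what lets tree-like category hierarchies be embedded with low distortion.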
Bayesian attention-based user behaviour modelling for click-through rate prediction
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-05-01 DOI: 10.1049/cit2.12343
Yihao Zhang, Mian Chen, Ruizhen Chen, Chu Zhao, Meng Yuan, Zhu Sun
Abstract: Exploiting the hierarchical dependence behind user behaviour is critical for click-through rate (CTR) prediction in recommender systems. Existing methods apply attention mechanisms to obtain the weights of items; however, the authors argue that deterministic attention mechanisms cannot capture the hierarchical dependence between user behaviours because they treat each user behaviour as an independent individual and cannot accurately express users' flexible and changeable interests. To tackle this issue, the authors introduce Bayesian attention to the CTR prediction model, which treats attention weights as data-dependent local random variables and learns their distribution by approximating their posterior distribution. Specifically, prior knowledge is built into the attention weight distribution, and posterior inference is then utilised to capture the implicit and flexible user intentions. Extensive experiments on public datasets demonstrate that the proposed algorithm outperforms state-of-the-art algorithms. Empirical evidence shows that random attention weights can predict user intentions better than deterministic ones.
Volume 9, Issue 5, pp. 1320-1330
Citations: 0
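The difference between deterministic and random attention weights can be sketched as below. This is a loose illustration only: it assumes Gaussian noise added to the attention scores before the softmax (equivalent to lognormal noise on the unnormalised weights), with a fixed hypothetical `sigma`, whereas the paper learns the weight distribution through posterior inference. The scores are invented.

```python
import random
from math import exp

def softmax(scores):
    """Deterministic attention weights from raw scores."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_attention(scores, sigma, rng):
    """Draw one random set of attention weights.

    Gaussian noise on the scores makes each unnormalised weight
    lognormally distributed; the softmax then renormalises them.
    """
    noisy = [s + sigma * rng.gauss(0.0, 1.0) for s in scores]
    return softmax(noisy)

rng = random.Random(0)
scores = [2.0, 1.0, 0.1]                 # hypothetical item relevance scores
deterministic = softmax(scores)
stochastic = sample_attention(scores, sigma=0.5, rng=rng)
no_noise = sample_attention(scores, sigma=0.0, rng=rng)
```

With `sigma` set to zero the sampled weights collapse to the deterministic ones; with a positive `sigma` each forward pass sees a slightly different weighting of the user's behaviours.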
MH-HMR: Human mesh recovery from monocular images via multi-hypothesis learning
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-04-29 DOI: 10.1049/cit2.12337
Haibiao Xuan, Jinsong Zhang, Yu-Kun Lai, Kun Li
Abstract: Recovering 3D human meshes from monocular images is an inherently ill-posed and challenging task due to depth ambiguity, joint occlusion, and truncation. However, most existing approaches do not model such uncertainties, typically yielding a single reconstruction for one input. In contrast, this work embraces the ambiguity of the reconstruction and treats the task as an inverse problem for which multiple feasible solutions exist. To address these issues, the authors propose a multi-hypothesis approach, multi-hypothesis human mesh recovery (MH-HMR), to efficiently model the multi-hypothesis representation and build strong relationships among the hypothetical features. Specifically, the task is decomposed into three stages: (1) generating a reasonable set of initial recovery results (i.e., multiple hypotheses) given a single colour image; (2) modelling intra-hypothesis refinement to enhance every single-hypothesis feature; and (3) establishing inter-hypothesis communication and regressing the final human meshes. Meanwhile, the authors take further advantage of multiple hypotheses and the recovery process to achieve human mesh recovery from multiple uncalibrated views. Compared with state-of-the-art methods, the MH-HMR approach achieves superior performance and recovers more accurate human meshes on challenging benchmark datasets, such as Human3.6M and 3DPW, while demonstrating its effectiveness across a variety of settings. The code will be publicly available at https://cic.tju.edu.cn/faculty/likun/projects/MH-HMR.
Volume 9, Issue 5, pp. 1263-1274
Citations: 0
Safety control strategy of spinal lamina cutting based on force and cutting depth signals
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-04-26 DOI: 10.1049/cit2.12341
Jian Zhang, Yonghong Zhang, Shanshan Liu, Xuquan Ji, Sizhuo Liu, Zhuofu Li, Baoduo Geng, Weishi Li, Tianmiao Wang
Abstract: Laminectomy is one of the most common posterior spinal operations. Since the lamina is adjacent to important tissues such as nerves, damage to these tissues can cause serious complications and even lead to paralysis. In order to prevent such injuries and complications, ultrasonic bone scalpels and surgical robots have been introduced into spinal laminectomy, and many scholars have studied methods for recognising the bone tissue status. Currently, almost all methods for recognising bone tissue are based on signals collected by high-precision sensors installed at the end of surgical robots. However, the previous methods could not accurately identify the state of spinal bone tissue. In this work, the identification of bone tissue status was regarded as a time series classification task, and the classification algorithm LSTM-FCN was used to process fusion signals composed of force and cutting depth signals, thus achieving an accurate classification of the lamina bone tissue status. In addition, it was verified that the accuracy of the proposed method could reach 98.85% in identifying the state of porcine spinal laminectomy. Moreover, the maximum penetration distance can be controlled within 0.6 mm, which is safe and can be used in practice.
Volume 9, Issue 4, pp. 894-902
Citations: 0
Sequential selection and calibration of video frames for 3D outdoor scene reconstruction
IF 8.4, Zone 2, Computer Science
CAAI Transactions on Intelligence Technology Pub Date : 2024-04-25 DOI: 10.1049/cit2.12338
Weilin Sun, Manyi Li, Peng Li, Xiao Cao, Xiangxu Meng, Lei Meng
Abstract: 3D scene understanding and reconstruction aims to obtain a concise scene representation from images and reconstruct the complete scene, including the scene layout, object bounding boxes and shapes. Existing holistic scene understanding methods primarily recover scenes from single images, with a focus on indoor scenes. Due to the complexity of the real world, the information provided by a single image is limited, resulting in issues such as object occlusion and omission. Furthermore, captured data from outdoor scenes exhibits characteristics of sparsity, strong temporal dependencies and a lack of annotations. Consequently, the task of understanding and reconstructing outdoor scenes is highly challenging. The authors propose a sparse multi-view images-based 3D scene reconstruction framework (SMSR). It divides the scene reconstruction task into three stages: initial prediction, refinement, and fusion. The first two stages extract 3D scene representations from each viewpoint, while the final stage involves selection, calibration and fusion of object positions and orientations across different viewpoints. SMSR effectively addresses the issue of object omission by utilizing small-scale sequential scene information. Experimental results on the general outdoor scene dataset UrbanScene3D-Art Sci and the authors' proprietary dataset Software College Aerial Time-series Images demonstrate that SMSR achieves superior performance in scene understanding and reconstruction.
Volume 9, Issue 6, pp. 1500-1514
Citations: 0