Applied Intelligence: Latest Articles

One-bit distributed compressed sensing with partial gaussian circulant matrices
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-06 DOI: 10.1007/s10489-025-06599-8
Yuke Leng, Jingyao Hou, Xinling Liu, Jianjun Wang
Abstract: One-bit distributed compressed sensing has been widely used in multi-node networks and many other fields. Conventional approaches often employ random Gaussian measurement matrices, but these unstructured matrices demand significant memory and computational resources. To address this limitation, we propose the use of structured partial Gaussian circulant matrices. This kind of matrix facilitates faster matrix operations and permits low storage, making it more practical. To the best of our knowledge, we are the first to theoretically prove that these matrices satisfy the \(\ell_1/\ell_{2,1}\)-RIP in one-bit distributed compressed sensing. We prove that the required number of measurements under partial Gaussian circulant measurements is of the same order as that for Gaussian measurements, while being more computationally efficient. Furthermore, numerical experiments confirm that partial Gaussian circulant matrices and random Gaussian matrices exhibit comparable reconstruction performance. Additionally, partial Gaussian circulant matrices require less recovery time, offering higher computational efficiency.
Volume 55, Issue 10
Citations: 0
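The measurement scheme named in the abstract above can be illustrated with a minimal sketch. It assumes a single node, a circulant matrix defined by its first column `g`, a uniformly random row subset, and sign quantization; the function names, sizes, and row-selection rule are illustrative assumptions, not details taken from the paper. The point it shows is that only `n` Gaussian values need to be stored and that the product can be computed with the FFT instead of a dense matrix.

```python
import numpy as np

def partial_circulant_one_bit(x, g, rows):
    """Return sign(R C_g x), where C_g is the circulant matrix whose first
    column is g and R keeps only the rows listed in `rows`."""
    # A circulant matrix-vector product is a circular convolution of g and x,
    # which the FFT computes in O(n log n) time.
    full = np.fft.ifft(np.fft.fft(g) * np.fft.fft(x)).real
    y = np.sign(full[rows])
    y[y == 0] = 1.0  # map exact zeros to +1 so each measurement is one bit
    return y

def dense_circulant(g):
    """Explicit n-by-n circulant matrix with first column g (for checking only)."""
    n = len(g)
    return np.array([[g[(i - j) % n] for j in range(n)] for i in range(n)])

rng = np.random.default_rng(0)
n, m, k = 64, 32, 4                              # illustrative sizes (assumed)
g = rng.standard_normal(n)                       # only n Gaussian values are stored
rows = rng.choice(n, size=m, replace=False)      # assumed: uniform random row subset
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)  # k-sparse signal

y_fast = partial_circulant_one_bit(x, g, rows)

# Reference computation with the dense matrix, to compare against the FFT route.
y_dense = np.sign(dense_circulant(g)[rows] @ x)
y_dense[y_dense == 0] = 1.0
print("bits that agree:", int(np.sum(y_fast == y_dense)), "of", m)
```

The dense helper exists only to check the FFT path; a practical implementation would never form the full matrix. The distributed, jointly sparse setting the paper analyzes would repeat this per node and is not modeled here.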
Selective output smoothing regularization: Regularize neural networks by softening output distributions
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-06 DOI: 10.1007/s10489-025-06539-6
Xuan Cheng, Tianshu Xie, Xiaomin Wang, Meiyi Yang, Jiali Deng, Minghui Liu, Ming Liu
Abstract: Convolutional neural networks (CNNs) often exhibit overfitting due to overconfident predictions, which limits the effective utilization of training samples. Inspired by the diverse effects of training from different samples, we propose selective output smoothing regularization (SOSR) that improves model performance by encouraging the generation of equal logits on incorrect classes when handling samples that are correctly and overconfidently classified. This plug-and-play approach integrates seamlessly into diverse CNN architectures without altering their core design. SOSR demonstrates consistent improvements on various benchmarks, such as a 1.1% accuracy gain on ImageNet with ResNet-50 (77.30%). It synergizes effectively with several widely used techniques, such as CutMix and label smoothing, achieving incremental benefits, highlighting its potential as a foundational tool in advancing deep learning applications. Overall, SOSR effectively alleviates underutilization of high-confidence samples, enhances the generalizability of CNNs, and emerges as a robust tool for improving deep learning applications.
Volume 55, Issue 10. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-025-06539-6.pdf
Citations: 0
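One way to read the idea in the abstract above is as a rule for building soft training targets. The sketch below is an assumption-laden illustration, not the authors' formulation: the confidence threshold `tau`, the selection rule, and the choice to keep the model's own confidence on the true class are all hypothetical. For samples that are both correct and overconfident, the remaining probability mass is spread equally over the incorrect classes; all other samples keep one-hot targets.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def selective_smooth_targets(logits, labels, tau=0.9):
    """Build targets that are one-hot by default and softened only for
    samples that are correctly classified with confidence above tau
    (tau is a hypothetical parameter)."""
    probs = softmax(logits)
    n, c = probs.shape
    idx = np.arange(n)
    targets = np.zeros((n, c))
    targets[idx, labels] = 1.0
    p_true = probs[idx, labels]
    selected = (probs.argmax(axis=1) == labels) & (p_true > tau)
    for i in np.flatnonzero(selected):
        # Equal mass on every incorrect class; true class keeps p_true.
        targets[i, :] = (1.0 - p_true[i]) / (c - 1)
        targets[i, labels[i]] = p_true[i]
    return targets, selected

logits = np.array([[10.0, 0.0, 0.0],   # correct and overconfident
                   [1.0, 0.9, 0.8],    # correct but not confident
                   [0.0, 10.0, 0.0]])  # misclassified
labels = np.array([0, 0, 2])
targets, selected = selective_smooth_targets(logits, labels)
print(selected)
print(targets)
```

Training against such targets with a cross-entropy or KL term would push the incorrect-class outputs of the selected samples toward a common value, which is the behavior the abstract describes.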
Dual-decoupling inter-correction multitemporal framework for high-, medium-, and low-resolution optical remote sensing image reconstruction
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-06 DOI: 10.1007/s10489-025-06522-1
Weiling Liu, Changqing Huang, Yonghua Jiang, Jingyin Wang, Guo Zhang, Huaibo Song, Xinghua Li
Abstract: Reconstructing missing information due to cloud occlusion is an effective means of enhancing the utilization of low-, medium-, and high-resolution optical remote sensing images. However, singletemporal-based methods have limitations regarding the demand for cloud-free reference data and the applicability of specific data-driven models to real-world scenarios. They are also unable to perform multitemporal reconstruction. To address this, we propose the Dual-Decoupling Inter-correction Multitemporal Reconstruction network (DDIM-RecNet), a unified framework designed for single- and multitemporal cloud occlusion reconstruction of low-, medium-, and high-resolution images. DDIM-RecNet innovatively decouples remote sensing images into ground object and imaging environment components using dedicated inter-correction modules, coupled with targeted loss functions. Additionally, an imaging environment enhancement module ensures spatial consistency between reconstructed and original regions. Compared with classical models, such as U-Net, RFR-Net, STGAN, PSTCR, BSN, GLDF-RecNet, and IDF-CR, DDIM-RecNet achieved excellent visual reconstruction results and the best quantitative evaluation indicators under Gaofen-1 (2 m), Sentinel-2 (10 m), Landsat-8 (30 m) single/multitemporal images. Taking Gaofen-1 (2 m) as an example, compared with the suboptimal model, the clarity of the DDIM-RecNet model in the three bands was improved by 0.44, 0.70, and 0.85 respectively under singletemporal reconstruction; the clarity of DDIM-RecNet was improved by 0.55, 0.43, and 0.35 respectively under multitemporal cloud occlusion.
Volume 55, Issue 10. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-025-06522-1.pdf
Citations: 0
Deep RGB-guided generative network for unsupervised hyperspectral image super-resolution
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-05 DOI: 10.1007/s10489-025-06595-y
Xian-Hua Han, Zhe Liu
Abstract: Hyperspectral image (HSI) super-resolution (SR) aims to mathematically generate a high spatial resolution hyperspectral (HR-HS) image by merging the degraded observations: a low spatial resolution hyperspectral (LR-HS) image and a high spatial resolution multispectral or RGB (HR-MS/RGB) image. Currently, deep convolution network-based paradigms have been extensively explored to automatically learn the inherent priors of the latent HR-HS images and have shown remarkable performance progress. However, existing methods are usually realized in a fully supervised manner and require preparing in advance a large external dataset containing the degraded observations (the LR-HS/HR-RGB image) and the corresponding HR-HS ground truth, which are difficult to collect, especially in the HSI SR scenario. To this end, this study proposes a novel unsupervised HSI SR method by using only the observed degradation data without any other external sample. Specifically, we use a deep RGB-guided generative network to generate the target HR-HS image with an encoder-decoder-based network. Since the observed HR-RGB image has a more detailed spatial structure, which may have better compatibility with the 2D convolution operation, we take the observed HR-RGB image as the network input to serve as the conditional guidance, while using the degraded observations to construct the loss function to guide the network learning. Experimental results on several benchmark HS image datasets demonstrate that the proposed unsupervised method achieves superior performance over various SoTA paradigms.
Volume 55, Issue 10
Citations: 0
Quadruped robot locomotion via soft actor-critic with muti-head critic and dynamic policy gradient
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-05 DOI: 10.1007/s10489-025-06584-1
Yanan Fan, Zhongcai Pei, Hongbing Shi, Meng Li, Tianyuan Guo, Zhiyong Tang
Abstract: Quadruped robots’ nonlinear complexity makes traditional modeling challenging, while deep reinforcement learning (DRL) learns effectively through direct environmental interaction without explicit kinematic and dynamic models, becoming an efficient approach for quadruped locomotion across diverse terrains. Conventional reinforcement learning methods typically combine multiple reward criteria into a single scalar function, limiting information representation and complicating the balance between multiple control objectives. We propose a novel multi-head critic and dynamic policy gradient SAC (MHD-SAC) algorithm, innovatively combining a multi-head critic architecture that independently evaluates distinct reward components and a dynamic policy gradient method that adaptively adjusts weights based on current performance. Through simulations on both flat and uneven terrains comparing three approaches (Soft Actor-Critic (SAC), multi-head critic SAC (MH-SAC), and MHD-SAC), we demonstrate that the MHD-SAC algorithm achieves significantly faster learning convergence and higher cumulative rewards than conventional methods. Performance analysis across different reward components reveals MHD-SAC’s superior ability to balance multiple objectives. The results validate that our approach effectively addresses the challenges of multi-objective optimization in quadruped locomotion control, providing a promising foundation for developing more versatile and robust legged robots capable of traversing complex environments.
Volume 55, Issue 10
Citations: 0
Real-numbered singular matrix transformation for non-invertible and cancelable biometric templates
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-05 DOI: 10.1007/s10489-025-06534-x
Onkar Singh, Ajay Jaiswal, Naveen Kumar
Abstract: Cancelable biometrics mitigate privacy and security concerns in biometric-based user authentication by transforming biometric data into non-invertible templates. However, achieving non-invertibility often comes at the cost of reduced discriminability. This paper presents RP-SmXOR, a novel approach for generating cancelable biometric templates, leveraging person-specific real-numbered singular matrices for non-invertible transformation. By combining random permutation, Bitwise-XOR, and the Hadamard product, RP-SmXOR retains and enhances the discriminative information in the templates while addressing the privacy and security concerns associated with traditional biometric authentication. The proposed method was extensively evaluated on seven diverse biometric databases, demonstrating superior performance compared to state-of-the-art random permutation-based techniques. A thorough privacy and security analysis, including brute-force, false acceptance, Attack via Record Multiplicity (ARM), and inverse attacks, along with similarity metrics, confirms the non-invertibility, security, and robustness of the generated templates. Thus, RP-SmXOR adheres to the key principles of cancelable biometrics while significantly improving recognition accuracy and establishing it as a promising solution for secure biometric authentication.
Volume 55, Issue 10
Citations: 0
Adaptive unsupervised deep learning denoising for medical imaging with unbiased estimation and Hessian-based regularization
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-05 DOI: 10.1007/s10489-025-06591-2
Cheng Zhang, Kin Sam Yen
Abstract: This paper introduces an adaptive, unsupervised deep learning model for denoising Gaussian noise in Magnetic Resonance Imaging (MRI) images. The model is combined with Deep Image Prior (DIP) and Stein's Unbiased Risk Estimate (SURE) and incorporates a regularization term based on the Frobenius norm of the Hessian matrix. Leveraging the SURE criterion, the observed noisy image is used as the network input, significantly accelerating convergence speed and achieving more than a tenfold improvement over DIP. The real-time, adaptive adjustment of regularization intensity, driven by SURE, ensures robust performance across varying noise levels while effectively balancing the preservation of fine image details with noise elimination. The Hessian-based regularization captures second-order variations, promoting smoothness while preserving critical structural details. Experimental results demonstrate the model's superiority, with an average 8.7% increase in PSNR and a 10.1% increase in SSIM compared to DIP. Furthermore, by the 25th iteration, the SSIM value of the proposed method had already surpassed the peak value reached by DIP at the 700th iteration and by DIP variants at the 2000th iteration. These advantages, coupled with the adaptive regularization strength adjustment, demonstrate the model's potential to enhance diagnostic accuracy and efficiency in medical applications.
Volume 55, Issue 10
Citations: 0
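The Hessian-based term mentioned in the abstract above can be illustrated with a small finite-difference sketch. It assumes a 2D grayscale image, `np.gradient` as the discretization, and a penalty equal to the sum over pixels of the squared Frobenius norm of the Hessian; the paper's actual discretization, weighting, and SURE-driven adaptation are not given in the abstract and are not modeled here.

```python
import numpy as np

def hessian_frobenius_penalty(img):
    """Sum over pixels of the squared Frobenius norm of a discrete Hessian.

    np.gradient uses central differences in the interior and one-sided
    differences at the borders (an assumed discretization).
    """
    gy, gx = np.gradient(img)      # first derivatives along rows, columns
    gyy, gyx = np.gradient(gy)     # second derivatives of gy
    gxy, gxx = np.gradient(gx)     # second derivatives of gx
    return float(np.sum(gxx ** 2 + gyy ** 2 + gxy ** 2 + gyx ** 2))

rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:32, 0:32].astype(float)
plane = 2.0 * xx + 3.0 * yy + 1.0                       # linear intensity ramp
noisy = plane + 0.1 * rng.standard_normal(plane.shape)  # same ramp plus Gaussian noise

print("penalty on ramp :", hessian_frobenius_penalty(plane))
print("penalty on noisy:", hessian_frobenius_penalty(noisy))
```

A linear ramp has no second-order variation, so the penalty leaves it alone, whereas a first-order penalty such as total variation would also penalize the ramp. That is the sense in which a second-order term can promote smoothness while tolerating gradual intensity changes.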
Small sample pipeline DR defect detection based on smooth variational autoencoder and enhanced detection head faster RCNN
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-03 DOI: 10.1007/s10489-025-06590-3
Ting Zhang, Tianyang You, Zhaoying Liu, Sadaqat Ur Rehman, Yanan Shi, Amr Munshi
Abstract: The safe operation of gas pipelines is crucial for the safety of residents’ lives and property. However, accurately detecting defects within these gas pipelines is a challenging task. To improve the accuracy of defect detection in pipeline DR images with small sample sizes, we propose an enhanced Faster RCNN model based on a Smooth Variational Autoencoder and Enhanced Detection Head (S-EDH-Faster RCNN). This model leverages a smooth variational autoencoder to reconstruct features and enhances classification scores through an improved detection head, thereby boosting overall detection accuracy. In detail, to address the issue of scarce training samples for new categories, we design a smooth variational autoencoder to reconstruct features that better fit the distribution of training data. Furthermore, to refine classification precision, we present an enhanced detection head that incorporates a convolutional block attention-based center point classification calibration module, which strengthens classification-related portions of the RoI features and adjusts classification scores accordingly. Finally, to effectively learn characteristics of novel class samples, we introduce an adaptive fine-tuning method that adaptively updates key convolutional kernels during the fine-tuning stage, enabling the model to generalize better to novel classes. Experimental results demonstrate that our approach achieves superior detection performance over state-of-the-art models on both the home-made PIP-DET dataset and the publicly available NEU-DET dataset, demonstrating its effectiveness.
Volume 55, Issue 10. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-025-06590-3.pdf
Citations: 0
Deep metric learning-based side-channel analysis with improved robustness and efficiency
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-02 DOI: 10.1007/s10489-025-06586-z
Kaibin Li, Yihuai Liang, Hua Meng, Zhengchun Zhou
Abstract: Side-channel analysis (SCA) is one of the widely studied approaches for assessing vulnerabilities in cryptographic algorithm implementations. Existing deep learning (DL)-based SCA approaches are commonly dataset-specific, and their attack performance heavily depends on optimal hyperparameters and effective neural network architectures. Searching for such hyperparameters and architectures can be very time-consuming. In addition, traditional machine learning (ML)-based SCA methods often require manual feature engineering, leading to information loss and limiting attack performance. To address these challenges, we propose a profiled SCA model based on deep metric learning (DML) with template attacks (TA). This novel approach improves dataset generalization, enhances feature extraction, and reduces the reliance on hyperparameters. Specifically, a normalized lifted structured (NLS) loss is designed for the proposed attack model. Then, a label-informed hybrid distance is subtly integrated into the model to enhance the model’s ability to capture relationships between embeddings and labels, thereby improving the attack performance and robustness. Next, a similarity learning method is designed by evaluating all pairwise distances within a mini-batch, reducing sensitivity to triplet selection and improving training efficiency. Experimental results show that the proposed model significantly outperforms the state-of-the-art DL-based SCA methods. It achieves attack performance improvements of up to 50.0% and an average improvement of 37.9% on public datasets, while being 30.8% faster in network training. Comprehensive evaluations show that the proposed model provides high efficiency, robust performance, and strong generalization across diverse datasets and leakage models.
Volume 55, Issue 10
Citations: 0
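The abstract above mentions evaluating all pairwise distances within a mini-batch instead of sampling triplets. The sketch below illustrates that general idea with a generic contrastive-style margin loss; it is not the paper's normalized lifted structured (NLS) loss or its label-informed hybrid distance, and the function names and `margin` parameter are hypothetical.

```python
import numpy as np

def pairwise_distances(emb):
    """Euclidean distance between every pair of rows in emb (batch x dim)."""
    sq = np.sum(emb ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * emb @ emb.T
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from rounding

def all_pairs_margin_loss(emb, labels, margin=1.0):
    """Use every pair in the batch: pull same-label pairs together and push
    different-label pairs at least `margin` apart (assumed generic form)."""
    d = pairwise_distances(emb)
    same = labels[:, None] == labels[None, :]
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)  # count each pair once
    pos = d[same & upper]
    neg = d[~same & upper]
    pos_term = pos.mean() if pos.size else 0.0
    neg_term = np.maximum(margin - neg, 0.0).mean() if neg.size else 0.0
    return float(pos_term + neg_term)

emb = np.array([[0.0, 0.0], [0.0, 0.0], [3.0, 4.0], [3.0, 4.0]])
well_separated = all_pairs_margin_loss(emb, np.array([0, 0, 1, 1]))
mixed_up = all_pairs_margin_loss(emb, np.array([0, 1, 0, 1]))
print(well_separated, mixed_up)
```

Because every pair contributes, no mining step is needed to pick informative triplets, which is the property the abstract attributes to its similarity learning method.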
VFE: A large-scale video future event description dataset for evaluating video temporal prediction
IF 3.4, Zone 2, Computer Science
Applied Intelligence Pub Date : 2025-05-02 DOI: 10.1007/s10489-025-06547-6
Chenghang Lai, Haibo Wang
Abstract: Given a video, humans can predict subsequent events in the video and generate reasonable descriptions based on the acquired information and prior knowledge. This ability requires in-depth analysis of dynamic visual information in videos and the comprehensive use of extensive world knowledge for logical reasoning and prediction. However, current visual systems have not yet reached a satisfactory level regarding similar temporal prediction capability. To evaluate this new application, we construct a dataset called VFE (Video Future Event Description), a large-scale dataset for subsequent video event prediction. The VFE dataset contains over 84K video clips, and each clip is equipped with a video and description of the premise event and a predicted description of the subsequent events. To evaluate video temporal prediction, we propose a task, video future event prediction, to generate possible future event descriptions for subsequent unseen video clips based on the premise video. In this paper, we also propose a baseline model for evaluating the VFE dataset. The experimental results indicate the challenge of this task, and the ability of the visual system in complex video temporal prediction needs to be further explored. The dataset and code are available at https://github.com/keyancaigou/VFE.
Volume 55, Issue 10
Citations: 0
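Contact us: info@booksci.cn. Book学术 provides a free academic resource search service that helps scholars in China and abroad find Chinese- and English-language literature, and aims to offer the most convenient, high-quality service experience. Copyright © 2023 布克学术 All rights reserved.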