Visual Computing for Industry, Biomedicine and Art — Latest Articles

MEDI-SLATE: medical imaging slide-lecture aligned teaching ensemble
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 9(1) | Pub Date: 2026-04-14 | DOI: 10.1186/s42492-026-00218-0
Motaleb Hossen Manik, Zabirul Islam, Ge Wang

Abstract: Slide-based lectures remain the primary means by which undergraduate students learn the mathematical, physical, and systems-level foundations of medical imaging. Despite their central educational role, however, no openly available dataset pairs imaging lecture slides with clean, well-aligned explanatory narration suitable for scientific and educational research. The authors introduce MEDI-SLATE (medical imaging slide-lecture aligned teaching ensemble), constructed from a complete undergraduate biomedical engineering medical imaging course. The dataset contains 1117 high-resolution slides paired with refined narration derived from classroom audio through automatic speech recognition, followed by careful manual cleanup. MEDI-SLATE encompasses linear systems, Fourier analysis, signal processing, X-ray physics, computed tomography, positron emission tomography/single-photon emission computed tomography, magnetic resonance imaging, ultrasound, and optical imaging. In addition to the slide-text pairs, the dataset includes lecture-level difficulty tags, key ideas, common student misunderstandings, and practice questions sourced directly from the instructor's materials. A fully reproducible preprocessing pipeline covering slide extraction, narration refinement, alignment, and corpus-level analyses is provided. MEDI-SLATE offers a high-fidelity, openly available resource for medical imaging education, curriculum development, multimodal learning research, and the creation of artificial intelligence-assisted instructional tools, with all data and code released for transparent use and future extension.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13079244/pdf/
Citations: 0
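The alignment step of the pipeline described above can be illustrated with a small sketch. This is not the authors' released code; the data layout (per-slide display intervals, timestamped ASR segments) and the maximal-overlap assignment rule are illustrative assumptions.

```python
# Hypothetical sketch of slide-narration alignment: ASR segments carry
# timestamps, each slide is shown over a known interval, so narration can
# be assigned to slides by maximal temporal overlap.

def align_narration(slide_intervals, asr_segments):
    """slide_intervals: list of (start, end) seconds per slide.
    asr_segments: list of {"start": s, "end": e, "text": t} from ASR.
    Returns one narration string per slide."""
    narration = ["" for _ in slide_intervals]
    for seg in asr_segments:
        overlaps = [
            max(0.0, min(seg["end"], e) - max(seg["start"], s))
            for s, e in slide_intervals
        ]
        best = max(range(len(slide_intervals)), key=lambda i: overlaps[i])
        if overlaps[best] > 0:
            narration[best] = (narration[best] + " " + seg["text"]).strip()
    return narration

slides = [(0, 60), (60, 150)]
segments = [
    {"start": 5, "end": 30, "text": "Convolution is the key operation."},
    {"start": 70, "end": 90, "text": "The Fourier transform diagonalizes it."},
]
print(align_narration(slides, segments))
```

A real pipeline would additionally handle segments straddling slide transitions; here such a segment is assigned wholly to the slide it overlaps most.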
Construction of complex non-uniform rational B-spline volume parametric models with G1 continuity
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 9(1) | Pub Date: 2026-04-09 | DOI: 10.1186/s42492-026-00217-1
Dan Wang, Long Chen, Jiahong Zhang

Abstract: The construction of complex volumetric parametric models has long been a bottleneck in achieving integrated design and simulation modeling. To enhance model quality and simplify the modeling process, this paper proposes a method for improving the continuity of complex volumetric parametric models. First, depending on whether the input consists of design parameters or surface models, a G0/C0-continuous volumetric parametric model is generated using either a creation or a recreation approach, and a new data structure is developed to store the control-point indices and their topological relationships. Next, based on the actual connections among patches in the volumetric models generated by these two methods, continuity conditions are formulated for three different scenarios, and for each scenario the corresponding systems of equations are established to determine the control points. Finally, an algorithm automatically sorts, stores, and adjusts the relevant control points so that the volumetric parametric model satisfies the G1 continuity condition. The generated examples demonstrate that the proposed volumetric parametric modeling method is effective for constructing complex models, significantly improving model quality and rendering the models more suitable for subsequent analysis and processing.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13066061/pdf/
Citations: 0
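The G1 condition such an algorithm enforces can be tested numerically. The sketch below checks a common sufficient condition between two patches sharing a boundary row of control points: the cross-boundary legs on either side of each shared point must be parallel and consistently oriented. The function and tolerance are my own illustration, not the paper's algorithm.

```python
import numpy as np

def is_g1_across_boundary(inner_a, boundary, inner_b, tol=1e-9):
    """Each argument is an (n, 3) array of control points, one row per
    boundary index: the last interior row of patch A, the shared boundary
    row, and the first interior row of patch B."""
    da = boundary - inner_a   # cross-boundary leg leaving patch A
    db = inner_b - boundary   # cross-boundary leg entering patch B
    for u, v in zip(da, db):
        # G1 requires the two legs to be parallel and consistently oriented
        if np.linalg.norm(np.cross(u, v)) > tol or np.dot(u, v) <= 0:
            return False
    return True

inner_a = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
boundary = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
inner_b = np.array([[3.0, 0.0, 0.0], [3.0, 1.0, 0.0]])  # collinear legs
print(is_g1_across_boundary(inner_a, boundary, inner_b))  # → True
```

Unlike C1 continuity, the two legs need not have equal length; only their direction must agree, which is why the check uses a cross product rather than vector equality.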
Review of electroencephalography and electromyography research in robotics: opportunities and challenges
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 9(1) | Pub Date: 2026-03-20 | DOI: 10.1186/s42492-026-00216-2
Zefeng Wang, Meiyan Xu, Junfeng Yao, Yue Yu, Bingbing Hu, Yufei Wang, Yu Wang, Xiaopeng Zhang

Abstract: In the evolving nexus of neuroscience and robotics, the fusion of electroencephalography (EEG) and electromyography (EMG) is emerging as a paradigm-shifting avenue for enhancing human-machine interfaces. EEG, which captures the subtle electrical activity of the brain, offers a potent channel for nuanced brain-machine communication, while EMG serves as a bridge, converting neuromuscular intentions into actionable directives for robotic apparatuses. This review highlights current methodologies in which EEG and EMG not only function separately but also converge to dictate robotic control, unraveling the synergy among cognitive processes, muscular responses, and machine actions. The discussion then navigates the myriad challenges encountered in realizing real-time, seamless integration of these bio-signals with robotics and the innovative solutions poised to address them. The aim is to provide a comprehensive understanding of the interplay between neuroscience and robotics, insight that will help drive breakthroughs in adaptive human-machine collaboration.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13003060/pdf/
Citations: 0
UMIAD-EGMF: unsupervised medical image anomaly detection based on edge guidance and multi-scale flow fusion
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 9(1) | Pub Date: 2026-03-02 | DOI: 10.1186/s42492-026-00215-3
Zhirong Li, Guangfeng Lin, Dou Zhang, Rongxin Huang, Jing Yang

Abstract: Medical imaging technology has advanced rapidly in recent years; however, abnormalities in medical images are often rare and complex, making sample labels difficult to obtain for supervised learning of detection models. Existing unsupervised anomaly detection methods, the mainstream approaches, often struggle with issues such as blurred edges and varying scales of abnormal regions. To address these issues, a novel unsupervised method for medical image anomaly detection is proposed: unsupervised medical image anomaly detection based on edge guidance and multi-scale flow fusion (UMIAD-EGMF). This method extracts rich, scale-adaptive edge information and progressively identifies discriminative information for anomaly detection. UMIAD-EGMF captures contextual information around anomaly boundaries via low-level feature fusion (enhancing boundary details with the edge guidance module, EGM), integrates EGM-extracted edge information into deeper features using the edge aggregation module, and merges multi-scale feature maps to capture common anomaly features (subtle and significant) through multi-scale flow fusion. Experiments on breast ultrasound image (BUSI), brain magnetic resonance imaging (brain MRI), and head computed tomography (head CT) datasets demonstrate that UMIAD-EGMF outperforms state-of-the-art methods. Specifically, on the BUSI dataset, the segmentation area under the precision-recall curve for object localization (AUPRO) of UMIAD-EGMF reaches 63.36%, surpassing that of the multi-scale low-level feature enhancement U-Net (MLFEU-net) by 0.01%; on the brain MRI dataset, its segmentation AUPRO is 90.83%, outperforming MLFEU-net by 0.33%; and on the head CT dataset, its segmentation AUPRO is 62.24%, exceeding MedMAE by 2.37%.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12950837/pdf/
Citations: 0
MetaChest: generalized few-shot learning of pathologies from chest X-rays
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 9(1):4 | Pub Date: 2026-02-06 | DOI: 10.1186/s42492-026-00214-4
Berenice Montalvo-Lezama, Gibran Fuentes-Pineda

Abstract: The limited availability of annotated data presents a major challenge in applying deep learning methods to medical image analysis. Few-shot learning methods aim to recognize new classes from only a few labeled examples. These methods are typically investigated within a standard few-shot learning paradigm in which all classes in a task are new. However, medical applications, such as pathology classification from chest X-rays, often require learning new classes while simultaneously leveraging knowledge of previously known ones, a scenario more closely aligned with generalized few-shot classification. Despite its practical relevance, few-shot learning has rarely been investigated in this context. This study presents MetaChest, a large-scale dataset of 479,215 chest X-rays collected from four public databases. It includes a meta-set partition specifically designed for standard few-shot classification, as well as an algorithm for generating multi-label episodes. Extensive experiments evaluate both the standard transfer learning (TL) approach and an extension of ProtoNet across a wide range of few-shot multi-label classification tasks. The results indicate that increasing the number of classes per episode and the number of training examples per class improves classification performance. Notably, the TL approach consistently outperformed the ProtoNet extension, even though it was not specifically tailored for few-shot learning. Furthermore, higher-resolution images improved accuracy at the cost of additional computation, whereas efficient model architectures achieved performance comparable to larger models with significantly reduced resource requirements.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12876522/pdf/
Citations: 0
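A multi-label episode sampler in the spirit of the dataset's episode-generation algorithm might look like the following. The greedy sampling strategy and all names here are illustrative assumptions, not the published algorithm: each image may carry several pathology labels, so the sampler draws images until every class in the episode has at least k support examples.

```python
import random

def sample_episode(dataset, classes, n_way, k_shot, rng):
    """dataset: list of (image_id, label_list); labels may overlap per image.
    Returns (episode_classes, support) with >= k_shot examples per class."""
    episode_classes = rng.sample(classes, n_way)
    need = {c: k_shot for c in episode_classes}
    # restrict to images carrying at least one episode class, in random order
    pool = [item for item in dataset if set(item[1]) & set(episode_classes)]
    rng.shuffle(pool)
    support = []
    for img, labels in pool:
        useful = [c for c in labels if need.get(c, 0) > 0]
        if useful:
            # keep only the labels relevant to this episode
            support.append((img, [c for c in labels if c in episode_classes]))
            for c in useful:
                need[c] -= 1
        if all(v <= 0 for v in need.values()):
            break
    return episode_classes, support

data = [(i, ["A"]) for i in range(5)] + [(5 + i, ["B"]) for i in range(5)]
cls, support = sample_episode(data, ["A", "B"], n_way=2, k_shot=2,
                              rng=random.Random(0))
print(sorted(cls), len(support))
```

Because one image can satisfy several classes at once, a multi-label episode typically needs fewer images than n_way times k_shot, which is the key difference from single-label episode sampling.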
Advances in photoacoustic imaging reconstruction and quantitative analysis for biomedical applications
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 9(1):3 | Pub Date: 2026-02-01 | DOI: 10.1186/s42492-025-00213-x
Lei Wang, Weiming Zeng, Kai Long, Hongyu Chen, Rongfeng Lan, Li Liu, Wai Ting Siok, Nizhuan Wang

Abstract: Photoacoustic imaging (PAI), a modality that combines the high contrast of optical imaging with the deep penetration of ultrasound, is rapidly transitioning from preclinical research to clinical practice. However, its widespread clinical adoption faces challenges such as the inherent trade-off between penetration depth and spatial resolution, along with the demand for faster imaging speeds. This review comprehensively examines the fundamental principles of PAI, focusing on three primary implementations: photoacoustic computed tomography, photoacoustic microscopy, and photoacoustic endoscopy. It critically analyzes their respective advantages and limitations to provide insights into practical applications. The discussion then extends to recent advances in image reconstruction and artifact suppression, where both conventional and deep learning (DL)-based approaches are highlighted for their roles in enhancing image quality and streamlining workflows. Furthermore, this work explores progress in quantitative PAI, particularly its ability to precisely measure hemoglobin concentration, oxygen saturation, and other physiological biomarkers. Finally, the review outlines emerging trends and future directions, underscoring the transformative potential of DL in shaping the clinical evolution of PAI.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12860771/pdf/
Citations: 0
Development of an optically emulated computed tomography scanner for college education
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 9(1):2 | Pub Date: 2026-01-16 | DOI: 10.1186/s42492-025-00211-z
Md Motaleb Hossen Manik, William Muldowney, Md Zabirul Islam, Ge Wang

Abstract: Computed tomography (CT) is a powerful imaging modality widely used in medicine, research, and industry for noninvasive visualization of internal structures. However, conventional CT systems rely on X-rays, which involve radiation exposure, high equipment costs, and complex regulatory requirements, making them unsuitable for educational or low-resource settings. To address these limitations, we developed a compact, low-cost, optically emulated CT scanner that uses visible light to image semi-transparent specimens. The system consists of a rotating stage enclosed within a light-isolated box, backlight illumination, and a fixed digital single-lens reflex camera. A Teensy 2.0 microcontroller regulates the rotation of the stage, while MATLAB is used to process the captured images using the inverse Radon transform and to visualize the reconstructed volume with the Volumetric 3D MATLAB toolbox. Experimental results using a lemon slice demonstrate that the scanner can resolve internal features such as the peel, pulp, and seeds in both 2D and 3D renderings. This system offers a safe and affordable platform for demonstrating CT principles, with potential applications in education, industrial inspection, and visual computing.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12808011/pdf/
Citations: 0
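The scanner reconstructs slices with MATLAB's inverse Radon transform; the same principle can be shown in a self-contained way with a minimal filtered back-projection in plain NumPy (nearest-neighbor rotation, ramp filter). This is a didactic sketch, not the authors' code.

```python
import numpy as np

def rotate_nn(img, deg):
    """Rotate a square image about its center (nearest-neighbor sampling)."""
    n = img.shape[0]
    c = (n - 1) / 2.0
    t = np.deg2rad(deg)
    ys, xs = np.mgrid[0:n, 0:n]
    # inverse map: where does each output pixel come from in the source?
    x0 = np.cos(t) * (xs - c) + np.sin(t) * (ys - c) + c
    y0 = -np.sin(t) * (xs - c) + np.cos(t) * (ys - c) + c
    xi = np.clip(np.rint(x0).astype(int), 0, n - 1)
    yi = np.clip(np.rint(y0).astype(int), 0, n - 1)
    out = img[yi, xi].astype(float)
    out[(x0 < 0) | (x0 > n - 1) | (y0 < 0) | (y0 > n - 1)] = 0.0
    return out

def project(img, angles):
    """Parallel-beam sinogram: one column of row-sums per view angle."""
    return np.stack([rotate_nn(img, -a).sum(axis=1) for a in angles], axis=1)

def fbp(sinogram, angles):
    """Ramp-filter each projection in Fourier space, then back-project."""
    n = sinogram.shape[0]
    ramp = np.abs(np.fft.fftfreq(n))
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=0)
                                   * ramp[:, None], axis=0))
    recon = np.zeros((n, n))
    for k, a in enumerate(angles):
        # smear the filtered view along its beam direction and accumulate
        recon += rotate_nn(np.tile(filtered[:, k][:, None], (1, n)), a)
    return recon * np.pi / (2 * len(angles))

phantom = np.zeros((64, 64))
phantom[20:40, 25:45] = 1.0                 # a bright rectangle to image
angles = np.arange(0.0, 180.0, 2.0)         # 90 views over a half rotation
recon = fbp(project(phantom, angles), angles)
print(np.corrcoef(recon.ravel(), phantom.ravel())[0, 1])  # high correlation
```

The optical scanner substitutes camera rows of a backlit specimen for the simulated projections; the reconstruction step is otherwise identical.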
Artificial intelligence-aided assignment of journal submissions to associate editors: a feasibility study on IEEE Transactions on Medical Imaging
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 9(1):1 | Pub Date: 2026-01-12 | DOI: 10.1186/s42492-025-00212-y
Xuanang Xu, Joshua Yan, Gloria Nwachukwu, Hongming Shan, Uwe Kruger, Ge Wang

Abstract: Efficient and accurate assignment of journal submissions to suitable associate editors (AEs) is critical in maintaining review quality and timeliness, particularly in high-volume, rapidly evolving fields such as medical imaging. This study investigates the feasibility of leveraging large language models for AE-paper matching in IEEE Transactions on Medical Imaging. An AE database was curated from historical AE assignments and AE-authored publications, and six key textual components were extracted from each paper: the title, four categories of structured keywords, and the abstract. ModernBERT was employed locally to generate high-dimensional semantic embeddings, which were then reduced using principal component analysis (PCA) for efficient similarity computation. Keyword similarity, derived from structured domain-specific metadata, and textual similarity, derived from ModernBERT embeddings, were combined to rank the candidate AEs. Experiments on internal (historical assignments) and external (AE publications) test sets showed that keyword similarity is the dominant contributor to matching performance, while textual similarity offers complementary gains, particularly when PCA is applied. Ablation studies confirmed that structured keywords alone provide strong matching accuracy, with titles offering additional benefits and abstracts minimal improvement. The proposed approach offers a practical, interpretable, and scalable tool for editorial workflows, reduces manual workload, and supports high-quality peer review.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12791093/pdf/
Citations: 0
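The ranking step — structured-keyword similarity combined with PCA-reduced embedding similarity — can be sketched as follows. The Jaccard keyword score, the 0.7/0.3 weighting, and the random stand-in vectors are illustrative assumptions; the paper uses ModernBERT embeddings and its own similarity formulation.

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X onto their top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rank_aes(paper_vec, paper_kw, ae_vecs, ae_kws, w_kw=0.7):
    """Rank AE indices by a weighted mix of keyword and embedding similarity."""
    scores = []
    for vec, kws in zip(ae_vecs, ae_kws):
        kw_sim = len(paper_kw & kws) / max(1, len(paper_kw | kws))  # Jaccard
        scores.append(w_kw * kw_sim + (1 - w_kw) * cosine(paper_vec, vec))
    return np.argsort(scores)[::-1]          # best match first

rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 8))                # row 0: paper, rows 1-2: AEs
red = pca_reduce(emb, 2)                     # PCA before similarity, as in the paper
order = rank_aes(red[0], {"CT", "deep learning"}, red[1:],
                 [{"MRI"}, {"CT", "deep learning"}])
print(order[0])  # → 1 (keyword overlap dominates, matching the paper's finding)
```

With w_kw at 0.7, a full keyword match outweighs any possible embedding-similarity difference, mirroring the abstract's observation that keywords dominate and embeddings add complementary gains.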
Text-to-3D scene generation framework: bridging textual descriptions to high-fidelity 3D scenes
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 8(1):29 | Pub Date: 2025-12-18 | DOI: 10.1186/s42492-025-00210-0
Zuan Gu, Tianhan Gao, Huimin Liu

Abstract: Text-to-3D scene generation is pivotal for digital content creation; however, existing methods often struggle with global consistency across views. We present 3DS-Gen, a modular "generate-then-reconstruct" framework that first produces a temporally coherent multi-view video prior and then reconstructs consistent 3D scenes using sparse geometry estimation and Gaussian optimization. A cascaded variational autoencoder (2D for spatial compression, 3D for temporal compression) provides a compact, coherent latent sequence that facilitates robust reconstruction, and an adaptive density threshold improves detail allocation in the Gaussian stage under a fixed computational budget. While explicit meshes can be extracted from the optimized representation when needed, our claims emphasize multi-view consistency and reconstructability; mesh quality depends on the video prior and the chosen explicitification backend. 3DS-Gen runs on a single GPU and yields coherent scene reconstructions across diverse prompts, providing a practical bridge between text and 3D content creation.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12712286/pdf/
Citations: 0
Enhanced temporal encoding-decoding for survival analysis of multimodal clinical data in smart healthcare
IF 6.0 | Q4 (Computer Science)
Visual Computing for Industry, Biomedicine and Art, 8(1):28 | Pub Date: 2025-12-12 | DOI: 10.1186/s42492-025-00209-7
Xiaofeng Zhang, Zijie Pan, Yuhang Tian, Lili Wang, Tingting Xu, Li Chen, Xiangyun Liao, Tianyu Jiang

Abstract: Effective survival analysis is essential for identifying optimal preventive treatments within smart healthcare systems and leveraging digital health advancements; however, existing prediction models rely primarily on ensemble classification techniques with suboptimal performance in both target detection and predictive accuracy. To address these gaps, this paper proposes a multimodal framework that integrates enhanced facial feature detection and temporal predictive modeling. For facial feature extraction, the study develops a lightweight face-region convolutional neural network (FRegNet) specialized in detecting key facial components of clinical patients, such as the eyes and lips, incorporating a residual backbone (Rstem) to enhance feature representation and a facial path-aggregated feature pyramid network for multi-resolution feature fusion. Comparative experiments show that FRegNet outperforms state-of-the-art target detection algorithms, achieving an average precision (AP) of 0.922, average recall of 0.933, mean average precision (mAP) of 0.987, and precision of 0.98, significantly surpassing other mask region-based convolutional neural network (RCNN) variants such as mask RCNN-ResNeXt (AP 0.789, mAP 0.957). Based on the extracted facial features and clinical physiological indicators, the study further proposes an enhanced temporal encoding-decoding (ETED) model that integrates an adaptive attention mechanism and a gated weighting mechanism to improve predictive performance. The ETED variant incorporating facial features (ETEncoding-Decoding-Face) outperforms traditional models, achieving an accuracy of 0.916, precision of 0.850, recall of 0.895, F1 of 0.884, and area under the curve (AUC) of 0.947, surpassing gradient boosting (accuracy 0.922 but AUC of only 0.669) and other classifiers on comprehensive metrics. The results confirm that the multimodal dataset (facial features plus physiological indicators) significantly enhances prediction accuracy for patients' seven-day survival. Correlation analysis reveals that chronic health evaluation and mean arterial pressure are positively correlated with survival, whereas temperature, Glasgow Coma Scale, and fibrinogen are negatively correlated.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12701133/pdf/
Citations: 0