IEEE Journal of Biomedical and Health Informatics: Latest Articles

Towards Reliable Prediction: A Bayesian Continual Learning Approach for Clinical Time-series Data.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-09-01 DOI: 10.1109/JBHI.2025.3598718
Cao Zhen, Jeanette Poh Wen Jun, Yang Guo, Chandan Gautam, Mila Nambiar, Sing Yi Chia, Nur Nasyitah Mohamed Salim, Sheldon Lee, Hong Choon Oh, Yong Mong Bee, Pavitra Krishnaswamy, Savitha Ramasamy
Abstract: Deep learning models are increasingly used for making predictions based on clinical time series data, but model generalization remains a challenge. Continual learning approaches, which preserve representations while learning new distributions, are suitable for addressing this challenge. We propose Continual Bayesian Long Short Term Memory (C-BLSTM), a continual learning algorithm based on the Bayesian LSTM model for domain incremental learning. C-BLSTM continually learns a sequence of tasks by combining architectural pruning, variational inference-based regularization, and coreset replay strategies. In extensive experiments on two public electronic medical record datasets for mortality prediction, we show that C-BLSTM outperforms many state-of-the-art continual learning approaches. Further, we apply the C-BLSTM to two real-world clinical time series datasets for prediction of readmission risk in patients with heart failure and glycated haemoglobin outcomes in patients with type 2 diabetes. First, we show that these datasets exhibit domain incremental characteristics with significant drifts in their marginal distributions and moderate drifts in their conditional distributions. Then, we demonstrate that the C-BLSTM improves generalization in five diverse real-world scenarios spanning temporal, site, device, case mix, and ethnicity shifts, both in terms of performance and reliability of predictions.
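In Bayesian continual learning, variational inference-based regularization is often realized as a Kullback-Leibler penalty that keeps the current weight posterior close to the posterior learned on earlier tasks. The following is a minimal sketch of that penalty for factorized Gaussian posteriors. It is an assumption that C-BLSTM uses this exact form; the abstract does not say, and the function names `gaussian_kl` and `kl_penalty` are hypothetical.

```python
import math


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between two univariate Gaussians N(mu, sigma^2)."""
    return (
        math.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    )


def kl_penalty(posterior, previous_posterior):
    """Sum per-weight KL terms for a factorized (mean-field) posterior.

    Both arguments are lists of (mu, sigma) pairs, one pair per weight.
    This layout is an illustrative assumption, not the paper's data structure.
    """
    return sum(
        gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)
        for (mu_q, sigma_q), (mu_p, sigma_p) in zip(posterior, previous_posterior)
    )
```

The penalty is zero when the new posterior equals the old one and grows as the means drift or the variances change, which is what discourages forgetting of earlier tasks.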
Citations: 0
Continuous Cuffless Blood Pressure Estimation via Effective and Efficient Broad Learning Model.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-09-01 DOI: 10.1109/JBHI.2025.3604464
Chunlin Zhang, Pingyu Hu, Zhan Shen, Xiaorong Ding
Abstract: Hypertension is a major risk factor for cardiovascular diseases and all-cause mortality, making accessible and easy blood pressure (BP) measurement, such as cuffless methods, crucial for its prevention, detection, and management. Cuffless BP estimation using wearable cardiovascular signals via deep learning models (DLMs) offers a promising solution. However, implementation of the DLMs usually requires high computational cost and time. This study addresses these challenges by offering an end-to-end broad learning model (BLM) for effective and efficient cuffless BP estimation. The BLM increases network width rather than depth compared to DLMs, reducing computational complexity and improving training efficiency for continuous BP estimation. We also explore an incremental learning mode that provides high memory efficiency and flexibility. Validation of the proposed method on the University of California Irvine (UCI) database, which spanned 403.67 hours, demonstrated that the standard BLM (SBLM) achieves a mean absolute error (MAE) of 11.72 mmHg for the estimation of the arterial BP (ABP) waveform, which was comparable to the performance of DLMs, such as long short-term memory (LSTM) and the one-dimensional convolutional neural network (1D-CNN), while significantly improving training efficiency by a factor of 25.20. Furthermore, incremental BLM (IBLM) provides a horizontal scalability approach for incremental learning, which expands the model by adding nodes in a single layer rather than increasing the number of layers, effectively updating the model while maintaining comparable predictive performance. This approach reduces storage demands by supporting incremental learning with streaming or partial datasets. In addition, the MAE (mean error (ME) ± standard deviation (SD)) values of the SBLM for predicting systolic BP (SBP) and diastolic BP (DBP) were 3.04 mmHg (2.85 ± 4.15 mmHg) and 2.57 mmHg (-2.47 ± 3.03 mmHg), respectively. This study highlights the potential of BLM for personalized, real-time, and continuous cuffless BP monitoring, offering a practical solution for healthcare applications.
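The abstract reports errors as MAE (ME ± SD). As a minimal sketch of how those three summary statistics relate, the Python function below computes them from paired predictions and reference readings. The function name `error_summary`, the sample values in the usage lines, and the choice of the sample (n - 1) standard deviation are assumptions for illustration; the abstract does not say which SD estimator the authors used.

```python
import math


def error_summary(predicted, reference):
    """Return (MAE, ME, SD) of the errors predicted - reference, in input units."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    errors = [p - r for p, r in zip(predicted, reference)]
    n = len(errors)
    mean_error = sum(errors) / n                    # signed bias (ME)
    mae = sum(abs(e) for e in errors) / n           # mean absolute error (MAE)
    # Sample standard deviation of the errors (assumed estimator).
    sd = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1))
    return mae, mean_error, sd


# Hypothetical systolic readings in mmHg, not data from the paper.
mae, me, sd = error_summary([120, 118, 125], [118, 121, 124])
print(f"MAE={mae:.2f} mmHg, ME={me:.2f} mmHg, SD={sd:.2f} mmHg")
```

When the magnitude of ME is close to the MAE, as in the DBP figures quoted in the abstract, most individual errors share the same sign, which points to a systematic offset rather than scatter around zero.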
Citations: 0
Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-09-01 DOI: 10.1109/JBHI.2025.3604595
Yu Xin, Gorkem Can Ates, Kuang Gong, Wei Shao
Abstract: Vision-language models (VLMs) have shown promise in 2D medical image analysis, but extending them to 3D remains challenging due to the high computational demands of volumetric data and the difficulty of aligning 3D spatial features with clinical text. We present Med3DVLM, a 3D VLM designed to address these challenges through three key innovations: (1) DCFormer, an efficient encoder that uses decomposed 3D convolutions to capture fine-grained spatial features at scale; (2) SigLIP, a contrastive learning strategy with pairwise sigmoid loss that improves image-text alignment without relying on large negative batches; and (3) a dual-stream MLP-Mixer projector that fuses low- and high-level image features with text embeddings for richer multi-modal representations. We evaluated our model on the M3D dataset, which includes radiology reports and VQA data for 120,084 3D medical images. The results show that Med3DVLM achieves superior performance on multiple benchmarks. For image-text retrieval, it reaches 61.00% R@1 on 2,000 samples, significantly outperforming the current state-of-the-art M3D-LaMed model (19.10%). For report generation, it achieves a METEOR score of 36.42% (vs. 14.38%). In open-ended visual question answering (VQA), it scores 36.76% METEOR (vs. 33.58%), and in closed-ended VQA, it achieves 79.95% accuracy (vs. 75.78%). These results demonstrate Med3DVLM's ability to bridge the gap between 3D imaging and language, enabling scalable, multi-task reasoning across clinical applications. Our code is publicly available at https://github.com/mirthAI/Med3DVLM.
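The pairwise sigmoid loss named in the abstract treats every image-text pair in a batch as an independent binary decision: matched pairs are positives, all other pairings are negatives. The sketch below illustrates that idea in plain Python. It is not taken from the Med3DVLM repository; the function names, the assumption of L2-normalized embeddings, and the fixed `temperature` and `bias` values (learnable in practice) are illustrative assumptions.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def pairwise_sigmoid_loss(image_emb, text_emb, temperature=10.0, bias=-10.0):
    """Pairwise sigmoid contrastive loss over a batch of paired embeddings.

    image_emb[i] and text_emb[i] are assumed to be L2-normalized vectors
    describing the same sample. temperature and bias are fixed here for
    illustration only.
    """
    n = len(image_emb)
    total = 0.0
    for i in range(n):
        for j in range(n):
            similarity = sum(a * b for a, b in zip(image_emb[i], text_emb[j]))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0  # +1 for matched pair, -1 otherwise
            total += log_sigmoid(label * logit)
    return -total / n


# Toy batch: correctly paired embeddings score a lower loss than swapped ones.
images = [[1.0, 0.0], [0.0, 1.0]]
print(pairwise_sigmoid_loss(images, [[1.0, 0.0], [0.0, 1.0]]))
print(pairwise_sigmoid_loss(images, [[0.0, 1.0], [1.0, 0.0]]))
```

Because each pair contributes its own sigmoid term, the loss needs no softmax normalization across the batch, which is why it does not depend on large numbers of negatives.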
Citations: 0
A novel dual-attention deep neural network with multi-scale fusion feature processing for predicting transcription factor binding sites.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-09-01 DOI: 10.1109/JBHI.2025.3604625
Yuechuan Dai, Xianjun Shen, Weizhong Zhao, Xiaohua Hu
Abstract: Transcription is a pivotal process in cell biology and requires the binding of transcription factors (TFs) to transcription factor binding sites (TFBSs) on the DNA. Accurate prediction of TFBSs offers great potential for regulating the expression of genes of interest, which can facilitate the exploration of new drugs and treatments for diseases. Although many deep learning-based models have been proposed for predicting TFBSs, existing models still have problems, including the use of convolutional processing of DNA sequences that loses information about the DNA double helix structure and fails to adequately account for the stereoscopic structure of DNA shape data in three dimensions. In this paper, we propose a novel model called DeepCTMS, in which both sequence features and shape features of DNA slices are effectively fused to derive high-quality representations for the task of TFBS prediction. A sequence feature processing module is first used to extract the DNA double helix structure features of DNA slices. The three-dimensional features of DNA shape data are extracted by employing a convolutional triple attention (CTA) module for the shape data of a DNA slice. Finally, a multi-scale fusion feature processing (MSFFP) module is used to fuse sequence features and shape features to obtain representations with significantly aligned semantics of both features. Ablation experiments, t-SNE visual analysis, and cross-cell line validation results demonstrate that DeepCTMS consistently outperforms benchmark models on prediction performance and generalization ability on 165 ChIP-seq datasets.
Citations: 0
Automated Pediatric Delirium Recognition via Deep Learning-Powered Video Analysis.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-09-01 DOI: 10.1109/JBHI.2025.3604448
Jiarong Chen, Suqin Xia, Wenqi Shi, Yemin Gong, Yali Huang, Lixiang Gu, Xiaoyu Lin, Haibao Chen, Guoxing Wang, Cheng Chen, Liebin Zhao, Wenyi Luo
Abstract: Delirium is an acute, fluctuating state of consciousness disturbance characterized by cognitive alterations and perceptual disturbances. Pediatric delirium has a notably higher incidence rate than adult delirium, and it is time-consuming and labor-intensive for clinicians to analyze, requiring effective recognition approaches. Deep learning has shown potential for the extraction of robust representations and improvement of patient outcomes. In this study, 129 video samples labeled by professional clinicians were collected from multiple hospitals, including 74 non-delirium and 55 delirium labeled samples. An 18-layer deep spatiotemporal convolutional neural network is employed, in which two-dimensional and one-dimensional convolutional filters are applied to individual video frames to extract frame-level and inter-frame-level features, respectively. The entire architecture is pretrained on a large-scale video analysis dataset, and a three-layer fully connected classification head is integrated for the delirium recognition task. The proposed model was fine-tuned with a training dataset and evaluated on a testing dataset, exploring various models and strategies. The proposed algorithm demonstrated robust classification performance, achieving an accuracy of 0.8718, precision of 0.8711, recall of 0.8730, and F1-score of 0.8715, with approximately 31.54 million model parameters. These metric results validate the clinical applicability and technical reliability of the model under various training and testing strategies. In addition, the developed delirium classification model is deployed in a hospital system to enable intelligent video diagnosis. The independent test accuracy for 100 newly collected samples is 0.8800. Therefore, the proposed algorithm enables new methods for pediatric delirium recognition and treatment.
Citations: 0
Synthesizing ECG from BCG: a Physiological Semantics Enhanced Multiband Diffusion Generative Approach.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-09-01 DOI: 10.1109/JBHI.2025.3604560
Jiafeng Qiu, Qinghua Zhang, Gang Shen
Abstract: The ballistocardiogram (BCG) is an unobtrusive measurement that shows promise for long-term, home-based cardiovascular monitoring and early disease screening. However, the lack of standardized clinical interpretations for BCG waveforms, compared to electrocardiogram (ECG) signals, limits its direct application in diagnostic decision-making. Although synthesizing ECG from BCG provides a viable solution, the significant differences in semantic density and spectral distribution between the two types of signals pose challenges to this process. Here, we propose physiological semantics-enhanced multiband diffusion (PSEM-Diff), a novel method using physiological semantics alignment and a diffusion model to achieve precise translation from BCG to ECG signals. The PSEM-Diff model integrates prior knowledge of ECG physiological semantics (including the temporal characteristics of P-QRS-T waveform sequences and the correlation between J waves and R waves) into the BCG pre-encoding through attention distillation and adopts decoupled multiband diffusion to preserve precise waveform details across different bands of the ECG. We validated the proposed PSEM-Diff using datasets that included healthy individuals and patients with several cardiovascular diseases. The experimental results show that the synthesized ECG has a higher fidelity to the ground truth than other state-of-the-art methods. Further, detection of atrial fibrillation (AFib) and other arrhythmias indicates a diagnostic consistency with the ground-truth ECG, demonstrating the potential of PSEM-Diff for cardiovascular monitoring and telemedicine applications.
Citations: 0
Medical Image Privacy in Federated Learning: Segmentation-Reorganization and Sparsified Gradient Matching Attacks.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-09-01 DOI: 10.1109/JBHI.2025.3593631
Kaimin Wei, Jin Qian, Chengkun Jia, Jinpeng Chen, Jilian Zhang, Yongdong Wu, Jinyu Zhu, Yuhan Guo
Abstract: In modern medicine, the widespread use of medical imaging has greatly improved diagnostic and treatment efficiency. However, these images contain sensitive personal information, and any leakage could seriously compromise patient privacy, leading to ethical and legal issues. Federated learning (FL), an emerging privacy-preserving technique, transmits gradients rather than raw data for model training. Yet, recent studies reveal that gradient inversion attacks can exploit this information to reconstruct private data, posing a significant threat to FL. Current attacks remain limited in image resolution, similarity, and batch processing, and thus do not yet pose a significant risk to FL. To address this, we propose a novel gradient inversion attack based on sparsified gradient matching and segmentation reorganization (SR) to reconstruct high-resolution, high-similarity medical images in batch mode. Specifically, an $L_{1}$ loss function optimises the gradient sparsification process, while the SR strategy enhances image resolution. An adaptive learning rate adjustment mechanism is also employed to improve optimisation stability and avoid local optima. Experimental results demonstrate that our method significantly outperforms state-of-the-art approaches in both visual quality and quantitative metrics, achieving up to a 146% improvement in similarity.
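To make the idea of gradient matching concrete, the sketch below evaluates an $L_{1}$ matching objective on a toy linear model: a candidate input is scored by how closely its gradient agrees with an observed gradient on the largest-magnitude entries. This is only an illustration of the general principle. How the paper combines sparsification with the $L_{1}$ loss is not described in the abstract, so the top-k masking rule, the linear model, and all function names here are assumptions.

```python
def linear_gradient(weights, x, y):
    """Gradient of 0.5 * (w . x - y)^2 with respect to the weights w."""
    residual = sum(w * xi for w, xi in zip(weights, x)) - y
    return [residual * xi for xi in x]


def top_k_indices(gradient, k):
    """Indices of the k largest-magnitude gradient entries (assumed sparsifier)."""
    order = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    return set(order[:k])


def sparsified_l1_matching_loss(weights, candidate_x, y, observed_gradient, k):
    """L1 distance between observed and candidate gradients on the kept entries."""
    kept = top_k_indices(observed_gradient, k)
    candidate_gradient = linear_gradient(weights, candidate_x, y)
    return sum(abs(observed_gradient[i] - candidate_gradient[i]) for i in kept)


# Toy setup with made-up numbers: the true input reproduces the observed gradient.
weights = [0.5, -1.0, 2.0]
true_x, label = [1.0, 2.0, 3.0], 1.0
observed = linear_gradient(weights, true_x, label)
print(sparsified_l1_matching_loss(weights, true_x, label, observed, k=2))
print(sparsified_l1_matching_loss(weights, [0.0, 0.0, 0.0], label, observed, k=2))
```

An attack of this family would lower such an objective by adjusting the candidate input; the same quantity is also useful to defenders as a measure of how much information a shared gradient carries.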
Citations: 0
Dynamic theta-alpha inter-brain model during mother-preschooler cooperation.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-08-29 DOI: 10.1109/JBHI.2025.3603544
Jiayang Xu, Yamin Li, Ruxin Su, Saishuang Wu, Chengcheng Wu, Haiwa Wang, Qi Zhu, Yue Fang, Fan Jiang, Shanbao Tong, Yunting Zhang, Xiaoli Guo
Abstract: The interaction between mothers and young children is a highly dynamic process neurally characterized by inter-brain synchrony (IBS) at θ and/or α rhythms. However, their establishment, dynamic changes, and roles in mother-child interactions remain unknown. In this study, through a simultaneous dynamic analysis of inter-brain EEG synchrony, intra-brain EEG power, and interactive behaviors from 40 mother-preschooler dyads during turn-taking cooperation, we constructed a dynamic inter-brain model in which θ-IBS and α-IBS alternate with interactive behaviors, with EEG frequency-shift as a prerequisite for IBS transitions. When mothers attempt to track their children's attention and/or predict their intentions, they adjust their EEG frequencies to align with their children's θ oscillations, leading to a higher occurrence of the θ-IBS state. Conversely, the α-IBS state, accompanied by the EEG frequency-shift to the α range, is more prominent during mother-led interactions. Further exploratory analysis reveals greater presence and stability of the θ-IBS state during cooperative than non-cooperative conditions, particularly in dyads with stronger emotional attachments and more frequent interactions in their daily lives. Our findings shed light on the neural oscillatory substrates underlying the IBS dynamics during mother-preschooler interactions.
Citations: 0
Flow Matching-Based Data Synthesis for Robust Anatomical Landmark Localization.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-08-29 DOI: 10.1109/JBHI.2025.3603907
Arnela Hadzic, Lea Bogensperger, Andrea Berghold, Martin Urschler
Abstract: Anatomical landmark localization (ALL) plays a crucial role in medical imaging for applications such as therapy planning and surgical interventions. State-of-the-art deep learning methods for ALL are often trained on small datasets due to the scarcity of large, annotated medical data. This constraint often leads to overfitting on the training dataset, which in turn reduces the model's ability to generalize to unseen data. To address these challenges, we propose a multi-channel generative approach utilizing Flow Matching to synthesize diverse annotated images for data augmentation in ALL tasks. Each synthetically generated sample consists of a medical image paired with a multi-channel heatmap that encodes its landmark configuration, from which the corresponding landmark annotations can be derived. We assess the quality of synthetic image-heatmap pairs automatically using a Statistical Shape Model to evaluate landmark plausibility and compute the Fréchet Inception Distance score to quantify image quality. Our results show that pairs synthesized via Flow Matching exhibit superior quality and diversity compared with those generated by other state-of-the-art generative models like Generative Adversarial Networks or diffusion models. Furthermore, we investigate the effect of integrating synthetic data into the training process of an ALL network. In our experiments, the ALL network trained with Flow Matching-generated data demonstrates improved robustness, particularly in scenarios with limited training data or occlusions, compared with baselines that utilize solely real images or synthetic data from alternative generative models.
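The abstract describes pairing each image with a multi-channel heatmap from which landmark annotations can be derived. A common way to do this is one Gaussian blob per landmark, one channel per landmark, with coordinates recovered as the per-channel maximum. The sketch below shows that round trip. The Gaussian shape, the `sigma` value, the argmax decoding, and the function names are assumptions for illustration; the abstract does not specify the authors' encoding.

```python
import math


def landmark_heatmap(height, width, landmark, sigma=1.5):
    """One heatmap channel: a Gaussian centred on the (row, col) landmark."""
    row0, col0 = landmark
    return [
        [
            math.exp(-((r - row0) ** 2 + (c - col0) ** 2) / (2.0 * sigma ** 2))
            for c in range(width)
        ]
        for r in range(height)
    ]


def encode_landmarks(height, width, landmarks, sigma=1.5):
    """Stack one heatmap channel per landmark."""
    return [landmark_heatmap(height, width, lm, sigma) for lm in landmarks]


def decode_landmarks(heatmaps):
    """Recover each landmark as the (row, col) of its channel's maximum."""
    coords = []
    for channel in heatmaps:
        positions = (
            (r, c) for r in range(len(channel)) for c in range(len(channel[0]))
        )
        coords.append(max(positions, key=lambda rc: channel[rc[0]][rc[1]]))
    return coords


# Hypothetical landmark positions on an 8 x 8 grid.
points = [(2, 3), (5, 1)]
print(decode_landmarks(encode_landmarks(8, 8, points)))
```

Keeping landmarks in heatmap form lets a generative model produce image and annotation jointly as a single multi-channel array, with the coordinates extracted afterwards.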
Citations: 0
Multi-Scale Temporal-Frequency Attention Network Based on Ocular Imaging for Depression Detection.
IF 6.8, Zone 2 (Medicine)
IEEE Journal of Biomedical and Health Informatics Pub Date : 2025-08-29 DOI: 10.1109/JBHI.2025.3604064
Ziru Weng, Zilin Guo, Yujie Gao, Weihao Zheng, Yongfeng Tao, Bin Hu, Minqiang Yang
Abstract: Depression is a common and serious mental disorder, characterized by persistent low mood, loss of interest, cognitive dysfunction, and physiological changes. Patients may experience symptoms such as sleep disturbances, changes in appetite, fatigue, and low self-esteem, with severe cases potentially leading to suicidal behavior. Patients with depression differ from healthy controls in emotional processing and attention allocation, and eye movement characteristics such as fixation patterns, saccade amplitude, and attentional bias have been used as physiological signals for depression detection. Many researchers have developed depression recognition models based on ocular imaging. However, convolutional neural networks, which utilize local receptive fields, can only capture local features in ocular imaging. This paper proposes the Multi-Scale Temporal-Frequency Attention Network (MTFNet), which integrates multi-scale time-frequency domain attention into the Video Swin Transformer. Through the Multi-Scale Temporal-Frequency Attention Module (MTFAM), MTFNet learns the most important regions in eye movement images, enabling it to capture features more effectively from sequential data and gain a deeper understanding of the structure within eye movement images. Experimental results show that the proposed method achieves an accuracy of 76.8% on a self-collected eye movement image dataset, outperforming most models. This work provides a novel approach to research on depression recognition based on eye movement images.
Citations: 0