Interictal Epileptiform Discharge Detection using Dual-domain Features and GAN
Wenhao Rao, Jiayang Guo, Chunran Zhu, Meiyan Xu, Naian Xiao, Yijie Pan, Ling Zhang, Xiaowen Ye, Jun Jiang, Xiaolu Wang, Peipei Gu, Duo Chen
IEEE Journal of Biomedical and Health Informatics, published 2025-09-05. DOI: 10.1109/JBHI.2025.3605257

Interictal epileptiform discharges (IEDs) are essential markers for identifying epilepsy. However, the unpredictable and non-stationary nature of electroencephalogram (EEG) patterns poses considerable challenges for reliable identification, and manual interpretation of EEG is subjective and time-consuming. With advances in machine learning and deep learning, computer-aided approaches for automated IED detection have developed rapidly. State-of-the-art convolutional neural network (CNN)-based methods show promising results but struggle to capture long-term dependencies in time-series data. In contrast, the Transformer excels at modeling sequential information through self-attention, overcoming this limitation of CNNs. This study proposes an IED Detector (IEDD) that integrates convolutional layers and a Transformer: convolutional layers first extract local IED features, and a Transformer then models long-term dependencies. To further extract spatial features, the EEG data are represented as a three-dimensional tensor with embedded channel topology, where a CNN captures spatial features at each sampling point and a Long Short-Term Memory (LSTM) network models their temporal evolution. Additionally, because IED data are scarce, a novel Transformer-based Generative Adversarial Network (GAN) is developed to augment the IED dataset. Experimental results show that the proposed approach achieves an average accuracy of 96.11% on the augmented Dataset 1 and 95.25% on Dataset 2 for binary classification, with an average sensitivity of 87.26% and precision of 89.96% for multi-label classification. These findings provide valuable insights into advancing deep learning and Transformer-based approaches for automated IED detection.

SegTom: A 3D Volumetric Medical Image Segmentation Framework for Thoracoabdominal Multi-Organ Anatomical Structures
Yan Pang, Yunhao Li, Jiaming Liang, Hao Chen, Ying Hu, Qiong Wang
IEEE Journal of Biomedical and Health Informatics, published 2025-09-05. DOI: 10.1109/JBHI.2025.3606266

Accurate segmentation of thoracoabdominal anatomical structures in three-dimensional medical imaging modalities is fundamental for informed clinical decision-making across a wide array of medical disciplines. Current approaches often struggle to efficiently and comprehensively process this region's intricate and heterogeneous anatomical information, leading to suboptimal outcomes in diagnosis, treatment planning, and disease management. To address this challenge, we introduce SegTom, a novel volumetric segmentation framework equipped with a SegTom Block engineered to capture the complex anatomical representations inherent to the thoracoabdominal region. The SegTom Block incorporates a hierarchical anatomical-representation decomposition that facilitates efficient information exchange by decomposing the computationally intensive self-attention mechanism and cost-effectively aggregating the extracted representations. Rigorous validation of SegTom across nine diverse datasets, encompassing both computed tomography (CT) and magnetic resonance imaging (MRI) modalities, consistently demonstrates high performance across a broad spectrum of anatomical structures. Specifically, SegTom achieves a mean Dice similarity coefficient (DSC) of 87.29% for cardiac segmentation on the MM-WHS MRI dataset, 83.48% for multi-organ segmentation on the BTCV abdominal CT dataset, and 92.01% for airway segmentation on a dedicated CT dataset. Code: https://github.com/deepang-ai/SegTom.

3D foot kinetics estimation from distributed vGRF from smart insoles via 1D domain transformation
Sakib Mahmud, Muhammad E H Chowdhury, Faycal Bensaali
IEEE Journal of Biomedical and Health Informatics, published 2025-09-05. DOI: 10.1109/JBHI.2025.3605296

Understanding foot kinetics is fundamental to analyzing human locomotion, offering critical insights into the mechanical loads exerted on the feet. While vertical ground reaction force (vGRF) is widely used in biomechanics research, comprehensive 3D kinetic measurements, including ground reaction force (GRF), ground reaction moment (GRM), and center of pressure (CoP) along the anterior-posterior and medial-lateral axes, provide deeper insights for various applications. Smart insoles, though portable, cost-effective, and user-friendly, primarily capture vGRF and often generate lower-quality data than force plates and instrumented treadmills. This study leverages deep learning-based domain transformation to generate instrumented treadmill-level 3D-GRF&M-CoP from distributed vGRF signals recorded by smart insoles in healthy subjects. Additionally, a multi-segment analysis is performed to identify the most relevant plantar regions for each kinetic parameter. The proposed approach is rigorously evaluated against treadmill data and benchmarked against state-of-the-art methods, accounting for subject variations and walking speeds. Key contributions include: (1) transforming distributed vGRF into 3D-GRF&M-CoP using 1D sequence-to-sequence (1D-s2s) models, (2) enhancing insole vGRF to treadmill quality, (3) optimizing the insole pressure sensor layout for efficient 3D kinetics estimation, and (4) introducing Ke2KeNet, a novel deep learning model that outperforms current 1D-s2s benchmarks.

Revolutionizing Wearable Sensor Data Analysis With an Automated Decision-Making Model for Enhanced Human Activity Detection
Nitesh Bharot, Priyanka Verma, Ankit Vidyarthi, Deepak Gupta, John G Breslin
IEEE Journal of Biomedical and Health Informatics, published 2025-09-02. DOI: 10.1109/JBHI.2025.3604710

Human Activity Recognition (HAR) is a crucial technology, with applications ranging from healthcare monitoring to sports analytics. However, the traditional approach to HAR is often time-consuming and susceptible to human error due to the complexity of processing diverse sensor data. Recognizing the need for efficiency and accuracy in HAR systems, we propose an Automated Decision-maker (ADM) system. This system automates HAR pipelines, addressing the challenges posed by the large volume of sensor data. By harnessing automation, ADM significantly streamlines the HAR process, reducing the time required for hyperparameter tuning and minimizing the risk of human error. Results from the proposed ADM system demonstrate notable improvements in HAR performance, achieving 96.436% accuracy on the UCI-HAR dataset and 99.783% on the PAMAP2 dataset. Moreover, ADM offers an innovative approach that optimizes HAR systems while establishing a foundation for building robust and reliable systems in complex environments.

Integrating GANs, Contrastive Learning, and Transformers for Robust Medical Image Analysis
Yang Heng, Fiaz Gul Khan, Ma Yinghua, Ahmad Khan, Farman Ali, Nasrullah Khan, Daehan Kwak
IEEE Journal of Biomedical and Health Informatics, published 2025-09-02. DOI: 10.1109/JBHI.2025.3604845

Despite the widespread success of convolutional neural networks (CNNs) in general computer vision tasks, their application to complex medical image analysis faces persistent challenges. These include limited labeled data availability, which restricts model generalization; class imbalance, where minority classes are underrepresented and lead to biased predictions; and inadequate feature representation, since conventional CNNs often struggle to capture the subtle patterns and intricate dependencies characteristic of medical imaging. To address these limitations, we propose CTNGAN, a unified framework that integrates generative modeling with Generative Adversarial Networks (GANs), contrastive learning, and Transformer architectures to enhance the robustness and accuracy of medical image analysis. Each component is designed to tackle a specific challenge: the GAN mitigates data scarcity and imbalance, contrastive learning strengthens feature robustness against domain shifts, and the Transformer captures long-range spatial patterns. This tripartite integration not only overcomes the limitations of conventional CNNs but also achieves superior generalizability, as demonstrated by classification experiments on benchmark medical imaging datasets, with up to 98.5% accuracy and an F1-score of 0.968, outperforming existing methods. The framework's ability to jointly optimize data generation, feature discrimination, and contextual modeling establishes a new paradigm for accurate and reliable medical image diagnosis.
{"title":"FIGNet: A Robust and Interpretable Fuzzy-Irreversible Gated Network for Auditory Brainstem Response Classification.","authors":"Ke Zhang, Chunrui Zhao, Zenan Li, Caiwei Li, Desheng Jia, Yongchao Chen, Shang Yan, Xin Wang, Yishu Teng, Hongguang Pan, Shixiong Chen","doi":"10.1109/JBHI.2025.3604834","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3604834","url":null,"abstract":"<p><p>Auditory brainstem response (ABR) is an important tool for newborn hearing screening and neurological assessment. However, its signals are often difficult to be accurately resolved due to noise interference and weak waveforms, and the need for repeated measurements under multiple sound intensity conditions results in time-consuming data acquisition. Therefore, there is an urgent need to develop an automatic classification model with high accuracy, robustness and good interpretability to achieve stable and effective recognition performance with minimal ABR data. This study presents FIGNet, a new deep learning model that combines type-2 fuzzy logic with a time-irreversible attention mechanism to address uncertainty and temporal direction in ABR signals. Fuzzy attention helps reduce the impact of noise, while the irreversible attention models the one-way nature of neural responses. Experiments on real ABR datasets show that FIGNet outperforms existing models in both binary and five-class classification tasks. It achieves 93.72% accuracy in binary classification and 84.42% accuracy in five-class classification. Visualization results-including confusion matrices, and accuracy curves under different noise levels-further confirm that FIGNet can focus on key waveform areas and stay reliable even in noisy conditions. These findings demonstrate that FIGNet offers fast, interpretable, and robust performance for clinical ABR analysis, achieving high classification accuracy under both clean and noisy conditions.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144952106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-supervised Guided Modality Disentangled Representation Learning for Multimodal Sentiment Analysis and Schizophrenia Assessment.","authors":"Hsin-Yang Chang, An-Sheng Liu, Yi-Ting Lin, Chen-Chung Liu, Lue-En Lee, Feng-Yi Chen, Shu-Hui Hung, Li-Chen Fu","doi":"10.1109/JBHI.2025.3604933","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3604933","url":null,"abstract":"<p><p>As the impact of chronic mental disorders increases, multimodal sentiment analysis (MSA) has emerged to improve diagnosis and treatment. In this paper, our approach leverages disentangled representation learning to address modality heterogeneity with self-supervised learning as a guidance. The self-supervised learning is proposed to generate pseudo unimodal labels and guide modality-specific representation learning, preventing the acquisition of meaningless features. Additionally, we also propose a text-centric fusion to effectively mitigate the impacts of noise and redundant information and fuse the acquired disentangled representations into a comprehensive multimodal representation. We evaluate our model on three publicly available benchmark datasets for multimodal sentiment analysis and a privately collected dataset focusing on schizophrenia counseling. The experimental results demonstrate state-of-the-art performance across various metrics on the benchmark datasets, surpassing related works. Furthermore, our learning algorithm shows promising performance in real-world applications, outperforming our previous work and achieving significant progress in schizophrenia assessment.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144952157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis.","authors":"Yu Xin, Gorkem Can Ates, Kuang Gong, Wei Shao","doi":"10.1109/JBHI.2025.3604595","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3604595","url":null,"abstract":"<p><p>Vision-language models (VLMs) have shown promise in 2D medical image analysis, but extending them to 3D remains challenging due to the high computational demands of volumetric data and the difficulty of aligning 3D spatial features with clinical text. We present Med3DVLM, a 3D VLM designed to address these challenges through three key innovations: (1) DCFormer, an efficient encoder that uses decomposed 3D convolutions to capture fine-grained spatial features at scale; (2) SigLIP, a contrastive learning strategy with pairwise sigmoid loss that improves image-text alignment without relying on large negative batches; and (3) a dual-stream MLP-Mixer projector that fuses low- and high-level image features with text embeddings for richer multi-modal representations. We evaluated our model on the M3D dataset, which includes radiology reports and VQA data for 120,084 3D medical images. The results show that Med3DVLM achieves superior performance on multiple benchmarks. For image-text retrieval, it reaches 61.00% R@1 on 2,000 samples, significantly outperforming the current state-of-the-art M3D-LaMed model (19.10%). For report generation, it achieves a METEOR score of 36.42% (vs. 14.38%). In open-ended visual question answering (VQA), it scores 36.76% METEOR (vs. 33.58%), and in closed-ended VQA, it achieves 79.95% accuracy (vs. 75.78%). These results demonstrate Med3DVLM's ability to bridge the gap between 3D imaging and language, enabling scalable, multi-task reasoning across clinical applications. Our code is publicly available at https://github.com/mirthAI/Med3DVLM.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144952172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic theta-alpha inter-brain model during mother-preschooler cooperation.","authors":"Jiayang Xu, Yamin Li, Ruxin Su, Saishuang Wu, Chengcheng Wu, Haiwa Wang, Qi Zhu, Yue Fang, Fan Jiang, Shanbao Tong, Yunting Zhang, Xiaoli Guo","doi":"10.1109/JBHI.2025.3603544","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3603544","url":null,"abstract":"<p><p>The interaction between mothers and young children is a highly dynamic process neurally characterized by inter-brain synchrony (IBS) at θ and/or α rhythms. However, their establishment, dynamic changes, and roles in mother-child interactions remain unknown. In this study, through a simultaneous dynamic analysis of inter-brain EEG synchrony, intra-brain EEG power, and interactive behaviors from 40 mother-preschooler dyads during turn-taking cooperation, we constructed a dynamic inter-brain model that θ-IBS and α-IBS alternated with interactive behaviors, with EEG frequency-shift as a prerequisite for IBS transitions. When mothers attempt to track their children's attention and/or predict their intentions, they will adjust their EEG frequencies to align with their children's θ oscillations, leading to a higher occurrence of the θ-IBS state. Conversely, the α-IBS state, accompanied by the EEG frequency-shift to the α range, is more prominent during mother-led interactions. Further exploratory analysis reveals greater presence and stability of the θ-IBS state during cooperative than non-cooperative conditions, particularly in dyads with stronger emotional attachments and more frequent interactions in their daily lives. Our findings shed light on the neural oscillational substrates underlying the IBS dynamics during mother-preschooler interactions.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2025-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144952115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Unleashing the Power of Pretrained Transformer for Dense Prediction in Physiological Signals
Qihan Hu, Daomiao Wang, Hong Wu, Jian Liu, Cuiwei Yang
IEEE Journal of Biomedical and Health Informatics, published 2025-08-28. DOI: 10.1109/JBHI.2025.3592687

Physiological signals obtained from advanced sensors, combined with deep learning techniques for classification and regression tasks, have become a core driving force in smart healthcare. Recently, dense prediction tasks for physiological signals, which aim to generate predictions closely aligned with the input signal for fine-grained analysis, have garnered increasing attention. The UNet family, often combined with sophisticated task-specific customizations, has become a popular choice for improving prediction performance. Meanwhile, pretrained Transformers have revolutionized deep learning through their powerful transferability and effectiveness. In this work, we harness pretrained Transformers for dense prediction, eliminating the need for extensive task-specific architecture design. We propose a simple yet universal encoder-decoder architecture that pairs a pretrained Transformer encoder with a lightweight convolutional Restormer decoder for dense prediction on physiological signals. To optimize the trade-off between model performance and computational efficiency, we incorporate knowledge distillation (KD). Our experiments cover four representative dense prediction tasks: blood pressure waveform (BPW) estimation, PPG-to-ECG (P2E) reconstruction, denoising, and fiducial point localization. The results show that the proposed architecture outperforms state-of-the-art models, validating the potential of pretrained Transformers for physiological signal processing and medical diagnostics. This approach marks a significant step forward in optimizing both the performance and efficiency of dense prediction tasks.