Pengxiao Xu, Junyan Lyu, Li Lin, Pujin Cheng, Xiaoying Tang
{"title":"LF-SynthSeg: Label-Free Brain Tissue-Assisted Tumor Synthesis and Segmentation.","authors":"Pengxiao Xu, Junyan Lyu, Li Lin, Pujin Cheng, Xiaoying Tang","doi":"10.1109/JBHI.2024.3489721","DOIUrl":"10.1109/JBHI.2024.3489721","url":null,"abstract":"<p><p>Unsupervised brain tumor segmentation is pivotal in realms of disease diagnosis, surgical planning, and treatment response monitoring, with the distinct advantage of obviating the need for labeled data. Traditional methodologies in this domain, however, often fall short in fully capitalizing on the extensive prior knowledge of brain tissue, typically approaching the task merely as an anomaly detection challenge. In our research, we present an innovative strategy that effectively integrates brain tissues' prior knowledge into both the synthesis and segmentation of brain tumor from T2-weighted Magnetic Resonance Imaging scans. Central to our method is the tumor synthesis mechanism, employing randomly generated ellipsoids in conjunction with the intensity profiles of brain tissues. This methodology not only fosters a significant degree of variation in the tumor presentations within the synthesized images but also facilitates the creation of an essentially unlimited pool of abnormal T2-weighted images. These synthetic images closely replicate the characteristics of real tumor-bearing scans. Our training protocol extends beyond mere tumor segmentation; it also encompasses the segmentation of brain tissues, thereby directing the networkâs attention to the boundary relationship between brain tumor and brain tissue, thus improving the robustness of our method. We evaluate our approach across five widely recognized public datasets (BRATS 2019, BRATS 2020, BRATS 2021, PED and SSA), and the results show that our method outperforms state-of-the-art unsupervised tumor segmentation methods by large margins. Moreover, the proposed method achieves more than 92 % of the fully supervised performance on the same testing datasets.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142557726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting Epidemic Spread with Recurrent Graph Gate Fusion Transformers.","authors":"Minkyoung Kim, Jae Heon Kim, Beakcheol Jang","doi":"10.1109/JBHI.2024.3488274","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3488274","url":null,"abstract":"<p><p>Predicting the unprecedented, nonlinear nature of COVID-19 presents a significant public health challenge. Recent advances in deep learning, such as Graph Neural Networks, Recurrent Neural Networks (RNNs), and Transformers, have enhanced predictions by modeling regional interactions, managing autoregressive time series, and identifying long-term dependencies. However, prior works often feature shallow integration of these models, leading to simplistic graph embeddings and inadequate analysis across different graph types. Additionally, excessive reliance on historical COVID-19 data limits the potential of utilizing time-lagged data, such as intervention policy information. To address these challenges, we introduce ReGraFT, a novel Sequence-to-Sequence model designed for robust long-term forecasting of COVID-19. ReGraFT integrates Multigraph-Gated Recurrent Units (MGRUs) with adaptive graphs, leveraging data from individual states, including infection rates, policy changes, and interstate travel. First, ReGraFT employs adaptive MGRU cells within an RNN framework to capture inter-regional dependencies, dynamically modeling complex transmission dynamics. Second, the model features a Self-Normalizing Priming layer using SELUs to enhance stability and accuracy across short, medium, and long-term forecasts. Lastly, ReGraFT systematically compares and integrates various graph types derived from fully connected layers, pooling, and attention-based mechanisms to provide a nuanced representation of inter-regional relationships. By incorporating lagged COVID-19 policy data, ReGraFT refines forecasts, demonstrating RMSE reductions of 2.39-35.92% compared to state-of-the-art models. This work provides accurate long-term predictions, aiding in better public health decisions. Our code is available at https://github.com/mfriendly/ReGraFT.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fang Peng, Hongkuan Shi, Shiquan He, Qiang Hu, Ting Li, Fan Huang, Xinxia Feng, Mei Liu, Jiazhi Liao, Qiang Li, Zhiwei Wang
{"title":"Fine-Grained Temporal Site Monitoring in EGD Streams Via Visual Time-Aware Embedding and Vision-Text Asymmetric Coworking.","authors":"Fang Peng, Hongkuan Shi, Shiquan He, Qiang Hu, Ting Li, Fan Huang, Xinxia Feng, Mei Liu, Jiazhi Liao, Qiang Li, Zhiwei Wang","doi":"10.1109/JBHI.2024.3488514","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3488514","url":null,"abstract":"<p><p>Esophagogastroduodenoscopy (EGD) requires inspecting plentiful upper gastrointestinal (UGI) sites completely for a precise cancer screening. Automated temporal site monitoring for EGD assistance is thus of high demand, yet often fails if directly applying the existing methods of online action detection. The key challenges are two- fold: 1) the global camera motion dominates, invalidating the temporal patterns derived from the object optical flows, and 2) the UGI sites are fine-grained, yielding highly homogenized appearances. In this paper, we propose an EGD-customized model, powered by two novel designs, i.e., Visual Time-aware Embedding plus Vision-text Asymmetric Coworking (VTE+VAC), for real-time accurate fine-grained UGI site monitoring. Concretely, VTE learns visual embeddings by differentiating frames via classification losses, and meanwhile by reordering the sampled time-agnostic frames to be temporally coherent via a ranking loss. Such joint objective encourages VTE to capture the sequential relation without resorting to the inapplicable object optical flows, and thus to provide the time-aware frame- wise embeddings. In the subsequent analysis, VAC uses a temporal sliding window, and extracts vision-text multimodal knowledge from each frame and its corresponding textualized prediction via the learned VTE and a frozen BERT. The text embeddings help provide more representative cues, but also may cause misdirection due to prediction errors. Thus, VAC randomly drops or replaces historical predictions to increase the error tolerance to avoid collapsing onto the last few predictions. Qualitative and quantitative experiments demonstrate that the proposed method achieves superior performance compared to other state-of-the-art methods, with an average F1-score improvement of at least 7.66%.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qi Teng, Wei Li, Guangwei Hu, Yuanyuan Shu, Yun Liu
{"title":"Innovative Dual-Decoupling CNN with Layer-wise Temporal-Spatial Attention for Sensor-Based Human Activity Recognition.","authors":"Qi Teng, Wei Li, Guangwei Hu, Yuanyuan Shu, Yun Liu","doi":"10.1109/JBHI.2024.3488528","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3488528","url":null,"abstract":"<p><p>Human Activity Recognition (HAR) is essential for monitoring and analyzing human behavior, particularly in health applications such as fall detection and chronic disease management. Traditional methods, even those incorporating attention mechanisms, often oversimplify the complex temporal and spatial dependencies in sensor data by processing features uniformly, leading to inadequate modeling of high-dimensional interactions. To address these limitations, we propose a novel framework: the Temporal-Spatial Feature Decoupling Unit with Layer-wise Training Convolutional Neural Network (CNN-TSFDU-LW). Our model enhances HAR accuracy by decoupling temporal and spatial dependencies, facilitating more precise feature extraction and reducing computational overhead. The TSFDU mechanism enables parallel processing of temporal and spatial features, thereby enriching the learned representations. Furthermore, layer-wise training with a local error function allows for independent updates of each CNN layer, reducing the number of parameters and improving memory efficiency without compromising performance. Experiments on four benchmark datasets (UCI-HAR, PAMAP2, UNIMIB-SHAR, and USC-HAD) demonstrate accuracy improvements ranging from 0.9% to 4.19% over state-of-the-art methods while simultaneously reducing computational complexity. Specifically, our framework achieves accuracy rates of 97.90% on UCI-HAR, 94.34% on PAMAP2, 78.90% on UNIMIB-SHAR, and 94.71% on USC-HAD, underscoring its effectiveness in complex HAR tasks. In conclusion, the CNN-TSFDU-LW framework represents a significant advancement in sensor-based HAR, delivering both improved accuracy and computational efficiency, with promising potential for enhancing health monitoring applications.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-omics Graph Knowledge Representation for Pneumonia Prognostic Prediction.","authors":"Wenyu Xing, Miao Li, Yiwen Liu, Xin Liu, Yifang Li, Yanping Yang, Jing Bi, Jiangang Chen, Dongni Hou, Yuanlin Song, Dean Ta","doi":"10.1109/JBHI.2024.3488735","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3488735","url":null,"abstract":"<p><p>Early prognostic prediction is crucial for determining appropriate clinical interventions. Previous single-omics models had limitations, such as high contingency and overlooking complex physical conditions. In this paper, we introduced multi-omics graph knowledge representation to predict in-hospital outcomes for pneumonia patients. This method utilizes CT imaging and three non-imaging omics information, and explores a knowledge graph for modeling multi-omics relations to enhance the overall information representation. For imaging omics, a multichannel pyramidal recursive MLP and Longformer-based 3D deep learning module was developed to extract depth features in lung window, while radiomics features were simultaneously extracted in both lung and mediastinal windows. Non-imaging omics involved the adoption of laboratory, microbial, and clinical indices to complement the patient's physical condition. Following feature screening, the similarity fusion network and graph convolutional network (GCN) were employed to determine omics similarity and provide prognostic prediction. The results of comparative experiments and generalization validation demonstrat that the proposed multi-omics GCN-based prediction model has good robustness and outperformed previous single-type omics, classical machine learning, and previous deep learning methods. Thus, the proposed multi-omics graph knowledge representation model enhances early prognostic prediction performance in pneumonia, facilitating a comprehensive assessment of disease severity and timely intervention for high-risk patients.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Identification of Facial Tics Using Selfie-Video.","authors":"Yocheved Loewenstern, Noa Benaroya-Milshtein, Katya Belelovsky, Izhar Bar-Gad","doi":"10.1109/JBHI.2024.3488285","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3488285","url":null,"abstract":"<p><p>The intrinsic nature of tic disorders, characterized by symptom variability and fluctuation, poses challenges in clinical evaluations. Currently, tic assessments predominantly rely on subjective questionnaires administered periodically during clinical visits, thus lacking continuous quantitative evaluation. This study aims to establish an automatic objective measure of tic expression in natural behavioral settings. A custom-developed smartphone application was used to record selfie-videos of children and adolescents with tic disorders exhibiting facial motor tics. Facial landmarks were utilized to extract tic-related features from video segments labeled as either \"tic\" or \"non-tic\". These features were then passed through a tandem of custom deep neural networks to learn spatial and temporal properties for tic classification of these segments according to their labels. The model achieved a mean accuracy of 95% when trained on data across all subjects, and consistently exceeded 90% accuracy in leave-one-session-out and leave-one-subject-out cross validation training schemes. This automatic tic identification measure may provide a valuable tool for clinicians in facilitating diagnosis, patient follow-up, and treatment efficacy evaluation. Combining this measure with standard smartphone technology has the potential to revolutionize large-scale clinical studies, thereby expediting the development and testing of novel interventions.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jahui Pan, Yangzuyi Yu, Man Li, Wanxin Wei, Shuyu Chen, Heyi Zheng, Yanbin He, Yuanqing Li
{"title":"A Multimodal Consistency-Based Self-Supervised Contrastive Learning Framework for Automated Sleep Staging in Patients with Disorders of Consciousness.","authors":"Jahui Pan, Yangzuyi Yu, Man Li, Wanxin Wei, Shuyu Chen, Heyi Zheng, Yanbin He, Yuanqing Li","doi":"10.1109/JBHI.2024.3487657","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3487657","url":null,"abstract":"<p><p>Sleep is a fundamental human activity, and automated sleep staging holds considerable investigational potential. Despite numerous deep learning methods proposed for sleep staging that exhibit notable performance, several challenges remain unresolved, including inadequate representation and generalization capabilities, limitations in multimodal feature extraction, the scarcity of labeled data, and the restricted practical application for patients with disorder of consciousness (DOC). This paper proposes MultiConsSleepNet, a multimodal consistency-based sleep staging network. This network comprises a unimodal feature extractor and a multimodal consistency feature extractor, aiming to explore universal representations of electroencephalograms (EEGs) and electrooculograms (EOGs) and extract the consistency of intra- and intermodal features. Additionally, self-supervised contrastive learning strategies are designed for unimodal and multimodal consistency learning to address the current situation in clinical practice where it is difficult to obtain high-quality labeled data but has a huge amount of unlabeled data. It can effectively alleviate the model's dependence on labeled data, and improve the model's generalizability for effective migration to DOC patients. Experimental results on three publicly available datasets demonstrate that MultiConsSleepNet achieves state-of-the-art performance in sleep staging with limited labeled data and effectively utilizes unlabeled data, enhancing its practical applicability. Furthermore, the proposed model yields promising results on a self-collected DOC dataset, offering a novel perspective for sleep staging research in patients with DOC.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eleni Vasileiou, Sofia B Dias, Stelios Hadjidimitriou, Vasilis Charisis, Nikolaos Karagkiozidis, Stavros Malakoudis, Patty de Groot, Stelios Andreadis, Vassilis Tsekouras, Georgios Apostolidis, Anastasia Matonaki, Thanos G Stavropoulos, Leontios J Hadjileontiadis
{"title":"Novel Digital Biomarkers for Fine Motor Skills Assessment in Psoriatic Arthritis: The DaktylAct Touch-based Serious Game Approach.","authors":"Eleni Vasileiou, Sofia B Dias, Stelios Hadjidimitriou, Vasilis Charisis, Nikolaos Karagkiozidis, Stavros Malakoudis, Patty de Groot, Stelios Andreadis, Vassilis Tsekouras, Georgios Apostolidis, Anastasia Matonaki, Thanos G Stavropoulos, Leontios J Hadjileontiadis","doi":"10.1109/JBHI.2024.3487785","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3487785","url":null,"abstract":"<p><p>Psoriatic Arthritis (PsA) is a chronic, inflammatory disease affecting joints, substantially impacting patients' quality of life, with European guidelines for managing PsA emphasizing the importance of assessing hand function. Here, we present a set of novel digital biomarkers (dBMs) derived from a touchscreen-based serious game approach, DaktylAct, intended as a proxy, gamified, objective assessment of hand impairment, with emphasis on fine motor skills, caused by PsA. This is achieved by its design, where the user controls a cannon to aim at and hit targets using two finger pinch-in/out and wrist rotation gestures. In-game metrics (targets hit and score) and statistical features (mean, standard deviation) of gameplay actions (duration of gestures, applied pressure, and wrist rotation angle) produced during gameplay serve as informative dBMs. DaktylAct was tested on a cohort comprising 16 clinically verified PsA patients and nine healthy controls (HC). Correlation analysis demonstrated a positive correlation between average pinch-in duration and disease activity (DA) and a negative correlation between standard deviation of applied pressure during wrist rotation and joint inflammation. Logistic regression models achieved 83% and 91% classification performance discriminating HC from PsA patients with low DA (LDA) and PsA patients with and without joint inflammation, respectively. Results presented here are promising and create a proof-of-concept, paving the way for further validation in larger cohorts.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fangrong Zong, Zaimin Zhu, Jiayi Zhang, Xiaofeng Deng, Zhuangzhuang Li, Chuyang Ye, Yong Liu
{"title":"Attention-based q-space Deep Learning Generalized for Accelerated Diffusion Magnetic Resonance Imaging.","authors":"Fangrong Zong, Zaimin Zhu, Jiayi Zhang, Xiaofeng Deng, Zhuangzhuang Li, Chuyang Ye, Yong Liu","doi":"10.1109/JBHI.2024.3487755","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3487755","url":null,"abstract":"<p><p>Diffusion magnetic resonance imaging (dMRI) is a non-invasive method for capturing the microanatomical information of tissues by measuring the diffusion weighted signals along multiple directions, which is widely used in the quantification of microstructures. Obtaining microscopic parameters requires dense sampling in the q space, leading to significant time consumption. The most popular approach to accelerating dMRI acquisition is to undersample the q-space data, along with applying deep learning methods to reconstruct quantitative diffusion parameters. However, the reliance on a predetermined q-space sampling strategy often constrains traditional deep learning-based reconstructions. The present study proposed a novel deep learning model, named attention-based q-space deep learning (aqDL), to implement the reconstruction with variable q-space sampling strategies. The aqDL maps dMRI data from different scanning strategies onto a common feature space by using a series of Transformer encoders. The latent features are employed to reconstruct dMRI parameters via a multilayer perceptron. The performance of the aqDL model was assessed utilizing the Human Connectome Project datasets at varying undersampling numbers. To validate its generalizability, the model was further tested on two additional independent datasets. Our results showed that aqDL consistently achieves the highest reconstruction accuracy at various undersampling numbers, regardless of whether variable or predetermined q-space scanning strategies are employed. These findings suggest that aqDL has the potential to be used on general clinical dMRI datasets.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Avatar-Based Picture Exchange Communication System Enhancing Joint Attention Training for Children With Autism.","authors":"Yongjun Ren, Runze Liu, Huinan Sang, Xiaofeng Yu","doi":"10.1109/JBHI.2024.3487589","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3487589","url":null,"abstract":"<p><p>Children with Autism Spectrum Disorder (ASD) often struggle with social communication and feel anxious in interactive situations. The Picture Exchange Communication System (PECS) is commonly used to enhance basic communication skills in children with ASD, but it falls short in reducing social anxiety during therapist interactions and in keeping children engaged. This paper proposes the use of virtual character technology alongside PECS training to address these issues. By integrating a virtual avatar, children's communication skills and ability to express needs can be gradually improved. This approach also reduces anxiety and enhances the interactivity and attractiveness of the training. After conducting a T-test, it was found that PECS assisted by a virtual avatar significantly improves children's focus on activities and enhances their behavioral responsiveness. To address the problem of poor accuracy of gaze estimation in unconstrained environments, this study further developed a visual feature-based gaze estimation algorithm, the three-channel gaze network (TCG-Net). It utilizes binocular images to refine the gaze direction and infer the primary focus from facial images. Our focus was on enhancing gaze tracking accuracy in natural environments, crucial for evaluating and improving Joint Attention (JA) in children during interactive processes.TCG-Net achieved an angular error of 4.0 on the MPIIGaze dataset, 5.0 on the EyeDiap dataset, and 6.8 on the RT-Gene dataset, confirming the effectiveness of our approach in improving gaze accuracy and the quality of social interactions.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":6.7,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}