Journal of Biomedical Informatics: Latest Articles

Efficient strabismus diagnosis from small samples: Harnessing spatial features for improved accuracy.
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-10 DOI: 10.1016/j.jbi.2024.104759
Renzhong Wu, Shenghui Liao, Yongrong Ji, Xiaoyan Kui, Fuchang Han, Ziyang Hu, Xuefei Song
{"title":"Efficient strabismus diagnosis from small samples: Harnessing spatial features for improved accuracy.","authors":"Renzhong Wu, Shenghui Liao, Yongrong Ji, Xiaoyan Kui, Fuchang Han, Ziyang Hu, Xuefei Song","doi":"10.1016/j.jbi.2024.104759","DOIUrl":"10.1016/j.jbi.2024.104759","url":null,"abstract":"Strabismus is a common ophthalmological condition, and early diagnosis is crucial to preventing visual impairment and loss of stereopsis. However, traditional methods for diagnosing strabismus often rely on specialized ophthalmic equipment and trained personnel, limiting the widespread accessibility of strabismus diagnosis. Computer-aided strabismus diagnosis is an effective and widely used technology that assists clinicians in making clinical diagnoses and improving efficiency. To address this, we designed an efficient strabismus diagnosis model, RIS-MLP, based on a small number of samples derived from frontal facial images captured under natural lighting conditions via the Hirschberg test. The RIS-MLP combines light reflex point detection and iris detection modules to accurately extract key spatial features even under noisy and occluded conditions. The optimized spatial feature strategy further enhances the performance of the classification module. To validate the superiority of RIS-MLP, we conducted both direct and indirect comparative experiments. Indirect comparisons demonstrate that the RIS-MLP has advantages in terms of sample efficiency. Direct comparisons show that the RIS-MLP can mitigate overfitting to a certain extent, and that the RIS-MLP and its variants (e.g., RIS-SVM) outperform state-of-the-art models on our noisy and imbalanced dataset.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":" ","pages":"104759"},"PeriodicalIF":4.0,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142818178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Coherence and comprehensibility: Large language models predict lay understanding of health-related content.
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-09 DOI: 10.1016/j.jbi.2024.104758
Trevor Cohen, Weizhe Xu, Yue Guo, Serguei Pakhomov, Gondy Leroy
{"title":"Coherence and comprehensibility: Large language models predict lay understanding of health-related content.","authors":"Trevor Cohen, Weizhe Xu, Yue Guo, Serguei Pakhomov, Gondy Leroy","doi":"10.1016/j.jbi.2024.104758","DOIUrl":"10.1016/j.jbi.2024.104758","url":null,"abstract":"Health literacy is a prerequisite to informed health-related decision making. To facilitate understanding of information, text should be presented at an appropriate reading level for the reader. Cognitive studies suggest that the coherence of a text - the interconnectedness between the ideas it expresses - is especially important for low-knowledge readers, who lack the background knowledge to draw inferences from text that is only implicitly connected. Prior work in cognitive science has yielded automated methods to estimate coherence. These methods estimate the proximity between text representations in a semantic vector space, with the underlying idea that units of text that are poorly connected will be further apart in this space. In addition, recent work with large language models (LLMs) has produced probabilistic methodological analogues that have yet to be evaluated for this purpose. This work concerns the relationship between these automated measures and layperson comprehension of biomedical text. To characterize this relationship, we applied a range of automated measures of text coherence to a set of text snippets, some of which were deliberately modified to improve their accessibility in a series of reading comprehension experiments. Results indicate significant associations between reader comprehension - as estimated using multiple-choice questions - and LLM-derived coherence metrics. Interventions designed to improve the comprehensibility of passages also improved their coherence, as measured with the best-performing LLM-derived models and shown by improved reader understanding of the text. These findings support the utility of LLM-derived measures of text coherence as a means to identify gaps in connectedness that make biomedical text difficult for laypeople to understand, with the potential to inform both manual and automated methods to improve the accessibility of the biomedical literature.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":" ","pages":"104758"},"PeriodicalIF":4.0,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142813110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
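The abstract above (Cohen et al.) describes earlier coherence measures that estimate the proximity between text representations in a semantic vector space. A minimal sketch of that general idea follows. It is not the authors' implementation: the 3-dimensional sentence vectors are made-up placeholders, and averaging the cosine similarity of adjacent sentences is only one common way to turn proximity into a single score. The sketch assumes at least two non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def coherence(sentence_vectors):
    """Mean cosine similarity between each pair of adjacent sentence vectors."""
    pairs = list(zip(sentence_vectors, sentence_vectors[1:]))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical embeddings: one passage whose sentences stay close together
# in the vector space, and one whose sentences point in unrelated directions.
connected = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]
disjointed = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(coherence(connected) > coherence(disjointed))
```

Under this toy scoring, the passage with poorly connected sentences receives the lower score, which matches the intuition stated in the abstract that poorly connected units of text lie further apart.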
Enhancing suicidal behavior detection in EHRs: A multi-label NLP framework with transformer models and semantic retrieval-based annotation.
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-02 DOI: 10.1016/j.jbi.2024.104755
Kimia Zandbiglari, Shobhan Kumar, Muhammad Bilal, Amie Goodin, Masoud Rouhizadeh
{"title":"Enhancing suicidal behavior detection in EHRs: A multi-label NLP framework with transformer models and semantic retrieval-based annotation.","authors":"Kimia Zandbiglari, Shobhan Kumar, Muhammad Bilal, Amie Goodin, Masoud Rouhizadeh","doi":"10.1016/j.jbi.2024.104755","DOIUrl":"10.1016/j.jbi.2024.104755","url":null,"abstract":"Background: Suicide is a leading cause of death worldwide, making early identification of suicidal behaviors crucial for clinicians. Current Natural Language Processing (NLP) approaches for identifying suicidal behaviors in Electronic Health Records (EHRs) rely on keyword searches, rule-based methods, and binary classification, which may not fully capture the complexity and spectrum of suicidal behaviors. This study aims to create a multi-class labeled dataset with annotation guidelines and develop a novel NLP approach for fine-grained, multi-label classification of suicidal behaviors, improving the efficiency of the annotation process and accuracy of the NLP methods. Methods: We develop a multi-class labeling system based on guidelines from the FDA, CDC, and WHO, distinguishing between six categories of suicidal behaviors and allowing for multiple labels per data sample. To efficiently create an annotated dataset, we use an MPNet-based semantic retrieval framework to extract relevant sentences from a large EHR dataset, reducing annotation space while capturing diverse expressions. Experts annotate the extracted sentences using the multi-class system. We then formulate the task as a multi-label classification problem and fine-tune transformer-based models on the curated dataset to accurately classify suicidal behaviors in EHRs. Results: Lexical analysis revealed key themes in assessing suicide risk, considering an individual's history, mental health, substance use, and family background. Fine-tuned transformer-based models effectively identified suicidal behaviors from EHRs, with Bio_ClinicalBERT, BioBERT, and XLNet achieving F1 scores of 0.81 and outperforming BERT and RoBERTa. The proposed approach, based on a multi-label classification system, captures the complexity of suicidal behaviors effectively, particularly for \"Suicide Attempt\" and \"Family History\" instances. The proposed approach, using task-specific NLP models and a multi-label classification system, captures the complexity of suicidal behaviors more effectively than traditional binary classification. However, direct comparisons with existing studies are difficult due to varying metrics and label definitions. Conclusion: This study presents a robust NLP framework for detecting suicidal behaviors in EHRs, leveraging task-specific fine-tuning of transformer-based models and a semi-automated pipeline. Despite limitations, the approach demonstrates the potential of advanced NLP techniques in enhancing the identification of suicidal behaviors. Future work should focus on model expansion and integration to further improve patient care and clinical decision-making.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":" ","pages":"104755"},"PeriodicalIF":4.0,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142780151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
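The abstract above (Zandbiglari et al.) frames the task as multi-label classification, where one sentence may carry several labels at once. The sketch below illustrates how that differs from binary or single-label prediction; it is an illustration only, not the authors' pipeline. The logits are invented numbers standing in for the output of a fine-tuned model, only two of the six categories are named in the abstract, so the third label is a placeholder, and the 0.5 threshold is an assumption.

```python
import math

# Two label names come from the abstract; the third is a placeholder
# because the abstract does not list all six categories.
LABELS = ["Suicide Attempt", "Family History", "Label C (placeholder)"]


def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Return every label whose independent probability reaches the threshold.

    Each label is scored on its own, so zero, one, or several labels
    can be assigned to the same sentence.
    """
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Hypothetical model outputs for two sentences.
print(predict_labels([2.0, 1.5, -3.0]))    # two labels assigned together
print(predict_labels([-2.0, -1.0, -4.0]))  # no label assigned
```

Scoring each label with its own sigmoid, rather than a softmax over all labels, is what allows multiple labels per sample.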
Early multi-cancer detection through deep learning: An anomaly detection approach using Variational Autoencoder.
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-01 Epub Date: 2024-11-19 DOI: 10.1016/j.jbi.2024.104751
Innocent Tatchum Sado, Louis Fippo Fitime, Geraud Fokou Pelap, Claude Tinku, Gaelle Mireille Meudje, Thomas Bouetou Bouetou
{"title":"Early multi-cancer detection through deep learning: An anomaly detection approach using Variational Autoencoder.","authors":"Innocent Tatchum Sado, Louis Fippo Fitime, Geraud Fokou Pelap, Claude Tinku, Gaelle Mireille Meudje, Thomas Bouetou Bouetou","doi":"10.1016/j.jbi.2024.104751","DOIUrl":"10.1016/j.jbi.2024.104751","url":null,"abstract":"Cancer is a disease that causes many deaths worldwide. Treating cancer is first and foremost a matter of detection, since treatment is most effective when the disease is detected at an early stage. With the evolution of technology, several computer-aided diagnosis tools have been developed around cancer, including several image-based cancer detection methods. However, cancer detection faces many difficulties related to early detection, which is crucial for patient survival. To detect cancer early, scientists have been using transcriptomic data. However, this presents some challenges such as unlabeled data, a large amount of data, and image-based techniques that only focus on one type of cancer. The purpose of this work is to develop a deep learning model that can effectively detect any type of cancer as early as possible, specifically in the early stages, as an anomaly in transcriptomic data. This model must be able to act independently and not be restricted to any specific type of cancer. To achieve this goal, we modeled a deep neural network (a Variational Autoencoder) and then defined an algorithm for detecting anomalies in the output of the Variational Autoencoder. The Variational Autoencoder consists of an encoder and a decoder with a hidden layer. With the TCGA and GTEx data, we were able to train the model for six types of cancer using the Adam optimizer with decay learning and a two-component loss function. As a result, the lowest accuracy we obtained was 0.950 and the lowest recall was 0.830. This research leads us to the design of a deep learning model for the detection of cancer as an anomaly in transcriptomic data.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":" ","pages":"104751"},"PeriodicalIF":4.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142687219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
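The abstract above (Sado et al.) mentions a two-component loss function and an algorithm for detecting anomalies in the Variational Autoencoder's output, without giving details. The sketch below shows the textbook VAE objective (a reconstruction term plus a KL-divergence term) and a simple reconstruction-error threshold. Both are assumptions for illustration: the abstract does not say which two components the authors used or how their anomaly rule is defined, and the `beta` weight and `threshold` value are hypothetical parameters.

```python
import math


def reconstruction_error(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def kl_divergence(mu, log_var):
    """KL divergence from N(mu, exp(log_var)) to the standard normal N(0, 1),
    summed over the latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Standard two-term VAE loss: reconstruction plus weighted KL term."""
    return reconstruction_error(x, x_hat) + beta * kl_divergence(mu, log_var)


def is_anomaly(x, x_hat, threshold):
    """Flag a sample whose reconstruction error exceeds the chosen threshold."""
    return reconstruction_error(x, x_hat) > threshold


# Hypothetical expression profile, its reconstruction, and latent statistics.
x, x_hat = [1.0, 2.0], [3.0, 2.0]
print(vae_loss(x, x_hat, mu=[0.0, 0.0], log_var=[0.0, 0.0]))
print(is_anomaly(x, x_hat, threshold=1.0))
```

The intuition behind thresholding is that a model trained mostly on normal samples reconstructs them well, so unusually large reconstruction error hints at an anomaly.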
How to identify patient perception of AI voice robots in the follow-up scenario? A multimodal identity perception method based on deep learning.
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-01 Epub Date: 2024-12-02 DOI: 10.1016/j.jbi.2024.104757
Mingjie Liu, Kuiyou Chen, Qing Ye, Hong Wu
{"title":"How to identify patient perception of AI voice robots in the follow-up scenario? A multimodal identity perception method based on deep learning.","authors":"Mingjie Liu, Kuiyou Chen, Qing Ye, Hong Wu","doi":"10.1016/j.jbi.2024.104757","DOIUrl":"10.1016/j.jbi.2024.104757","url":null,"abstract":"Objectives: Post-discharge follow-up stands as a critical component of post-diagnosis management, and the constraints of healthcare resources impede comprehensive manual follow-up. However, patients are less cooperative with AI follow-up calls or may even hang up once AI voice robots are perceived. To improve the effectiveness of follow-up, alternative measures should be taken when patients perceive AI voice robots. Therefore, identifying how patients perceive AI voice robots is crucial. This study aims to construct a multimodal identity perception model based on deep learning to identify how patients perceive AI voice robots. Methods: Our dataset includes 2030 response audio recordings and corresponding texts from patients. We conduct comparative experiments and perform an ablation study. The proposed model employs a transfer learning approach, utilizing BERT and TextCNN for text feature extraction, AST and LSTM for audio feature extraction, and self-attention for feature fusion. Results: Our model demonstrates superior performance against existing baselines, with a precision of 86.67%, an AUC of 84%, and an accuracy of 94.38%. Additionally, a generalization experiment was conducted using 144 patients' response audio recordings and corresponding text data from other departments in the hospital, confirming the model's robustness and effectiveness. Conclusion: Our multimodal identity perception model can identify how patients perceive AI voice robots effectively. Identifying how patients perceive AI not only helps to optimize the follow-up process and improve patient cooperation, but also provides support for the evaluation and optimization of AI voice robots.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":" ","pages":"104757"},"PeriodicalIF":4.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142780183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Biomedical document-level relation extraction with thematic capture and localized entity pooling.
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-01 Epub Date: 2024-11-30 DOI: 10.1016/j.jbi.2024.104756
Yuqing Li, Xinhui Shao
{"title":"Biomedical document-level relation extraction with thematic capture and localized entity pooling.","authors":"Yuqing Li, Xinhui Shao","doi":"10.1016/j.jbi.2024.104756","DOIUrl":"10.1016/j.jbi.2024.104756","url":null,"abstract":"In contrast to sentence-level relation extraction, document-level relation extraction poses greater challenges, as a document typically contains multiple entities and one entity may be associated with multiple other entities. Existing methods often rely on graph structures to capture path representations between entity pairs. However, this paper introduces a novel approach called local entity pooling that relies solely on the pre-trained model to identify the bridge entity related to the current entity pair and generate the reasoning path representation. This technique effectively mitigates the multi-entity problem. Additionally, the model leverages the multi-entity and multi-label characteristics of the document to acquire the document's thematic representation, thereby enhancing the document-level relation extraction task. Experimental evaluations were conducted on two biomedical datasets, CDR and GDA. Our TCLEP (Thematic Capture and Localized Entity Pooling) model achieved Macro-F1 scores of 71.7% and 85.3%, respectively. We also incorporated the local entity pooling and thematic capture modules into the state-of-the-art model, resulting in performance improvements of 1.5% and 0.2% on the respective datasets. These results highlight the advanced performance of our proposed approach.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":" ","pages":"104756"},"PeriodicalIF":4.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142769374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Taxonomy-based prompt engineering to generate synthetic drug-related patient portal messages.
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-01 Epub Date: 2024-11-25 DOI: 10.1016/j.jbi.2024.104752
Natalie Wang, Sukrit Treewaree, Ayah Zirikly, Yuzhi L Lu, Michelle H Nguyen, Bhavik Agarwal, Jash Shah, James Michael Stevenson, Casey Overby Taylor
{"title":"Taxonomy-based prompt engineering to generate synthetic drug-related patient portal messages.","authors":"Natalie Wang, Sukrit Treewaree, Ayah Zirikly, Yuzhi L Lu, Michelle H Nguyen, Bhavik Agarwal, Jash Shah, James Michael Stevenson, Casey Overby Taylor","doi":"10.1016/j.jbi.2024.104752","DOIUrl":"10.1016/j.jbi.2024.104752","url":null,"abstract":"Objective: The objectives of this study were to: (1) create a corpus of synthetic drug-related patient portal messages to address the current lack of publicly available datasets for model development, (2) assess differences in language used and linguistics among the synthetic patient portal messages, and (3) assess the accuracy of patient-reported drug side effects for different racial groups. Methods: We leveraged a taxonomy for patient- and clinician-generated content to guide prompt engineering for synthetic drug-related patient portal messages. We generated two groups of messages: the first group (200 messages) used a subset of the taxonomy relevant to a broad range of drug-related messages and the second group (250 messages) used a subset of the taxonomy relevant to a narrow range of messages focused on side effects. Prompts also included one of five racial groups. Next, we assessed linguistic characteristics among message parts (subject, beginning, body, ending) across different prompt specifications (urgency, patient portal taxa, race). We also assessed the performance and frequency of patient-reported side effects across different racial groups and compared them to data present in a real-world data source (SIDER). Results: The study generated 450 synthetic patient portal messages, and we assessed linguistic patterns, the accuracy of drug-side effect pairs, and the frequency of pairs compared to real-world data. Linguistic analysis revealed variations in language usage and politeness, and analysis of positive predictive values identified differences in symptoms reported based on urgency levels and racial groups in the prompt. We also found that low-incidence SIDER drug-side effect pairs were observed less frequently in our dataset. Conclusion: This study demonstrates the potential of synthetic patient portal messages as a valuable resource for healthcare research. After creating a corpus of synthetic drug-related patient portal messages, we identified significant language differences and provided evidence that drug-side effect pairs observed in messages are comparable to what is expected in real-world settings.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":" ","pages":"104752"},"PeriodicalIF":4.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142739561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Sleep apnea test prediction based on Electronic Health Records
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-01 DOI: 10.1016/j.jbi.2024.104737
Lama Abu Tahoun, Amit Shay Green, Tal Patalon, Yaron Dagan, Robert Moskovitch
{"title":"Sleep apnea test prediction based on Electronic Health Records","authors":"Lama Abu Tahoun, Amit Shay Green, Tal Patalon, Yaron Dagan, Robert Moskovitch","doi":"10.1016/j.jbi.2024.104737","DOIUrl":"10.1016/j.jbi.2024.104737","url":null,"abstract":"The identification of Obstructive Sleep Apnea (OSA) is done by a Polysomnography test, which is often performed at later ages. Being able to notify potential insured members at earlier ages is desirable. For that, we develop predictive models that rely on Electronic Health Records (EHR) and predict whether a person will go through a sleep apnea test after the age of 50. A major challenge is the variability in EHR records across insured members over the years, which this study also investigates in the context of control matching and prediction. Since there are many temporal variables, the RankLi method was introduced for temporal variable selection. This approach employs the t-test to calculate a divergence score for each temporal variable between the target classes. We also investigate the need to consider the number of EHR records as part of control matching, and whether modeling separately for subgroups according to the number of EHR records is more effective. For each prediction task, we trained four different classifiers, namely 1-CNN, LSTM, Random Forest, and Logistic Regression, on data until the age of 40 or 50, and on varying numbers of temporal variables. Using the number of EHR records for control matching was found to be crucial, and using learning models for subsets of the population according to the number of EHR records they have was found to be more effective. The deep learning models, particularly the 1-CNN, achieved the highest balanced accuracy and AUC scores in both male and female groups. In the male group, the highest results were also observed at age 50 with 100 temporal variables, resulting in a balanced accuracy of 90% and an AUC of 93%.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"160 ","pages":"Article 104737"},"PeriodicalIF":4.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142568735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
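The abstract above (Abu Tahoun et al.) says the RankLi method uses the t-test to compute a divergence score for each temporal variable between the target classes. The sketch below shows t-test-based variable ranking in general and should not be read as RankLi itself. It makes several assumptions the abstract does not confirm: each temporal variable is reduced to one summary value per person, Welch's unequal-variance t-statistic is the variant used, and the absolute statistic serves as the score. The variable names and values are invented.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_variables(cases, controls):
    """Rank variable names by |t| between cases and controls, largest first."""
    scores = {name: abs(welch_t(cases[name], controls[name])) for name in cases}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical per-person summary values for two variables.
cases = {"bmi": [31.0, 33.0, 35.0, 32.0], "heart_rate": [70.0, 72.0, 69.0, 71.0]}
controls = {"bmi": [24.0, 26.0, 25.0, 27.0], "heart_rate": [71.0, 70.0, 72.0, 69.0]}

print(rank_variables(cases, controls))
```

Keeping only the top-ranked variables (the abstract reports runs with, for example, 100 temporal variables) is then a matter of slicing the ranked list.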
Structural analysis and intelligent classification of clinical trial eligibility criteria based on deep learning and medical text mining
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-12-01 DOI: 10.1016/j.jbi.2024.104753
Yongzhong Han, Qianmin Su, Liang Liu, Ying Li, Jihan Huang
{"title":"Structural analysis and intelligent classification of clinical trial eligibility criteria based on deep learning and medical text mining","authors":"Yongzhong Han, Qianmin Su, Liang Liu, Ying Li, Jihan Huang","doi":"10.1016/j.jbi.2024.104753","DOIUrl":"10.1016/j.jbi.2024.104753","url":null,"abstract":"Objective: To enhance the efficiency, quality, and innovation capability of clinical trials, this paper introduces a novel model called CTEC-AC (Clinical Trial Eligibility Criteria Automatic Classification), aimed at structuring clinical trial eligibility criteria into computationally explainable classifications. Methods: We obtained detailed information on the latest 2,500 clinical trials from ClinicalTrials.gov, generating over 20,000 eligibility criteria data entries. To enhance the expressiveness of these criteria, we integrated two powerful methods: ClinicalBERT and MetaMap. The resulting enhanced features were used as input for a hierarchical clustering algorithm. Post-processing included expert validation of the algorithm’s output to ensure the accuracy of the constructed annotated eligibility text corpus. Ultimately, our model was employed to automate the classification of eligibility criteria. Results: We identified 31 distinct categories to summarize the eligibility criteria written by clinical researchers and uncovered common themes in how these criteria are expressed. Using our automated classification model on a labeled dataset, we achieved a macro-average F1 score of 0.94. Conclusion: This work can automatically extract structured representations from unstructured eligibility criteria text, significantly advancing the informatization of clinical trials. This, in turn, can significantly enhance the intelligence of automated participant recruitment for clinical researchers.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"160 ","pages":"Article 104753"},"PeriodicalIF":4.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142739557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
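The abstract above (Han et al.) reports a macro-average F1 score of 0.94 over 31 categories. For readers unfamiliar with the metric, the sketch below computes macro-average F1 from scratch: F1 is calculated per category and the per-category scores are averaged with equal weight, so rare categories count as much as frequent ones. The category names and predictions are invented for illustration; the abstract does not list the 31 categories.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-category F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # A category with no true or predicted instances scores 0 by convention.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical gold labels and model predictions for four criteria.
y_true = ["age", "age", "diagnosis", "consent"]
y_pred = ["age", "diagnosis", "diagnosis", "consent"]

print(round(macro_f1(y_true, y_pred), 4))
```

In this toy case the per-category F1 values are 2/3 for "age", 2/3 for "diagnosis", and 1 for "consent", giving a macro average of 7/9.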
A multimodal approach for few-shot biomedical named entity recognition in low-resource languages.
IF 4, Zone 2, Medicine
Journal of Biomedical Informatics Pub Date : 2024-11-30 DOI: 10.1016/j.jbi.2024.104754
Jian Chen, Leilei Su, Yihong Li, Mingquan Lin, Yifan Peng, Cong Sun
{"title":"A multimodal approach for few-shot biomedical named entity recognition in low-resource languages.","authors":"Jian Chen, Leilei Su, Yihong Li, Mingquan Lin, Yifan Peng, Cong Sun","doi":"10.1016/j.jbi.2024.104754","DOIUrl":"10.1016/j.jbi.2024.104754","url":null,"abstract":"In this study, we revisit named entity recognition (NER) in the biomedical domain from a multimodal perspective, with a particular focus on applications in low-resource languages. Existing research primarily relies on unimodal methods for NER, which limits the potential for capturing diverse information. To address this limitation, we propose a novel method that integrates a cross-modal generation module to transform unimodal data into multimodal data, thereby enabling the use of enriched multimodal information for NER. Additionally, we design a cross-modal filtering module to mitigate the adverse effects of text-image mismatches in multimodal NER. We validate our proposed method on two biomedical datasets specifically curated for low-resource languages. Experimental results demonstrate that our method significantly enhances the performance of NER, highlighting its effectiveness and potential for broader applications in biomedical research and low-resource language contexts.","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":" ","pages":"104754"},"PeriodicalIF":4.0,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142769352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0