{"title":"Are medical history data fit for risk stratification of patients with chest pain in emergency care? Comparing data collected from patients using computerized history taking with data documented by physicians in the electronic health record in the CLEOS-CPDS prospective cohort study.","authors":"Helge Brandberg, Carl Johan Sundberg, Jonas Spaak, Sabine Koch, Thomas Kahan","doi":"10.1093/jamia/ocae110","DOIUrl":"10.1093/jamia/ocae110","url":null,"abstract":"<p><strong>Objective: </strong>In acute chest pain management, risk stratification tools, including medical history, are recommended. We compared the fraction of patients with sufficient clinical data obtained using computerized history taking software (CHT) versus physician-acquired medical history to calculate established risk scores and assessed the patient-by-patient agreement between these 2 ways of obtaining medical history information.</p><p><strong>Materials and methods: </strong>This was a prospective cohort study of clinically stable patients aged ≥ 18 years presenting to the emergency department (ED) at Danderyd University Hospital (Stockholm, Sweden) in 2017-2019 with acute chest pain and non-diagnostic ECG and serum markers. Medical histories were self-reported using CHT on a tablet. Observations on discrete variables in the risk scores were extracted from electronic health records (EHR) and the CHT database. The patient-by-patient agreement was described by Cohen's kappa statistics.</p><p><strong>Results: </strong>Of the total 1000 patients included (mean age 55.3 ± 17.4 years; 54% women), HEART score, EDACS, and T-MACS could be calculated in 75%, 74%, and 83% by CHT and in 31%, 7%, and 25% by EHR, respectively. The agreement between CHT and EHR was slight to moderate (kappa 0.19-0.70) for chest pain characteristics and moderate to almost perfect (kappa 0.55-0.91) for risk factors.</p><p><strong>Conclusions: </strong>CHT can acquire and document data for chest pain risk stratification in most ED patients using established risk scores, achieving this goal for a substantially larger number of patients, as compared to EHR data. The agreement between CHT and physician-acquired history taking is high for traditional risk factors and lower for chest pain characteristics.</p><p><strong>Clinical trial registration: </strong>ClinicalTrials.gov NCT03439449.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187423/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141088695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of a multimodal geomarker pipeline to assess the impact of social, economic, and environmental factors on pediatric health outcomes.","authors":"Erika Rasnick Manning, Qing Duan, Stuart Taylor, Sarah Ray, Alexandra M S Corley, Joseph Michael, Ryan Gillette, Ndidi Unaka, David Hartley, Andrew F Beck, Cole Brokamp","doi":"10.1093/jamia/ocae093","DOIUrl":"10.1093/jamia/ocae093","url":null,"abstract":"<p><strong>Objectives: </strong>We sought to create a computational pipeline for attaching geomarkers, contextual or geographic measures that influence or predict health, to electronic health records at scale, including developing a tool for matching addresses to parcels to assess the impact of housing characteristics on pediatric health.</p><p><strong>Materials and methods: </strong>We created a geomarker pipeline to link residential addresses from hospital admissions at Cincinnati Children's Hospital Medical Center (CCHMC) between July 2016 and June 2022 to place-based data. Linkage methods included by date of admission, geocoding to census tract, street range geocoding, and probabilistic address matching. We assessed 4 methods for probabilistic address matching.</p><p><strong>Results: </strong>We characterized 124 244 hospitalizations experienced by 69 842 children admitted to CCHMC. Of the 55 684 hospitalizations with residential addresses in Hamilton County, Ohio, all were matched to 7 temporal geomarkers, 97% were matched to 79 census tract-level geomarkers and 13 point-level geomarkers, and 75% were matched to 16 parcel-level geomarkers. Parcel-level geomarkers were linked using our exact address matching tool developed using the best-performing linkage method.</p><p><strong>Discussion: </strong>Our multimodal geomarker pipeline provides a reproducible framework for attaching place-based data to health data while maintaining data privacy. This framework can be applied to other populations and in other regions. We also created a tool for address matching that democratizes parcel-level data to advance precision population health efforts.</p><p><strong>Conclusion: </strong>We created an open framework for multimodal geomarker assessment by harmonizing and linking a set of over 100 geomarkers to hospitalization data, enabling assessment of links between geomarkers and hospital admissions.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187418/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140909123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparing penalization methods for linear models on large observational health data.","authors":"Egill A Fridgeirsson, Ross Williams, Peter Rijnbeek, Marc A Suchard, Jenna M Reps","doi":"10.1093/jamia/ocae109","DOIUrl":"10.1093/jamia/ocae109","url":null,"abstract":"<p><strong>Objective: </strong>This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation.</p><p><strong>Materials and methods: </strong>We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedman's test and critical difference diagrams.</p><p><strong>Results: </strong>Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity.</p><p><strong>Conclusion: </strong>L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187433/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141066443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The role of information systems in emergency department decision-making-a literature review.","authors":"Cornelius Born, Romy Schwarz, Timo Phillip Böttcher, Andreas Hein, Helmut Krcmar","doi":"10.1093/jamia/ocae096","DOIUrl":"10.1093/jamia/ocae096","url":null,"abstract":"<p><strong>Objectives: </strong>Healthcare providers employ heuristic and analytical decision-making to navigate the high-stakes environment of the emergency department (ED). Despite the increasing integration of information systems (ISs), research on their efficacy is conflicting. Drawing on related fields, we investigate how timing and mode of delivery influence IS effectiveness. Our objective is to reconcile previous contradictory findings, shedding light on optimal IS design in the ED.</p><p><strong>Materials and methods: </strong>We conducted a systematic review following PRISMA across PubMed, Scopus, and Web of Science. We coded the ISs' timing as heuristic or analytical, their mode of delivery as active for automatic alerts and passive when requiring user-initiated information retrieval, and their effect on process, economic, and clinical outcomes.</p><p><strong>Results: </strong>Our analysis included 83 studies. During early heuristic decision-making, most active interventions were ineffective, while passive interventions generally improved outcomes. In the analytical phase, the effects were reversed. Passive interventions that facilitate information extraction consistently improved outcomes.</p><p><strong>Discussion: </strong>Our findings suggest that the effectiveness of active interventions negatively correlates with the amount of information received during delivery. During early heuristic decision-making, when information overload is high, physicians are unresponsive to alerts and proactively consult passive resources. In the later analytical phases, physicians show increased receptivity to alerts due to decreased diagnostic uncertainty and information quantity. Interventions that limit information lead to positive outcomes, supporting our interpretation.</p><p><strong>Conclusion: </strong>We synthesize our findings into an integrated model that reveals the underlying reasons for conflicting findings from previous reviews and can guide practitioners in designing ISs in the ED.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187435/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141088797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparing natural language processing representations of coded disease sequences for prediction in electronic health records.","authors":"Thomas Beaney, Sneha Jha, Asem Alaa, Alexander Smith, Jonathan Clarke, Thomas Woodcock, Azeem Majeed, Paul Aylin, Mauricio Barahona","doi":"10.1093/jamia/ocae091","DOIUrl":"10.1093/jamia/ocae091","url":null,"abstract":"<p><strong>Objective: </strong>Natural language processing (NLP) algorithms are increasingly being applied to obtain unsupervised representations of electronic health record (EHR) data, but their comparative performance at predicting clinical endpoints remains unclear. Our objective was to compare the performance of unsupervised representations of sequences of disease codes generated by bag-of-words versus sequence-based NLP algorithms at predicting clinically relevant outcomes.</p><p><strong>Materials and methods: </strong>This cohort study used primary care EHRs from 6 286 233 people with Multiple Long-Term Conditions in England. For each patient, an unsupervised vector representation of their time-ordered sequences of diseases was generated using 2 input strategies (212 disease categories versus 9462 diagnostic codes) and different NLP algorithms (Latent Dirichlet Allocation, doc2vec, and 2 transformer models designed for EHRs). We also developed a transformer architecture, named EHR-BERT, incorporating sociodemographic information. We compared the performance of each of these representations (without fine-tuning) as inputs into a logistic classifier to predict 1-year mortality, healthcare use, and new disease diagnosis.</p><p><strong>Results: </strong>Patient representations generated by sequence-based algorithms performed consistently better than bag-of-words methods in predicting clinical endpoints, with the highest performance for EHR-BERT across all tasks, although the absolute improvement was small. Representations generated using disease categories perform similarly to those using diagnostic codes as inputs, suggesting models can equally manage smaller or larger vocabularies for prediction of these outcomes.</p><p><strong>Discussion and conclusion: </strong>Patient representations produced by sequence-based NLP algorithms from sequences of disease codes demonstrate improved predictive content for patient outcomes compared with representations generated by co-occurrence-based algorithms. This suggests transformer models may be useful for generating multi-purpose representations, even without fine-tuning.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187492/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140892335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automating literature screening and curation with applications to computational neuroscience.","authors":"Ziqing Ji, Siyan Guo, Yujie Qiao, Robert A McDougal","doi":"10.1093/jamia/ocae097","DOIUrl":"10.1093/jamia/ocae097","url":null,"abstract":"<p><strong>Objective: </strong>ModelDB (https://modeldb.science) is a discovery platform for computational neuroscience, containing over 1850 published model codes with standardized metadata. These codes were mainly supplied from unsolicited model author submissions, but this approach is inherently limited. For example, we estimate we have captured only around one-third of NEURON models, the most common type of models in ModelDB. To more completely characterize the state of computational neuroscience modeling work, we aim to identify works containing results derived from computational neuroscience approaches and their standardized associated metadata (eg, cell types, research topics).</p><p><strong>Materials and methods: </strong>Known computational neuroscience work from ModelDB and identified neuroscience work queried from PubMed were included in our study. After pre-screening with SPECTER2 (a free document embedding method), GPT-3.5, and GPT-4 were used to identify likely computational neuroscience work and relevant metadata.</p><p><strong>Results: </strong>SPECTER2, GPT-4, and GPT-3.5 demonstrated varied but high abilities in identification of computational neuroscience work. GPT-4 achieved 96.9% accuracy and GPT-3.5 improved from 54.2% to 85.5% through instruction-tuning and Chain of Thought. GPT-4 also showed high potential in identifying relevant metadata annotations.</p><p><strong>Discussion: </strong>Accuracy in identification and extraction might further be improved by dealing with ambiguity of what are computational elements, including more information from papers (eg, Methods section), improving prompts, etc.</p><p><strong>Conclusion: </strong>Natural language processing and large language model techniques can be added to ModelDB to facilitate further model discovery, and will contribute to a more standardized and comprehensive framework for establishing domain-specific resources.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187430/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140900027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Strengthening the use of artificial intelligence within healthcare delivery organizations: balancing regulatory compliance and patient safety.","authors":"Mark P Sendak, Vincent X Liu, Ashley Beecy, David E Vidal, Keo Shaw, Mark A Lifson, Danny Tobey, Alexandra Valladares, Brenna Loufek, Murtaza Mogri, Suresh Balu","doi":"10.1093/jamia/ocae119","DOIUrl":"10.1093/jamia/ocae119","url":null,"abstract":"<p><strong>Objectives: </strong>Surface the urgent dilemma that healthcare delivery organizations (HDOs) face navigating the US Food and Drug Administration (FDA) final guidance on the use of clinical decision support (CDS) software.</p><p><strong>Materials and methods: </strong>We use sepsis as a case study to highlight the patient safety and regulatory compliance tradeoffs that 6129 hospitals in the United States must navigate.</p><p><strong>Results: </strong>Sepsis CDS remains in broad, routine use. There is no commercially available sepsis CDS system that is FDA cleared as a medical device. There is no public disclosure of an HDO turning off sepsis CDS due to regulatory compliance concerns. And there is no public disclosure of FDA enforcement action against an HDO for using sepsis CDS that is not cleared as a medical device.</p><p><strong>Discussion and conclusion: </strong>We present multiple policy interventions that would relieve the current tension to enable HDOs to utilize artificial intelligence to improve patient care while also addressing FDA concerns about product safety, efficacy, and equity.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187419/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141066453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Addressing methodological and logistical challenges of using electronic health record (EHR) data for research.","authors":"Suzanne Bakken","doi":"10.1093/jamia/ocae126","DOIUrl":"10.1093/jamia/ocae126","url":null,"abstract":"","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187415/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141428026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilizing ChatGPT as a scientific reasoning engine to differentiate conflicting evidence and summarize challenges in controversial clinical questions.","authors":"Shiyao Xie, Wenjing Zhao, Guanghui Deng, Guohua He, Na He, Zhenhua Lu, Weihua Hu, Mingming Zhao, Jian Du","doi":"10.1093/jamia/ocae100","DOIUrl":"10.1093/jamia/ocae100","url":null,"abstract":"<p><strong>Objective: </strong>Synthesizing and evaluating inconsistent medical evidence is essential in evidence-based medicine. This study aimed to employ ChatGPT as a sophisticated scientific reasoning engine to identify conflicting clinical evidence and summarize unresolved questions to inform further research.</p><p><strong>Materials and methods: </strong>We evaluated ChatGPT's effectiveness in identifying conflicting evidence and investigated its principles of logical reasoning. An automated framework was developed to generate a PubMed dataset focused on controversial clinical topics. ChatGPT analyzed this dataset to identify consensus and controversy, and to formulate unsolved research questions. Expert evaluations were conducted 1) on the consensus and controversy for factual consistency, comprehensiveness, and potential harm and, 2) on the research questions for relevance, innovation, clarity, and specificity.</p><p><strong>Results: </strong>The gpt-4-1106-preview model achieved a 90% recall rate in detecting inconsistent claim pairs within a ternary assertions setup. Notably, without explicit reasoning prompts, ChatGPT provided sound reasoning for the assertions between claims and hypotheses, based on an analysis grounded in relevance, specificity, and certainty. ChatGPT's conclusions of consensus and controversies in clinical literature were comprehensive and factually consistent. The research questions proposed by ChatGPT received high expert ratings.</p><p><strong>Discussion: </strong>Our experiment implies that, in evaluating the relationship between evidence and claims, ChatGPT considered more detailed information beyond a straightforward assessment of sentimental orientation. This ability to process intricate information and conduct scientific reasoning regarding sentiment is noteworthy, particularly as this pattern emerged without explicit guidance or directives in prompts, highlighting ChatGPT's inherent logical reasoning capabilities.</p><p><strong>Conclusion: </strong>This study demonstrated ChatGPT's capacity to evaluate and interpret scientific claims. Such proficiency can be generalized to broader clinical research literature. ChatGPT effectively aids in facilitating clinical studies by proposing unresolved challenges based on analysis of existing studies. However, caution is advised as ChatGPT's outputs are inferences drawn from the input literature and could be harmful to clinical practice.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187493/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140960509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Streamlining social media information retrieval for public health research with deep learning.","authors":"Yining Hua, Jiageng Wu, Shixu Lin, Minghui Li, Yujie Zhang, Dinah Foer, Siwen Wang, Peilin Zhou, Jie Yang, Li Zhou","doi":"10.1093/jamia/ocae118","DOIUrl":"10.1093/jamia/ocae118","url":null,"abstract":"<p><strong>Objective: </strong>Social media-based public health research is crucial for epidemic surveillance, but most studies identify relevant corpora with keyword-matching. This study develops a system to streamline the process of curating colloquial medical dictionaries. We demonstrate the pipeline by curating a Unified Medical Language System (UMLS)-colloquial symptom dictionary from COVID-19-related tweets as proof of concept.</p><p><strong>Methods: </strong>COVID-19-related tweets from February 1, 2020, to April 30, 2022 were used. The pipeline includes three modules: a named entity recognition module to detect symptoms in tweets; an entity normalization module to aggregate detected entities; and a mapping module that iteratively maps entities to Unified Medical Language System concepts. A random 500 entity samples were drawn from the final dictionary for accuracy validation. Additionally, we conducted a symptom frequency distribution analysis to compare our dictionary to a pre-defined lexicon from previous research.</p><p><strong>Results: </strong>We identified 498 480 unique symptom entity expressions from the tweets. Pre-processing reduces the number to 18 226. The final dictionary contains 38 175 unique expressions of symptoms that can be mapped to 966 UMLS concepts (accuracy = 95%). Symptom distribution analysis found that our dictionary detects more symptoms and is effective at identifying psychiatric disorders like anxiety and depression, often missed by pre-defined lexicons.</p><p><strong>Conclusions: </strong>This study advances public health research by implementing a novel, systematic pipeline for curating symptom lexicons from social media data. The final lexicon's high accuracy, validated by medical professionals, underscores the potential of this methodology to reliably interpret, and categorize vast amounts of unstructured social media data into actionable medical insights across diverse linguistic and regional landscapes.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11187427/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140892377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}