{"title":"Extraction of Substance Use Information From Clinical Notes: Generative Pretrained Transformer-Based Investigation.","authors":"Fatemeh Shah-Mohammadi, Joseph Finkelstein","doi":"10.2196/56243","DOIUrl":"10.2196/56243","url":null,"abstract":"<p><strong>Background: </strong>Understanding the multifaceted nature of health outcomes requires a comprehensive examination of the social, economic, and environmental determinants that shape individual well-being. Among these determinants, behavioral factors play a crucial role, particularly the consumption patterns of psychoactive substances, which have important implications for public health. The Global Burden of Disease Study shows a growing impact in disability-adjusted life years due to substance use. The successful identification of patients' substance use information equips clinical care teams to address substance-related issues more effectively, enabling targeted support and ultimately improving patient outcomes.</p><p><strong>Objective: </strong>Traditional natural language processing methods face limitations in accurately parsing diverse clinical language associated with substance use. Large language models offer promise in overcoming these challenges by adapting to diverse language patterns. This study investigates the application of the generative pretrained transformer (GPT) model, specifically GPT-3.5, for extracting tobacco, alcohol, and substance use information from patient discharge summaries in zero-shot and few-shot learning settings. This study contributes to the evolving landscape of health care informatics by showcasing the potential of advanced language models in extracting nuanced information critical for enhancing patient care.</p><p><strong>Methods: </strong>The main data source for analysis in this paper is the Medical Information Mart for Intensive Care III data set. Among all notes in this data set, we focused on discharge summaries. 
Prompt engineering was undertaken, involving an iterative exploration of diverse prompts. Leveraging carefully curated examples and refined prompts, we investigated the model's proficiency through zero-shot as well as few-shot prompting strategies.</p><p><strong>Results: </strong>Results show GPT's varying effectiveness in identifying mentions of tobacco, alcohol, and substance use across learning scenarios. Zero-shot learning showed high accuracy in identifying substance use, whereas few-shot learning reduced accuracy but improved the identification of substance use status, enhancing recall and F<sub>1</sub>-score at the expense of lower precision.</p><p><strong>Conclusions: </strong>The excellence of zero-shot learning in precisely extracting text spans mentioning substance use demonstrates its effectiveness in situations in which comprehensive recall is important. Conversely, few-shot learning offers advantages when accurately determining the status of substance use is the primary focus, even if it involves a trade-off in precision. The results contribute to the enhancement of early detection and intervention strategies, help tailor treatment plans with greater precision, and ultimately contribute to a holistic understanding of patient health profiles. By integrating these artificial intelligence-driven method","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":" ","pages":"e56243"},"PeriodicalIF":3.1,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11369538/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141735797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating and Enhancing the Fitness-for-Purpose of Electronic Health Record Data: Qualitative Study on Current Practices and Pathway to an Automated Approach Within the Medical Informatics for Research and Care in University Medicine Consortium.","authors":"Gaetan Kamdje Wabo, Preetha Moorthy, Fabian Siegel, Susanne A Seuchter, Thomas Ganslandt","doi":"10.2196/57153","DOIUrl":"10.2196/57153","url":null,"abstract":"<p><strong>Background: </strong>Leveraging electronic health record (EHR) data for clinical or research purposes heavily depends on data fitness. However, there is a lack of standardized frameworks to evaluate EHR data suitability, leading to inconsistent quality in data use projects (DUPs). This research focuses on the Medical Informatics for Research and Care in University Medicine (MIRACUM) Data Integration Centers (DICs) and examines empirical practices on assessing and automating the fitness-for-purpose of clinical data in German DIC settings.</p><p><strong>Objective: </strong>The study aims (1) to capture and discuss how MIRACUM DICs evaluate and enhance the fitness-for-purpose of observational health care data and examine the alignment with existing recommendations and (2) to identify the requirements for designing and implementing a computer-assisted solution to evaluate EHR data fitness within MIRACUM DICs.</p><p><strong>Methods: </strong>A qualitative approach was followed using an open-ended survey across DICs of 10 German university hospitals affiliated with MIRACUM. Data were analyzed using thematic analysis following an inductive qualitative method.</p><p><strong>Results: </strong>All 10 MIRACUM DICs participated, with 17 participants revealing various approaches to assessing data fitness, including the 4-eyes principle and data consistency checks such as cross-system data value comparison. Common practices included a DUP-related feedback loop on data fitness and using self-designed dashboards for monitoring. 
Most experts had a computer science background and a master's degree, suggesting strong technological proficiency but potentially lacking clinical or statistical expertise. Nine key requirements for a computer-assisted solution were identified, including flexibility, understandability, extendibility, and practicability. Participants used heterogeneous data repositories for evaluating data quality criteria and practical strategies to communicate with research and clinical teams.</p><p><strong>Conclusions: </strong>The study identifies gaps between current practices in MIRACUM DICs and existing recommendations, offering insights into the complexities of assessing and reporting clinical data fitness. Additionally, a tripartite modular framework for fitness-for-purpose assessment was introduced to streamline the forthcoming implementation. It provides valuable input for developing and integrating an automated solution across multiple locations. This may include statistical comparisons to advanced machine learning algorithms for operationalizing frameworks such as the 3×3 data quality assessment framework. These findings provide foundational evidence for future design and implementation studies to enhance data quality assessments for specific DUPs in observational health care settings.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e57153"},"PeriodicalIF":3.1,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11369535/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142001479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Characteristics of Existing Online Patient Navigation Interventions: Scoping Review.","authors":"Meghan Marsh, Syeda Rafia Shah, Sarah E P Munce, Laure Perrier, Tin-Suet Joan Lee, Tracey J F Colella, Kristina Marie Kokorelias","doi":"10.2196/50307","DOIUrl":"10.2196/50307","url":null,"abstract":"<p><strong>Background: </strong>Patient navigation interventions (PNIs) can provide personalized support and promote appropriate coordination or continuation of health and social care services. Online PNIs have demonstrated excellent potential for improving patient knowledge, transition readiness, self-efficacy, and use of services. However, the characteristics (ie, intervention type, mode of delivery, duration, frequency, outcomes and outcome measures, underlying theories or mechanisms of change of the intervention, and impact) of existing online PNIs to support the health and social needs of individuals with illness remain unclear.</p><p><strong>Objective: </strong>This scoping review of the existing literature aims to identify the characteristics of existing online PNIs reported in the literature.</p><p><strong>Methods: </strong>A scoping review based on the guidelines outlined in the Joanna Briggs Institute framework was conducted. A search for peer-reviewed literature published between 1989 and 2022 on online PNIs was conducted using MEDLINE, CINAHL, Embase, PsycInfo, and Cochrane Library databases. Two independent reviewers conducted 2 levels of screening. Data abstraction was conducted to outline key study characteristics (eg, study design, population, and intervention characteristics). The data were analyzed using descriptive statistics and qualitative content analysis.</p><p><strong>Results: </strong>A total of 100 studies met the inclusion criteria. Our findings indicate that a variety of study designs are used to describe and evaluate online PNIs, with literature being published between 2003 and 2022 in Western countries. 
Of these studies, 39 (39%) were randomized controlled trials. In addition, we noticed an increase in reported online PNIs since 2019. The majority of studies involved White females with a diagnosis of cancer, and a lack of participants aged 70 years or older was observed. Most online PNIs provide support through navigation, self-management and lifestyle changes, counseling, coaching, education, or a combination of these. Variation was noted in terms of mode of delivery, duration, and frequency. Only a small number of studies described theoretical frameworks or change mechanisms to guide intervention.</p><p><strong>Conclusions: </strong>To our knowledge, this is the first review to comprehensively synthesize the existing literature on online PNIs, by focusing on the characteristics of interventions and studies in this area. Inconsistency in reporting the country of publication, population characteristics, duration and frequency of interventions, and a lack of the use of underlying theories and working mechanisms to inform intervention development, provide guidance for the reporting of future online PNIs.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e50307"},"PeriodicalIF":3.1,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11369544/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142006045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bridging Real-World Data Gaps: Connecting Dots Across 10 Asian Countries.","authors":"Guilherme Silva Julian, Wen-Yi Shau, Hsu-Wen Chou, Sajita Setia","doi":"10.2196/58548","DOIUrl":"10.2196/58548","url":null,"abstract":"<p>The economic trend and the health care landscape are rapidly evolving across Asia. Effective real-world data (RWD) for regulatory and clinical decision-making is a crucial milestone associated with this evolution. This necessitates a critical evaluation of RWD generation within distinct nations for the use of various RWD warehouses in the generation of real-world evidence (RWE). In this article, we outline the RWD generation trends for 2 contrasting nation archetypes: \"Solo Scholars\" (nations with relatively self-sufficient RWD research systems) and \"Global Collaborators\" (countries largely reliant on international infrastructures for RWD generation). The key trends and patterns in RWD generation, country-specific insights into the predominant databases used in each country to produce RWE, and insights into the broader landscape of RWD database use across these countries are discussed. In conclusion, the data point to the heterogeneous nature of RWD generation practices across 10 different Asian nations and advocate for strategic enhancements in data harmonization. The evidence highlights the imperative for improved database integration and the establishment of standardized protocols and infrastructure for leveraging electronic medical records (EMR) in streamlining RWD acquisition. The clinical data analysis and reporting system of Hong Kong is an excellent example of a successful EMR system that showcases the capacity of integrated robust EMR platforms to consolidate and produce diverse RWE. This, in turn, can potentially reduce the necessity for reliance on numerous condition-specific local and global registries or limited and largely unavailable medical insurance or claims databases in most Asian nations. 
Linking health technology assessment processes with open data initiatives such as the Observational Medical Outcomes Partnership Common Data Model and the Observational Health Data Sciences and Informatics could enable the leveraging of global data resources to inform local decision-making. Advancing such initiatives is crucial for reinforcing health care frameworks in resource-limited settings and advancing toward cohesive, evidence-driven health care policy and improved patient outcomes in the region.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":" ","pages":"e58548"},"PeriodicalIF":3.1,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362708/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of AI-Driven LabTest Checker for Diagnostic Accuracy and Safety: Prospective Cohort Study.","authors":"Dawid Szumilas, Anna Ochmann, Katarzyna Zięba, Bartłomiej Bartoszewicz, Anna Kubrak, Sebastian Makuch, Siddarth Agrawal, Grzegorz Mazur, Jerzy Chudek","doi":"10.2196/57162","DOIUrl":"10.2196/57162","url":null,"abstract":"<p><strong>Background: </strong>In recent years, the implementation of artificial intelligence (AI) in health care has been progressively transforming medical fields, with the use of clinical decision support systems (CDSSs) as a notable application. Laboratory tests are vital for accurate diagnoses, but the increasing reliance on them presents challenges. The need for effective strategies for managing laboratory test interpretation is evident from the millions of monthly searches on test results' significance. However, as the potential role of CDSSs in laboratory diagnostics gains significance, more research is needed to explore this area.</p><p><strong>Objective: </strong>The primary objective of our study was to assess the accuracy and safety of LabTest Checker (LTC), a CDSS designed to support medical diagnoses by analyzing both laboratory test results and patients' medical histories.</p><p><strong>Methods: </strong>This cohort study used a prospective data collection approach. A total of 101 patients aged ≥18 years, in stable condition, and requiring comprehensive diagnosis were enrolled. A panel of blood laboratory tests was conducted for each participant. Participants used LTC for test result interpretation. The accuracy and safety of the tool were assessed by comparing AI-generated suggestions to experienced doctor (consultant) recommendations, which are considered the gold standard.</p><p><strong>Results: </strong>The system achieved a 74.3% accuracy and 100% sensitivity for emergency safety and 92.3% sensitivity for urgent cases. 
It potentially reduced unnecessary medical visits by 41.6% (42/101) and achieved an 82.9% accuracy in identifying underlying pathologies.</p><p><strong>Conclusions: </strong>This study underscores the transformative potential of AI-based CDSSs in laboratory diagnostics, contributing to enhanced patient care, efficient health care systems, and improved medical outcomes. LTC's performance evaluation highlights the advancements in AI's role in laboratory medicine.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e57162"},"PeriodicalIF":3.1,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11337233/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141989620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transforming Primary Care Data Into the Observational Medical Outcomes Partnership Common Data Model: Development and Usability Study.","authors":"Mathilde Fruchart, Paul Quindroit, Chloé Jacquemont, Jean-Baptiste Beuscart, Matthieu Calafiore, Antoine Lamer","doi":"10.2196/49542","DOIUrl":"10.2196/49542","url":null,"abstract":"<p><strong>Background: </strong>Patient-monitoring software generates a large amount of data that can be reused for clinical audits and scientific research. The Observational Health Data Sciences and Informatics (OHDSI) consortium developed the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) to standardize electronic health record data and promote large-scale observational and longitudinal research.</p><p><strong>Objective: </strong>This study aimed to transform primary care data into the OMOP CDM format.</p><p><strong>Methods: </strong>We extracted primary care data from electronic health records at a multidisciplinary health center in Wattrelos, France. We performed structural mapping between the design of our local primary care database and the OMOP CDM tables and fields. Local French vocabularies concepts were mapped to OHDSI standard vocabularies. To validate the implementation of primary care data into the OMOP CDM format, we applied a set of queries. A practical application was achieved through the development of a dashboard.</p><p><strong>Results: </strong>Data from 18,395 patients were implemented into the OMOP CDM, corresponding to 592,226 consultations over a period of 20 years. A total of 18 OMOP CDM tables were implemented. A total of 17 local vocabularies were identified as being related to primary care and corresponded to patient characteristics (sex, location, year of birth, and race), units of measurement, biometric measures, laboratory test results, medical histories, and drug prescriptions. During semantic mapping, 10,221 primary care concepts were mapped to standard OHDSI concepts. 
Five queries were used to validate the OMOP CDM by comparing the results obtained after the completion of the transformations with the results obtained in the source software. Lastly, a prototype dashboard was developed to visualize the activity of the health center, the laboratory test results, and the drug prescription data.</p><p><strong>Conclusions: </strong>Primary care data from a French health care facility have been implemented into the OMOP CDM format. Data concerning demographics, units, measurements, and primary care consultation steps were already available in OHDSI vocabularies. Laboratory test results and drug prescription data were mapped to available vocabularies and structured in the final model. A dashboard application provided health care professionals with feedback on their practice.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e49542"},"PeriodicalIF":3.1,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11337138/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141977388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognition of Daily Activities in Adults With Wearable Inertial Sensors: Deep Learning Methods Study.","authors":"Alberto De Ramón Fernández, Daniel Ruiz Fernández, Miguel García Jaén, Juan M Cortell-Tormo","doi":"10.2196/57097","DOIUrl":"10.2196/57097","url":null,"abstract":"<p><strong>Background: </strong>Activities of daily living (ADL) are essential for independence and personal well-being, reflecting an individual's functional status. Impairment in executing these tasks can limit autonomy and negatively affect quality of life. The assessment of physical function during ADL is crucial for the prevention and rehabilitation of movement limitations. Still, its traditional evaluation based on subjective observation has limitations in precision and objectivity.</p><p><strong>Objective: </strong>The primary objective of this study is to use innovative technology, specifically wearable inertial sensors combined with artificial intelligence techniques, to objectively and accurately evaluate human performance in ADL. It is proposed to overcome the limitations of traditional methods by implementing systems that allow dynamic and noninvasive monitoring of movements during daily activities. The approach seeks to provide an effective tool for the early detection of dysfunctions and the personalization of treatment and rehabilitation plans, thus promoting an improvement in the quality of life of individuals.</p><p><strong>Methods: </strong>To monitor movements, wearable inertial sensors were developed, which include accelerometers and triaxial gyroscopes. The developed sensors were used to create a proprietary database with 6 movements related to the shoulder and 3 related to the back. We registered 53,165 activity records in the database (consisting of accelerometer and gyroscope measurements), which were reduced to 52,600 after processing to remove null or abnormal values. 
Finally, 4 deep learning (DL) models were created by combining various processing layers to explore different approaches in ADL recognition.</p><p><strong>Results: </strong>The results revealed high performance of the 4 proposed models, with levels of accuracy, precision, recall, and F<sub>1</sub>-score ranging between 95% and 97% for all classes and an average loss of 0.10. These results indicate the great capacity of the models to accurately identify a variety of activities, with a good balance between precision and recall. Both the convolutional and bidirectional approaches achieved slightly superior results, although the bidirectional model reached convergence in fewer epochs.</p><p><strong>Conclusions: </strong>The DL models implemented have demonstrated solid performance, indicating an effective ability to identify and classify various daily activities related to the shoulder and lumbar region. These results were achieved with minimal sensorization that is noninvasive and practically imperceptible to the user, which does not affect their daily routine and promotes acceptance and adherence to continuous monitoring, thus improving the reliability of the data collected. 
This research has the potential to have a significant impact on the clinical evaluation and rehabilitation of patients with movement limitations, by providing an objective and ","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e57097"},"PeriodicalIF":3.1,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11344189/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141910167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing ChatGPT as a Medical Consultation Assistant for Chronic Hepatitis B: Cross-Language Study of English and Chinese.","authors":"Yijie Wang, Yining Chen, Jifang Sheng","doi":"10.2196/56426","DOIUrl":"10.2196/56426","url":null,"abstract":"<p><strong>Background: </strong>Chronic hepatitis B (CHB) imposes substantial economic and social burdens globally. The management of CHB involves intricate monitoring and adherence challenges, particularly in regions like China, where a high prevalence of CHB intersects with health care resource limitations. This study explores the potential of ChatGPT-3.5, an emerging artificial intelligence (AI) assistant, to address these complexities. With notable capabilities in medical education and practice, ChatGPT-3.5's role is examined in managing CHB, particularly in regions with distinct health care landscapes.</p><p><strong>Objective: </strong>This study aimed to uncover insights into ChatGPT-3.5's potential and limitations in delivering personalized medical consultation assistance for CHB patients across diverse linguistic contexts.</p><p><strong>Methods: </strong>Questions sourced from published guidelines, online CHB communities, and search engines in English and Chinese were refined, translated, and compiled into 96 inquiries. Subsequently, these questions were presented to both ChatGPT-3.5 and ChatGPT-4.0 in independent dialogues. The responses were then evaluated by senior physicians, focusing on informativeness, emotional management, consistency across repeated inquiries, and cautionary statements regarding medical advice. Additionally, a true-or-false questionnaire was employed to further discern the variance in information accuracy for closed questions between ChatGPT-3.5 and ChatGPT-4.0.</p><p><strong>Results: </strong>Over half of the responses (228/370, 61.6%) from ChatGPT-3.5 were considered comprehensive. In contrast, ChatGPT-4.0 exhibited a higher percentage at 74.5% (172/222; P<.001). 
Notably, superior performance was evident in English, particularly in terms of informativeness and consistency across repeated queries. However, deficiencies were identified in emotional management guidance, with only 3.2% (6/186) in ChatGPT-3.5 and 8.1% (15/154) in ChatGPT-4.0 (P=.04). ChatGPT-3.5 included a disclaimer in 10.8% (24/222) of responses, while ChatGPT-4.0 included a disclaimer in 13.1% (29/222) of responses (P=.46). When responding to true-or-false questions, ChatGPT-4.0 achieved an accuracy rate of 93.3% (168/180), significantly surpassing ChatGPT-3.5's accuracy rate of 65.0% (117/180) (P<.001).</p><p><strong>Conclusions: </strong>In this study, ChatGPT demonstrated basic capabilities as a medical consultation assistant for CHB management. The choice of working language for ChatGPT-3.5 was considered a potential factor influencing its performance, particularly in the use of terminology and colloquial language, and this potentially affects its applicability within specific target populations. However, as an updated model, ChatGPT-4.0 exhibits improved information processing capabilities, overcoming the language impact on information accuracy. This suggests that the implications of model advancement on applications need to be considered whe","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e56426"},"PeriodicalIF":3.1,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11342014/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141903716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pediatric Sedation Assessment and Management System (PSAMS) for Pediatric Sedation in China: Development and Implementation Report.","authors":"Ziyu Zhu, Lan Liu, Min Du, Mao Ye, Ximing Xu, Ying Xu","doi":"10.2196/53427","DOIUrl":"10.2196/53427","url":null,"abstract":"<p><strong>Background: </strong>Recently, the growing demand for pediatric sedation services outside the operating room has imposed a heavy burden on pediatric centers in China. There is an urgent need to develop a novel system for improved sedation services.</p><p><strong>Objective: </strong>This study aimed to develop and implement a computerized system, the Pediatric Sedation Assessment and Management System (PSAMS), to streamline pediatric sedation services at a major children's hospital in Southwest China.</p><p><strong>Methods: </strong>PSAMS was designed to reflect the actual workflow of pediatric sedation. It consists of 3 main components: server-hosted software; client applications on tablets and computers; and specialized devices like gun-type scanners, desktop label printers, and pulse oximeters. With the participation of a multidisciplinary team, PSAMS was developed and refined during its application in the sedation process. This study analyzed data from the first 2 years after the system's deployment.</p><p><strong>Results: </strong>From January 2020 to December 2021, a total of 127,325 sedations were performed on 85,281 patients using the PSAMS database. Besides basic variables imported from Hospital Information Systems (HIS), the PSAMS database currently contains 33 additional variables that capture comprehensive information from presedation assessment to postprocedural recovery. The recorded data from PSAMS indicate a one-time sedation success rate of 97.1% (50,752/52,282) in 2020 and 97.5% (73,184/75,043) in 2021. 
The observed adverse events rate was 3.5% (95% CI 3.4%-3.7%) in 2020 and 2.8% (95% CI 2.7%-2.9%) in 2021.</p><p><strong>Conclusions: </strong>PSAMS streamlined the entire sedation workflow, reduced the burden of data collection, and laid a foundation for future cooperation of multiple pediatric health care centers.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e53427"},"PeriodicalIF":3.1,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11322794/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141903717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Claude 3 Opus and ChatGPT With GPT-4 in Dermoscopic Image Analysis for Melanoma Diagnosis: Comparative Performance Analysis.","authors":"Xu Liu, Chaoli Duan, Min-Kyu Kim, Lu Zhang, Eunjin Jee, Beenu Maharjan, Yuwei Huang, Dan Du, Xian Jiang","doi":"10.2196/59273","DOIUrl":"10.2196/59273","url":null,"abstract":"<p><strong>Background: </strong>Recent advancements in artificial intelligence (AI) and large language models (LLMs) have shown potential in medical fields, including dermatology. With the introduction of image analysis capabilities in LLMs, their application in dermatological diagnostics has garnered significant interest. These capabilities are enabled by the integration of computer vision techniques into the underlying architecture of LLMs.</p><p><strong>Objective: </strong>This study aimed to compare the diagnostic performance of Claude 3 Opus and ChatGPT with GPT-4 in analyzing dermoscopic images for melanoma detection, providing insights into their strengths and limitations.</p><p><strong>Methods: </strong>We randomly selected 100 histopathology-confirmed dermoscopic images (50 malignant, 50 benign) from the International Skin Imaging Collaboration (ISIC) archive using a computer-generated randomization process. The ISIC archive was chosen due to its comprehensive and well-annotated collection of dermoscopic images, ensuring a diverse and representative sample. Images were included if they were dermoscopic images of melanocytic lesions with histopathologically confirmed diagnoses. Each model was given the same prompt, instructing it to provide the top 3 differential diagnoses for each image, ranked by likelihood. Primary diagnosis accuracy, accuracy of the top 3 differential diagnoses, and malignancy discrimination ability were assessed. 
The McNemar test was chosen to compare the diagnostic performance of the 2 models, as it is suitable for analyzing paired nominal data.</p><p><strong>Results: </strong>In the primary diagnosis, Claude 3 Opus achieved 54.9% sensitivity (95% CI 44.08%-65.37%), 57.14% specificity (95% CI 46.31%-67.46%), and 56% accuracy (95% CI 46.22%-65.42%), while ChatGPT demonstrated 56.86% sensitivity (95% CI 45.99%-67.21%), 38.78% specificity (95% CI 28.77%-49.59%), and 48% accuracy (95% CI 38.37%-57.75%). The McNemar test showed no significant difference between the 2 models (P=.17). For the top 3 differential diagnoses, Claude 3 Opus and ChatGPT included the correct diagnosis in 76% (95% CI 66.33%-83.77%) and 78% (95% CI 68.46%-85.45%) of cases, respectively. The McNemar test showed no significant difference (P=.56). In malignancy discrimination, Claude 3 Opus outperformed ChatGPT with 47.06% sensitivity, 81.63% specificity, and 64% accuracy, compared to 45.1%, 42.86%, and 44%, respectively. The McNemar test showed a significant difference (P<.001). Claude 3 Opus had an odds ratio of 3.951 (95% CI 1.685-9.263) in discriminating malignancy, while ChatGPT-4 had an odds ratio of 0.616 (95% CI 0.297-1.278).</p><p><strong>Conclusions: </strong>Our study highlights the potential of LLMs in assisting dermatologists but also reveals their limitations. Both models made errors in diagnosing melanoma and benign lesions. 
These findings underscore the need for developing robust, transparent, and clinically validated AI models through","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e59273"},"PeriodicalIF":3.1,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11336503/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141898999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}