JMIR AI最新文献

筛选
英文 中文
Toward Clinical Generative AI: Conceptual Framework. 迈向临床生成式人工智能:概念框架。
JMIR AI Pub Date : 2024-06-07 DOI: 10.2196/55957
Nicola Luigi Bragazzi, Sergio Garbarino
{"title":"Toward Clinical Generative AI: Conceptual Framework.","authors":"Nicola Luigi Bragazzi, Sergio Garbarino","doi":"10.2196/55957","DOIUrl":"10.2196/55957","url":null,"abstract":"<p><p>Clinical decision-making is a crucial aspect of health care, involving the balanced integration of scientific evidence, clinical judgment, ethical considerations, and patient involvement. This process is dynamic and multifaceted, relying on clinicians' knowledge, experience, and intuitive understanding to achieve optimal patient outcomes through informed, evidence-based choices. The advent of generative artificial intelligence (AI) presents a revolutionary opportunity in clinical decision-making. AI's advanced data analysis and pattern recognition capabilities can significantly enhance the diagnosis and treatment of diseases, processing vast medical data to identify patterns, tailor treatments, predict disease progression, and aid in proactive patient management. However, the incorporation of AI into clinical decision-making raises concerns regarding the reliability and accuracy of AI-generated insights. To address these concerns, 11 \"verification paradigms\" are proposed in this paper, with each paradigm being a unique method to verify the evidence-based nature of AI in clinical decision-making. This paper also frames the concept of \"clinically explainable, fair, and responsible, clinician-, expert-, and patient-in-the-loop AI.\" This model focuses on ensuring AI's comprehensibility, collaborative nature, and ethical grounding, advocating for AI to serve as an augmentative tool, with its decision-making processes being transparent and understandable to clinicians and patients. The integration of AI should enhance, not replace, the clinician's judgment and should involve continuous learning and adaptation based on real-world outcomes and ethical and legal compliance. In conclusion, while generative AI holds immense promise in enhancing clinical decision-making, it is essential to ensure that it produces evidence-based, reliable, and impactful knowledge. Using the outlined paradigms and approaches can help the medical and patient communities harness AI's potential while maintaining high patient care standards.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e55957"},"PeriodicalIF":0.0,"publicationDate":"2024-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11193080/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Understanding the Long Haulers of COVID-19: Mixed Methods Analysis of YouTube Content. 了解 COVID-19 的长途运输者:对 YouTube 内容的混合方法分析。
JMIR AI Pub Date : 2024-06-03 DOI: 10.2196/54501
Alexis Jordan, Albert Park
{"title":"Understanding the Long Haulers of COVID-19: Mixed Methods Analysis of YouTube Content.","authors":"Alexis Jordan, Albert Park","doi":"10.2196/54501","DOIUrl":"10.2196/54501","url":null,"abstract":"<p><strong>Background: </strong>The COVID-19 pandemic had a devastating global impact. In the United States, there were >98 million COVID-19 cases and >1 million resulting deaths. One consequence of COVID-19 infection has been post-COVID-19 condition (PCC). People with this syndrome, colloquially called long haulers, experience symptoms that impact their quality of life. The root cause of PCC and effective treatments remains unknown. Many long haulers have turned to social media for support and guidance.</p><p><strong>Objective: </strong>In this study, we sought to gain a better understanding of the long hauler experience by investigating what has been discussed and how information about long haulers is perceived on social media. We specifically investigated the following: (1) the range of symptoms that are discussed, (2) the ways in which information about long haulers is perceived, (3) informational and emotional support that is available to long haulers, and (4) discourse between viewers and creators. We selected YouTube as our data source due to its popularity and wide range of audience.</p><p><strong>Methods: </strong>We systematically gathered data from 3 different types of content creators: medical sources, news sources, and long haulers. To computationally understand the video content and viewers' reactions, we used Biterm, a topic modeling algorithm created specifically for short texts, to analyze snippets of video transcripts and all top-level comments from the comment section. To triangulate our findings about viewers' reactions, we used the Valence Aware Dictionary and Sentiment Reasoner to conduct sentiment analysis on comments from each type of content creator. We grouped the comments into positive and negative categories and generated topics for these groups using Biterm. We then manually grouped resulting topics into broader themes for the purpose of analysis.</p><p><strong>Results: </strong>We organized the resulting topics into 28 themes across all sources. Examples of medical source transcript themes were Explanations in layman's terms and Biological explanations. Examples of news source transcript themes were Negative experiences and handling the long haul. The 2 long hauler transcript themes were Taking treatments into own hands and Changes to daily life. News sources received a greater share of negative comments. A few themes of these negative comments included Misinformation and disinformation and Issues with the health care system. Similarly, negative long hauler comments were organized into several themes, including Disillusionment with the health care system and Requiring more visibility. In contrast, positive medical source comments captured themes such as Appreciation of helpful content and Exchange of helpful information. In addition to this theme, one positive theme found in long hauler comments was Community building.</p><p><strong>Conclusions: </strong>The results of this study could help public health agencies, po","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e54501"},"PeriodicalIF":0.0,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11184269/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Feasibility of Multimodal Artificial Intelligence Using GPT-4 Vision for the Classification of Middle Ear Disease: Qualitative Study and Validation. 利用 GPT-4 视觉进行中耳疾病分类的多模态人工智能的可行性:定性研究与验证。
JMIR AI Pub Date : 2024-05-31 DOI: 10.2196/58342
Masao Noda, Hidekane Yoshimura, Takuya Okubo, Ryota Koshu, Yuki Uchiyama, Akihiro Nomura, Makoto Ito, Yutaka Takumi
{"title":"Feasibility of Multimodal Artificial Intelligence Using GPT-4 Vision for the Classification of Middle Ear Disease: Qualitative Study and Validation.","authors":"Masao Noda, Hidekane Yoshimura, Takuya Okubo, Ryota Koshu, Yuki Uchiyama, Akihiro Nomura, Makoto Ito, Yutaka Takumi","doi":"10.2196/58342","DOIUrl":"10.2196/58342","url":null,"abstract":"<p><strong>Background: </strong>The integration of artificial intelligence (AI), particularly deep learning models, has transformed the landscape of medical technology, especially in the field of diagnosis using imaging and physiological data. In otolaryngology, AI has shown promise in image classification for middle ear diseases. However, existing models often lack patient-specific data and clinical context, limiting their universal applicability. The emergence of GPT-4 Vision (GPT-4V) has enabled a multimodal diagnostic approach, integrating language processing with image analysis.</p><p><strong>Objective: </strong>In this study, we investigated the effectiveness of GPT-4V in diagnosing middle ear diseases by integrating patient-specific data with otoscopic images of the tympanic membrane.</p><p><strong>Methods: </strong>The design of this study was divided into two phases: (1) establishing a model with appropriate prompts and (2) validating the ability of the optimal prompt model to classify images. In total, 305 otoscopic images of 4 middle ear diseases (acute otitis media, middle ear cholesteatoma, chronic otitis media, and otitis media with effusion) were obtained from patients who visited Shinshu University or Jichi Medical University between April 2010 and December 2023. The optimized GPT-4V settings were established using prompts and patients' data, and the model created with the optimal prompt was used to verify the diagnostic accuracy of GPT-4V on 190 images. To compare the diagnostic accuracy of GPT-4V with that of physicians, 30 clinicians completed a web-based questionnaire consisting of 190 images.</p><p><strong>Results: </strong>The multimodal AI approach achieved an accuracy of 82.1%, which is superior to that of certified pediatricians at 70.6%, but trailing behind that of otolaryngologists at more than 95%. The model's disease-specific accuracy rates were 89.2% for acute otitis media, 76.5% for chronic otitis media, 79.3% for middle ear cholesteatoma, and 85.7% for otitis media with effusion, which highlights the need for disease-specific optimization. Comparisons with physicians revealed promising results, suggesting the potential of GPT-4V to augment clinical decision-making.</p><p><strong>Conclusions: </strong>Despite its advantages, challenges such as data privacy and ethical considerations must be addressed. Overall, this study underscores the potential of multimodal AI for enhancing diagnostic accuracy and improving patient care in otolaryngology. Further research is warranted to optimize and validate this approach in diverse clinical settings.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e58342"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11179042/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Identifying Patterns of Smoking Cessation App Feature Use That Predict Successful Quitting: Secondary Analysis of Experimental Data Leveraging Machine Learning 识别可预测成功戒烟的戒烟应用程序功能使用模式:利用机器学习对实验数据进行二次分析
JMIR AI Pub Date : 2024-05-22 DOI: 10.2196/51756
L. N. Siegel, Kara P Wiseman, Alexandra Budenz, Yvonne M Prutzman
{"title":"Identifying Patterns of Smoking Cessation App Feature Use That Predict Successful Quitting: Secondary Analysis of Experimental Data Leveraging Machine Learning","authors":"L. N. Siegel, Kara P Wiseman, Alexandra Budenz, Yvonne M Prutzman","doi":"10.2196/51756","DOIUrl":"https://doi.org/10.2196/51756","url":null,"abstract":"\u0000 \u0000 Leveraging free smartphone apps can help expand the availability and use of evidence-based smoking cessation interventions. However, there is a need for additional research investigating how the use of different features within such apps impacts their effectiveness.\u0000 \u0000 \u0000 \u0000 We used observational data collected from an experiment of a publicly available smoking cessation app to develop supervised machine learning (SML) algorithms intended to distinguish the app features that promote successful smoking cessation. We then assessed the extent to which patterns of app feature use accounted for variance in cessation that could not be explained by other known predictors of cessation (eg, tobacco use behaviors).\u0000 \u0000 \u0000 \u0000 Data came from an experiment (ClinicalTrials.gov NCT04623736) testing the impacts of incentivizing ecological momentary assessments within the National Cancer Institute’s quitSTART app. Participants’ (N=133) app activity, including every action they took within the app and its corresponding time stamp, was recorded. Demographic and baseline tobacco use characteristics were measured at the start of the experiment, and short-term smoking cessation (7-day point prevalence abstinence) was measured at 4 weeks after baseline. Logistic regression SML modeling was used to estimate participants’ probability of cessation from 28 variables reflecting participants’ use of different app features, assigned experimental conditions, and phone type (iPhone [Apple Inc] or Android [Google]). The SML model was first fit in a training set (n=100) and then its accuracy was assessed in a held-aside test set (n=33). Within the test set, a likelihood ratio test (n=30) assessed whether adding individuals’ SML-predicted probabilities of cessation to a logistic regression model that included demographic and tobacco use (eg, polyuse) variables explained additional variance in 4-week cessation.\u0000 \u0000 \u0000 \u0000 The SML model’s sensitivity (0.67) and specificity (0.67) in the held-aside test set indicated that individuals’ patterns of using different app features predicted cessation with reasonable accuracy. The likelihood ratio test showed that the logistic regression, which included the SML model–predicted probabilities, was statistically equivalent to the model that only included the demographic and tobacco use variables (P=.16).\u0000 \u0000 \u0000 \u0000 Harnessing user data through SML could help determine the features of smoking cessation apps that are most useful. This methodological approach could be applied in future research focusing on smoking cessation app features to inform the development and improvement of smoking cessation apps.\u0000 \u0000 \u0000 \u0000 ClinicalTrials.gov NCT04623736; https://clinicaltrials.gov/study/NCT04623736\u0000","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141110022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Framework for Ranking Machine Learning Predictions of Limited, Multimodal, and Longitudinal Behavioral Passive Sensing Data: Combining User-Agnostic and Personalized Modeling 对有限、多模态和纵向行为被动传感数据进行机器学习预测排名的框架:用户诊断与个性化建模相结合
JMIR AI Pub Date : 2024-05-20 DOI: 10.2196/47805
Tahsin Mullick, Samy Shaaban, A. Radovic, Afsaneh Doryab
{"title":"Framework for Ranking Machine Learning Predictions of Limited, Multimodal, and Longitudinal Behavioral Passive Sensing Data: Combining User-Agnostic and Personalized Modeling","authors":"Tahsin Mullick, Samy Shaaban, A. Radovic, Afsaneh Doryab","doi":"10.2196/47805","DOIUrl":"https://doi.org/10.2196/47805","url":null,"abstract":"\u0000 \u0000 Passive mobile sensing provides opportunities for measuring and monitoring health status in the wild and outside of clinics. However, longitudinal, multimodal mobile sensor data can be small, noisy, and incomplete. This makes processing, modeling, and prediction of these data challenging. The small size of the data set restricts it from being modeled using complex deep learning networks. The current state of the art (SOTA) tackles small sensor data sets following a singular modeling paradigm based on traditional machine learning (ML) algorithms. These opt for either a user-agnostic modeling approach, making the model susceptible to a larger degree of noise, or a personalized approach, where training on individual data alludes to a more limited data set, giving rise to overfitting, therefore, ultimately, having to seek a trade-off by choosing 1 of the 2 modeling approaches to reach predictions.\u0000 \u0000 \u0000 \u0000 The objective of this study was to filter, rank, and output the best predictions for small, multimodal, longitudinal sensor data using a framework that is designed to tackle data sets that are limited in size (particularly targeting health studies that use passive multimodal sensors) and that combines both user agnostic and personalized approaches, along with a combination of ranking strategies to filter predictions.\u0000 \u0000 \u0000 \u0000 In this paper, we introduced a novel ranking framework for longitudinal multimodal sensors (FLMS) to address challenges encountered in health studies involving passive multimodal sensors. Using the FLMS, we (1) built a tensor-based aggregation and ranking strategy for final interpretation, (2) processed various combinations of sensor fusions, and (3) balanced user-agnostic and personalized modeling approaches with appropriate cross-validation strategies. The performance of the FLMS was validated with the help of a real data set of adolescents diagnosed with major depressive disorder for the prediction of change in depression in the adolescent participants.\u0000 \u0000 \u0000 \u0000 Predictions output by the proposed FLMS achieved a 7% increase in accuracy and a 13% increase in recall for the real data set. Experiments with existing SOTA ML algorithms showed an 11% increase in accuracy for the depression data set and how overfitting and sparsity were handled.\u0000 \u0000 \u0000 \u0000 The FLMS aims to fill the gap that currently exists when modeling passive sensor data with a small number of data points. It achieves this through leveraging both user-agnostic and personalized modeling techniques in tandem with an effective ranking strategy to filter predictions.\u0000","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"15 22","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141121044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Improving Risk Prediction of Methicillin-Resistant Staphylococcus aureus Using Machine Learning Methods With Network Features: Retrospective Development Study. 利用具有网络特征的机器学习方法改进耐甲氧西林金黄色葡萄球菌的风险预测:回顾性发展研究。
JMIR AI Pub Date : 2024-05-16 DOI: 10.2196/48067
Methun Kamruzzaman, Jack Heavey, Alexander Song, Matthew Bielskas, Parantapa Bhattacharya, Gregory Madden, Eili Klein, Xinwei Deng, Anil Vullikanti
{"title":"Improving Risk Prediction of Methicillin-Resistant Staphylococcus aureus Using Machine Learning Methods With Network Features: Retrospective Development Study.","authors":"Methun Kamruzzaman, Jack Heavey, Alexander Song, Matthew Bielskas, Parantapa Bhattacharya, Gregory Madden, Eili Klein, Xinwei Deng, Anil Vullikanti","doi":"10.2196/48067","DOIUrl":"10.2196/48067","url":null,"abstract":"<p><strong>Background: </strong>Health care-associated infections due to multidrug-resistant organisms (MDROs), such as methicillin-resistant Staphylococcus aureus (MRSA) and Clostridioides difficile (CDI), place a significant burden on our health care infrastructure.</p><p><strong>Objective: </strong>Screening for MDROs is an important mechanism for preventing spread but is resource intensive. The objective of this study was to develop automated tools that can predict colonization or infection risk using electronic health record (EHR) data, provide useful information to aid infection control, and guide empiric antibiotic coverage.</p><p><strong>Methods: </strong>We retrospectively developed a machine learning model to detect MRSA colonization and infection in undifferentiated patients at the time of sample collection from hospitalized patients at the University of Virginia Hospital. We used clinical and nonclinical features derived from on-admission and throughout-stay information from the patient's EHR data to build the model. In addition, we used a class of features derived from contact networks in EHR data; these network features can capture patients' contacts with providers and other patients, improving model interpretability and accuracy for predicting the outcome of surveillance tests for MRSA. Finally, we explored heterogeneous models for different patient subpopulations, for example, those admitted to an intensive care unit or emergency department or those with specific testing histories, which perform better.</p><p><strong>Results: </strong>We found that the penalized logistic regression performs better than other methods, and this model's performance measured in terms of its receiver operating characteristics-area under the curve score improves by nearly 11% when we use polynomial (second-degree) transformation of the features. Some significant features in predicting MDRO risk include antibiotic use, surgery, use of devices, dialysis, patient's comorbidity conditions, and network features. Among these, network features add the most value and improve the model's performance by at least 15%. The penalized logistic regression model with the same transformation of features also performs better than other models for specific patient subpopulations.</p><p><strong>Conclusions: </strong>Our study shows that MRSA risk prediction can be conducted quite effectively by machine learning methods using clinical and nonclinical features derived from EHR data. Network features are the most predictive and provide significant improvement over prior methods. Furthermore, heterogeneous prediction models for different patient subpopulations enhance the model's performance.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e48067"},"PeriodicalIF":0.0,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11140275/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Sample Size Considerations for Fine-Tuning Large Language Models for Named Entity Recognition Tasks: Methodological Study 微调命名实体识别任务大型语言模型的样本量考虑因素:方法论研究
JMIR AI Pub Date : 2024-05-16 DOI: 10.2196/52095
Zoltan P. Majdik, S. S. Graham, Jade C Shiva Edward, Sabrina N Rodriguez, M. S. Karnes, Jared T Jensen, Joshua B Barbour, Justin F. Rousseau
{"title":"Sample Size Considerations for Fine-Tuning Large Language Models for Named Entity Recognition Tasks: Methodological Study","authors":"Zoltan P. Majdik, S. S. Graham, Jade C Shiva Edward, Sabrina N Rodriguez, M. S. Karnes, Jared T Jensen, Joshua B Barbour, Justin F. Rousseau","doi":"10.2196/52095","DOIUrl":"https://doi.org/10.2196/52095","url":null,"abstract":"\u0000 \u0000 Large language models (LLMs) have the potential to support promising new applications in health informatics. However, practical data on sample size considerations for fine-tuning LLMs to perform specific tasks in biomedical and health policy contexts are lacking.\u0000 \u0000 \u0000 \u0000 This study aims to evaluate sample size and sample selection techniques for fine-tuning LLMs to support improved named entity recognition (NER) for a custom data set of conflicts of interest disclosure statements.\u0000 \u0000 \u0000 \u0000 A random sample of 200 disclosure statements was prepared for annotation. All “PERSON” and “ORG” entities were identified by each of the 2 raters, and once appropriate agreement was established, the annotators independently annotated an additional 290 disclosure statements. From the 490 annotated documents, 2500 stratified random samples in different size ranges were drawn. The 2500 training set subsamples were used to fine-tune a selection of language models across 2 model architectures (Bidirectional Encoder Representations from Transformers [BERT] and Generative Pre-trained Transformer [GPT]) for improved NER, and multiple regression was used to assess the relationship between sample size (sentences), entity density (entities per sentence [EPS]), and trained model performance (F1-score). Additionally, single-predictor threshold regression models were used to evaluate the possibility of diminishing marginal returns from increased sample size or entity density.\u0000 \u0000 \u0000 \u0000 Fine-tuned models ranged in topline NER performance from F1-score=0.79 to F1-score=0.96 across architectures. Two-predictor multiple linear regression models were statistically significant with multiple R2 ranging from 0.6057 to 0.7896 (all P<.001). EPS and the number of sentences were significant predictors of F1-scores in all cases ( P<.001), except for the GPT-2_large model, where EPS was not a significant predictor (P=.184). Model thresholds indicate points of diminishing marginal return from increased training data set sample size measured by the number of sentences, with point estimates ranging from 439 sentences for RoBERTa_large to 527 sentences for GPT-2_large. Likewise, the threshold regression models indicate a diminishing marginal return for EPS with point estimates between 1.36 and 1.38.\u0000 \u0000 \u0000 \u0000 Relatively modest sample sizes can be used to fine-tune LLMs for NER tasks applied to biomedical text, and training data entity density should representatively approximate entity density in production data. Training data quality and a model architecture’s intended use (text generation vs text processing or classification) may be as, or more, important as training data volume and model parameter size.\u0000","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"48 20","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140970862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Comparison of Personalized and Generalized Approaches to Emotion Recognition Using Consumer Wearable Devices: Machine Learning Study. 使用消费类可穿戴设备进行情感识别的个性化和通用化方法比较:机器学习研究。
JMIR AI Pub Date : 2024-05-10 DOI: 10.2196/52171
Joe Li, Peter Washington
{"title":"A Comparison of Personalized and Generalized Approaches to Emotion Recognition Using Consumer Wearable Devices: Machine Learning Study.","authors":"Joe Li, Peter Washington","doi":"10.2196/52171","DOIUrl":"10.2196/52171","url":null,"abstract":"<p><strong>Background: </strong>There are a wide range of potential adverse health effects, ranging from headaches to cardiovascular disease, associated with long-term negative emotions and chronic stress. Because many indicators of stress are imperceptible to observers, the early detection of stress remains a pressing medical need, as it can enable early intervention. Physiological signals offer a noninvasive method for monitoring affective states and are recorded by a growing number of commercially available wearables.</p><p><strong>Objective: </strong>We aim to study the differences between personalized and generalized machine learning models for 3-class emotion classification (neutral, stress, and amusement) using wearable biosignal data.</p><p><strong>Methods: </strong>We developed a neural network for the 3-class emotion classification problem using data from the Wearable Stress and Affect Detection (WESAD) data set, a multimodal data set with physiological signals from 15 participants. We compared the results between a participant-exclusive generalized, a participant-inclusive generalized, and a personalized deep learning model.</p><p><strong>Results: </strong>For the 3-class classification problem, our personalized model achieved an average accuracy of 95.06% and an F<sub>1</sub>-score of 91.71%; our participant-inclusive generalized model achieved an average accuracy of 66.95% and an F<sub>1</sub>-score of 42.50%; and our participant-exclusive generalized model achieved an average accuracy of 67.65% and an F<sub>1</sub>-score of 43.05%.</p><p><strong>Conclusions: </strong>Our results emphasize the need for increased research in personalized emotion recognition models given that they outperform generalized models in certain contexts. We also demonstrate that personalized machine learning models for emotion classification are viable and can achieve high performance.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e52171"},"PeriodicalIF":0.0,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11127131/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Online Health Search Via Multidimensional Information Quality Assessment Based on Deep Language Models: Algorithm Development and Validation. 基于深度语言模型的多维信息质量评估在线健康搜索:算法开发与验证
JMIR AI Pub Date : 2024-05-02 DOI: 10.2196/42630
Boya Zhang, Nona Naderi, Rahul Mishra, Douglas Teodoro
{"title":"Online Health Search Via Multidimensional Information Quality Assessment Based on Deep Language Models: Algorithm Development and Validation.","authors":"Boya Zhang, Nona Naderi, Rahul Mishra, Douglas Teodoro","doi":"10.2196/42630","DOIUrl":"10.2196/42630","url":null,"abstract":"<p><strong>Background: </strong>Widespread misinformation in web resources can lead to serious implications for individuals seeking health advice. Despite that, information retrieval models are often focused only on the query-document relevance dimension to rank results.</p><p><strong>Objective: </strong>We investigate a multidimensional information quality retrieval model based on deep learning to enhance the effectiveness of online health care information search results.</p><p><strong>Methods: </strong>In this study, we simulated online health information search scenarios with a topic set of 32 different health-related inquiries and a corpus containing 1 billion web documents from the April 2019 snapshot of Common Crawl. Using state-of-the-art pretrained language models, we assessed the quality of the retrieved documents according to their usefulness, supportiveness, and credibility dimensions for a given search query on 6030 human-annotated, query-document pairs. We evaluated this approach using transfer learning and more specific domain adaptation techniques.</p><p><strong>Results: </strong>In the transfer learning setting, the usefulness model provided the largest distinction between help- and harm-compatible documents, with a difference of +5.6%, leading to a majority of helpful documents in the top 10 retrieved. The supportiveness model achieved the best harm compatibility (+2.4%), while the combination of usefulness, supportiveness, and credibility models achieved the largest distinction between help- and harm-compatibility on helpful topics (+16.9%). In the domain adaptation setting, the linear combination of different models showed robust performance, with help-harm compatibility above +4.4% for all dimensions and going as high as +6.8%.</p><p><strong>Conclusions: </strong>These results suggest that integrating automatic ranking models created for specific information quality dimensions can increase the effectiveness of health-related information retrieval. Thus, our approach could be used to enhance searches made by individuals seeking online health information.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e42630"},"PeriodicalIF":0.0,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099810/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Evaluating the Diagnostic Performance of Symptom Checkers: Clinical Vignette Study. 评估症状检查器的诊断性能:临床小故事研究
JMIR AI Pub Date : 2024-04-29 DOI: 10.2196/46875
Mohammad Hammoud, Shahd Douglas, Mohamad Darmach, Sara Alawneh, Swapnendu Sanyal, Youssef Kanbour
{"title":"Evaluating the Diagnostic Performance of Symptom Checkers: Clinical Vignette Study.","authors":"Mohammad Hammoud, Shahd Douglas, Mohamad Darmach, Sara Alawneh, Swapnendu Sanyal, Youssef Kanbour","doi":"10.2196/46875","DOIUrl":"10.2196/46875","url":null,"abstract":"<p><strong>Background: </strong>Medical self-diagnostic tools (or symptom checkers) are becoming an integral part of digital health and our daily lives, whereby patients are increasingly using them to identify the underlying causes of their symptoms. As such, it is essential to rigorously investigate and comprehensively report the diagnostic performance of symptom checkers using standard clinical and scientific approaches.</p><p><strong>Objective: </strong>This study aims to evaluate and report the accuracies of a few known and new symptom checkers using a standard and transparent methodology, which allows the scientific community to cross-validate and reproduce the reported results, a step much needed in health informatics.</p><p><strong>Methods: </strong>We propose a 4-stage experimentation methodology that capitalizes on the standard clinical vignette approach to evaluate 6 symptom checkers. To this end, we developed and peer-reviewed 400 vignettes, each approved by at least 5 out of 7 independent and experienced primary care physicians. To establish a frame of reference and interpret the results of symptom checkers accordingly, we further compared the best-performing symptom checker against 3 primary care physicians with an average experience of 16.6 (SD 9.42) years. To measure accuracy, we used 7 standard metrics, including M1 as a measure of a symptom checker's or a physician's ability to return a vignette's main diagnosis at the top of their differential list, F<sub>1</sub>-score as a trade-off measure between recall and precision, and Normalized Discounted Cumulative Gain (NDCG) as a measure of a differential list's ranking quality, among others.</p><p><strong>Results: </strong>The diagnostic accuracies of the 6 tested symptom checkers vary significantly. For instance, the differences in the M1, F<sub>1</sub>-score, and NDCG results between the best-performing and worst-performing symptom checkers or ranges were 65.3%, 39.2%, and 74.2%, respectively. The same was observed among the participating human physicians, whereby the M1, F<sub>1</sub>-score, and NDCG ranges were 22.8%, 15.3%, and 21.3%, respectively. When compared against each other, physicians outperformed the best-performing symptom checker by an average of 1.2% using F<sub>1</sub>-score, whereas the best-performing symptom checker outperformed physicians by averages of 10.2% and 25.1% using M1 and NDCG, respectively.</p><p><strong>Conclusions: </strong>The performance variation between symptom checkers is substantial, suggesting that symptom checkers cannot be treated as a single entity. On a different note, the best-performing symptom checker was an artificial intelligence (AI)-based one, shedding light on the promise of AI in improving the diagnostic capabilities of symptom checkers, especially as AI keeps advancing exponentially.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"3 ","pages":"e46875"},"PeriodicalIF":0.0,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11091811/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141322100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信