{"title":"Construction and Application of an Information Closed-Loop Management System for Maternal and Neonatal Access and Exit Rooms: Non Randomized Controlled Trial.","authors":"Shafeng Jia, Naifeng Zhu, Jia Liu, Niankai Cheng, Ling Jiang, Jing Yang","doi":"10.2196/66451","DOIUrl":"https://doi.org/10.2196/66451","url":null,"abstract":"<p><strong>Background: </strong>Traditional management methods can no longer meet the demand for efficient and accurate neonatal care. There is a need for an information-based and intelligent management system.</p><p><strong>Objective: </strong>This study aimed to construct an information closed-loop management system to improve the accuracy of identification in mother-infant rooming-in care units and enhance the efficiency of infant admission and discharge management.</p><p><strong>Methods: </strong>Mothers who delivered between January 2023 and June 2023 were assigned to the control group (n=200), while those who delivered between July 2023 and May 2024 were assigned to the research group (n=200). The control group adopted traditional management methods, whereas the research group implemented closed-loop management. Barcode technology, a wireless network, mobile terminals, and other information technology equipments were used to complete the closed loop of newborn exit and entry management. Data on the satisfaction of mothers and their families, the monthly average qualification rate of infant identity verification, and the qualification rate of infant consultation time were collected and statistically analyzed before and after the closed-loop process was implemented.</p><p><strong>Results: </strong>After the closed-loop process was implemented, the monthly average qualification rate of infant identity verification increased to 99.45 (SD 1.34), significantly higher than the control group before implementation 83.58 (SD 1.92) (P=.02). The satisfaction of mothers and their families was 96.45 (SD 3.32), higher than that of the control group before the closed-loop process was implemented 92.82 (SD 4.73) (P=.01). Additionally, the separation time between infants and mothers was restricted to under 1 hour.</p><p><strong>Conclusions: </strong>The construction and application of the information closed-loop management system significantly improved the accuracy and efficiency of maternal and infant identity verification, enhancing the safety of newborns.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e66451"},"PeriodicalIF":3.1,"publicationDate":"2025-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143804842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Expression of Concern: A Dynamic Adaptive Ensemble Learning Framework for Noninvasive Mild Cognitive Impairment Detection: Development and Validation Study.","authors":"","doi":"10.2196/75352","DOIUrl":"10.2196/75352","url":null,"abstract":"","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e75352"},"PeriodicalIF":3.1,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143789394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying Patient-Reported Outcome Measure Documentation in Veterans Health Administration Chiropractic Clinic Notes: Natural Language Processing Analysis.","authors":"Brian C Coleman, Kelsey L Corcoran, Cynthia A Brandt, Joseph L Goulet, Stephen L Luther, Anthony J Lisi","doi":"10.2196/66466","DOIUrl":"10.2196/66466","url":null,"abstract":"<p><strong>Background: </strong>The use of patient-reported outcome measures (PROMs) is an expected component of high-quality, measurement-based chiropractic care. The largest health care system offering integrated chiropractic care is the Veterans Health Administration (VHA). Challenges limit monitoring PROM use as a care quality metric at a national scale in the VHA. Structured data are unavailable, with PROMs often embedded within clinic text notes as unstructured data requiring time-intensive, peer-conducted chart review for evaluation. Natural language processing (NLP) of clinic text notes is one promising solution to extracting care quality data from unstructured text.</p><p><strong>Objective: </strong>This study aims to test NLP approaches to identify PROMs documented in VHA chiropractic text notes.</p><p><strong>Methods: </strong>VHA chiropractic notes from October 1, 2017, to September 30, 2020, were obtained from the VHA Musculoskeletal Diagnosis/Complementary and Integrative Health Cohort. A rule-based NLP model built using medspaCy and spaCy was evaluated on text matching and note categorization tasks. SpaCy was used to build bag-of-words, convoluted neural networks, and ensemble models for note categorization. Performance metrics for each model and task included precision, recall, and F-measure. Cross-validation was used to validate performance metric estimates for the statistical and machine-learning models.</p><p><strong>Results: </strong>Our sample included 377,213 visit notes from 56,628 patients. The rule-based model performance was good for soft-boundary text-matching (precision=81.1%, recall=96.7%, and F-measure=88.2%) and excellent for note categorization (precision=90.3%, recall=99.5%, and F-measure=94.7%). Cross-validation performance of the statistical and machine learning models for the note categorization task was very good overall, but lower than rule-based model performance. The overall prevalence of PROM documentation was low (17.0%).</p><p><strong>Conclusions: </strong>We evaluated multiple NLP methods across a series of tasks, with optimal performance achieved using a rule-based method. By leveraging NLP approaches, we can overcome the challenges posed by unstructured clinical text notes to track documented PROM use. Overall documented use of PROMs in chiropractic notes was low and highlights a potential for quality improvement. This work represents a methodological advancement in the identification and monitoring of documented use of PROMs to ensure consistent, high-quality chiropractic care for veterans.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e66466"},"PeriodicalIF":3.1,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143775023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-Scale Evaluation and Liver Disease Risk Prediction in Finland's National Electronic Health Record System: Feasibility Study Using Real-World Data.","authors":"Viljami Männikkö, Janne Tommola, Emmi Tikkanen, Olli-Pekka Hätinen, Fredrik Åberg","doi":"10.2196/62978","DOIUrl":"10.2196/62978","url":null,"abstract":"<p><strong>Background: </strong>Globally, the incidence and mortality of chronic liver disease are escalating. Early detection of liver disease remains a challenge, often occurring at symptomatic stages when preventative measures are less effective. The Chronic Liver Disease score (CLivD) is a predictive risk model developed using Finnish health care data, aiming to forecast an individual's risk of developing chronic liver disease in subsequent years. The Kanta Service is a national electronic health record system in Finland that stores comprehensive health care data including patient medical histories, prescriptions, and laboratory results, to facilitate health care delivery and research.</p><p><strong>Objective: </strong>This study aimed to evaluate the feasibility of implementing an automatic CLivD score with the current Kanta platform and identify and suggest improvements for Kanta that would enable accurate automatic risk detection.</p><p><strong>Methods: </strong>In this study, a real-world data repository (Kanta) was used as a data source for \"The ClivD score\" risk calculation model. Our dataset consisted of 96,200 individuals' whole medical history from Kanta. For real-world data use, we designed processes to handle missing input in the calculation process.</p><p><strong>Results: </strong>We found that Kanta currently lacks many CLivD risk model input parameters in the structured format required to calculate precise risk scores. However, the risk scores can be improved by using the unstructured text in patient reports and by approximating variables by using other health data-like diagnosis information. Using structured data, we were able to identify only 33 out of 51,275 individuals in the \"low risk\" category and 308 out of 51,275 individuals (<1%) in the \"moderate risk\" category. By adding diagnosis information approximation and free text use, we were able to identify 18,895 out of 51,275 (37%) individuals in the \"low risk\" category and 2125 out of 51,275 (4%) individuals in the \"moderate risk\" category. In both cases, we were not able to identify any individuals in the \"high-risk\" category because of the missing waist-hip ratio measurement. We evaluated 3 scenarios to improve the coverage of waist-hip ratio data in Kanta and these yielded the most substantial improvement in prediction accuracy.</p><p><strong>Conclusions: </strong>We conclude that the current structured Kanta data is not enough for precise risk calculation for CLivD or other diseases where obesity, smoking, and alcohol use are important risk factors. Our simulations show up to 14% improvement in risk detection when adding support for missing input variables. 
Kanta shows the potential for implementing nationwide automated risk detection models that could result in improved disease prevention and public health.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e62978"},"PeriodicalIF":3.1,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143765901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
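A minimal sketch of the missing-input handling idea described above: prefer structured Kanta fields, then approximate from diagnosis codes, then fall back to free text, and leave the variable missing otherwise. The variable names, diagnosis codes, and regular expressions are illustrative assumptions, not the published CLivD model.

```python
# Illustrative fallback chain, not the published CLivD model:
# structured field -> diagnosis-code approximation -> free-text extraction -> missing.
import re
from typing import Optional

def get_alcohol_risk(structured: dict, diagnoses: set, free_text: str) -> Optional[int]:
    if "alcohol_risk_score" in structured:                     # preferred: structured field
        return structured["alcohol_risk_score"]
    if diagnoses & {"F10.1", "F10.2", "K70"}:                  # approximation via diagnosis codes (assumed set)
        return 2
    if re.search(r"\balcohol (abuse|dependence)\b", free_text, re.I):  # last resort: free text
        return 2
    return None                                                # still missing -> score stays imprecise

def get_waist_hip_ratio(structured: dict, free_text: str) -> Optional[float]:
    if "waist_hip_ratio" in structured:
        return structured["waist_hip_ratio"]
    match = re.search(r"waist[- ]hip ratio[^0-9]*([01]\.\d+)", free_text, re.I)
    return float(match.group(1)) if match else None            # without this value, "high risk" cannot be assigned
```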
{"title":"Predicting Clinical Outcomes at the Toronto General Hospital Transitional Pain Service via the Manage My Pain App: Machine Learning Approach.","authors":"James Skoric, Anna M Lomanowska, Tahir Janmohamed, Heather Lumsden-Ruegg, Joel Katz, Hance Clarke, Quazi Abidur Rahman","doi":"10.2196/67178","DOIUrl":"10.2196/67178","url":null,"abstract":"<p><strong>Background: </strong>Chronic pain is a complex condition that affects more than a quarter of people worldwide. The development and progression of chronic pain are unique to each individual due to the contribution of interacting biological, psychological, and social factors. The subjective nature of the experience of chronic pain can make its clinical assessment and prognosis challenging. Personalized digital health apps, such as Manage My Pain (MMP), are popular pain self-tracking tools that can also be leveraged by clinicians to support patients. Recent advances in machine learning technologies open an opportunity to use data collected in pain apps to make predictions about a patient's prognosis.</p><p><strong>Objective: </strong>This study applies machine learning methods using real-world user data from the MMP app to predict clinically significant improvements in pain-related outcomes among patients at the Toronto General Hospital Transitional Pain Service.</p><p><strong>Methods: </strong>Information entered into the MMP app by 160 Transitional Pain Service patients over a 1-month period, including profile information, pain records, daily reflections, and clinical questionnaire responses, was used to extract 245 relevant variables, referred to as features, for use in a machine learning model. The machine learning model was developed using logistic regression with recursive feature elimination to predict clinically significant improvements in pain-related pain interference, assessed by the PROMIS Pain Interference 8a v1.0 questionnaire. The model was tuned and the important features were selected using the 10-fold cross-validation method. Leave-one-out cross-validation was used to test the model's performance.</p><p><strong>Results: </strong>The model predicted patient improvement in pain interference with 79% accuracy and an area under the receiver operating characteristic curve of 0.82. It showed balanced class accuracies between improved and nonimproved patients, with a sensitivity of 0.76 and a specificity of 0.82. Feature importance analysis indicated that all MMP app data, not just clinical questionnaire responses, were key to classifying patient improvement.</p><p><strong>Conclusions: </strong>This study demonstrates that data from a digital health app can be integrated with clinical questionnaire responses in a machine learning model to effectively predict which chronic pain patients will show clinically significant improvement. 
The findings emphasize the potential of machine learning methods in real-world clinical settings to improve personalized treatment plans and patient outcomes.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e67178"},"PeriodicalIF":3.1,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11970568/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143736196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
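A minimal sketch of the modeling pipeline the abstract describes (logistic regression with recursive feature elimination, evaluated with leave-one-out cross-validation), using scikit-learn. The feature matrix, labels, and hyperparameters below are placeholders, not the study's data or tuned settings.

```python
# Sketch only: random placeholder data stands in for the 160 patients x 245 features.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(160, 245))      # stand-in for the 245 app-derived features
y = rng.integers(0, 2, size=160)     # stand-in for improved / not improved labels

model = Pipeline([
    ("scale", StandardScaler()),
    ("select", RFE(LogisticRegression(max_iter=1000), n_features_to_select=20, step=25)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Leave-one-out predictions, mirroring how the model's performance was tested
proba = cross_val_predict(model, X, y, cv=LeaveOneOut(), method="predict_proba")[:, 1]
pred = (proba >= 0.5).astype(int)
print("accuracy:", accuracy_score(y, pred))
print("AUC:     ", roc_auc_score(y, proba))
```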
{"title":"Automated Radiology Report Labeling in Chest X-Ray Pathologies: Development and Evaluation of a Large Language Model Framework.","authors":"Abdullah Abdullah, Seong Tae Kim","doi":"10.2196/68618","DOIUrl":"10.2196/68618","url":null,"abstract":"<p><strong>Background: </strong>Labeling unstructured radiology reports is crucial for creating structured datasets that facilitate downstream tasks, such as training large-scale medical imaging models. Current approaches typically rely on Bidirectional Encoder Representations from Transformers (BERT)-based methods or manual expert annotations, which have limitations in terms of scalability and performance.</p><p><strong>Objective: </strong>This study aimed to evaluate the effectiveness of a generative pretrained transformer (GPT)-based large language model (LLM) in labeling radiology reports, comparing it with 2 existing methods, CheXbert and CheXpert, on a large chest X-ray dataset (MIMIC Chest X-ray [MIMIC-CXR]).</p><p><strong>Methods: </strong>In this study, we introduce an LLM-based approach fine-tuned on expert-labeled radiology reports. Our model's performance was evaluated on 687 radiologist-labeled chest X-ray reports, comparing F1 scores across 14 thoracic pathologies. The performance of our LLM model was compared with the CheXbert and CheXpert models across positive, negative, and uncertainty extraction tasks. Paired t tests and Wilcoxon signed-rank tests were performed to evaluate the statistical significance of differences between model performances.</p><p><strong>Results: </strong>The GPT-based LLM model achieved an average F1 score of 0.9014 across all certainty levels, outperforming CheXpert (0.8864) and approaching CheXbert's performance (0.9047). For positive and negative certainty levels, our model scored 0.8708, surpassing CheXpert (0.8525) and closely matching CheXbert (0.8733). Statistically, paired t tests indicated no significant difference between our model and CheXbert (P=.35) but a significant improvement over CheXpert (P=.01). Wilcoxon signed-rank tests corroborated these findings, showing no significant difference between our model and CheXbert (P=.14) but confirming a significant difference with CheXpert (P=.005). The LLM also demonstrated superior performance for pathologies with longer and more complex descriptions, leveraging its extended context length.</p><p><strong>Conclusions: </strong>The GPT-based LLM model demonstrates competitive performance compared with CheXbert and outperforms CheXpert in radiology report labeling. These findings suggest that LLMs are a promising alternative to traditional BERT-based architectures for this task, offering enhanced context understanding and eliminating the need for extensive feature engineering. 
Furthermore, with large context length LLM-based models are better suited for this task as compared with the small context length of BERT based models.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e68618"},"PeriodicalIF":3.1,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11970564/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143736195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
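The paired comparisons reported above can be reproduced mechanically with SciPy; the per-pathology F1 scores below are invented for illustration and are not the study's results.

```python
# Paired tests over per-pathology F1 scores; all numbers are placeholders.
import numpy as np
from scipy.stats import ttest_rel, wilcoxon

# Hypothetical F1 scores for the same 14 pathologies under two labelers
f1_llm      = np.array([0.92, 0.88, 0.90, 0.95, 0.87, 0.91, 0.89, 0.93, 0.86, 0.90, 0.94, 0.88, 0.92, 0.91])
f1_chexpert = np.array([0.90, 0.85, 0.88, 0.93, 0.86, 0.89, 0.87, 0.91, 0.84, 0.88, 0.92, 0.86, 0.90, 0.89])

t_stat, p_t = ttest_rel(f1_llm, f1_chexpert)
w_stat, p_w = wilcoxon(f1_llm, f1_chexpert)
print(f"paired t test:        t={t_stat:.2f}, P={p_t:.3f}")
print(f"Wilcoxon signed-rank: W={w_stat:.1f}, P={p_w:.3f}")
```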
{"title":"Biases in Race and Ethnicity Introduced by Filtering Electronic Health Records for \"Complete Data\": Observational Clinical Data Analysis.","authors":"Jose Miguel Acitores Cortina, Yasaman Fatapour, Kathleen LaRow Brown, Undina Gisladottir, Michael Zietz, Oliver John Bear Don't Walk Iv, Danner Peter, Jacob S Berkowitz, Nadine A Friedrich, Sophia Kivelson, Aditi Kuchi, Hongyu Liu, Apoorva Srinivasan, Kevin K Tsang, Nicholas P Tatonetti","doi":"10.2196/67591","DOIUrl":"10.2196/67591","url":null,"abstract":"<p><strong>Background: </strong>Integrated clinical databases from national biobanks have advanced the capacity for disease research. Data quality and completeness filters are used when building clinical cohorts to address limitations of data missingness. However, these filters may unintentionally introduce systemic biases when they are correlated with race and ethnicity.</p><p><strong>Objective: </strong>In this study, we examined the race and ethnicity biases introduced by applying common filters to 4 clinical records databases. Specifically, we evaluated whether these filters introduce biases that disproportionately exclude minoritized groups.</p><p><strong>Methods: </strong>We applied 19 commonly used data filters to electronic health record datasets from 4 geographically varied locations comprising close to 12 million patients to understand how using these filters introduces sample bias along racial and ethnic groupings. These filters covered a range of information, including demographics, medication records, visit details, and observation periods. We observed the variation in sample drop-off between self-reported ethnic and racial groups for each site as we applied each filter individually.</p><p><strong>Results: </strong>Applying the observation period filter substantially reduced data availability across all races and ethnicities in all 4 datasets. However, among those examined, the availability of data in the white group remained consistently higher compared to other racial groups after applying each filter. Conversely, the Black or African American group was the most impacted by each filter on these 3 datasets: Cedars-Sinai dataset, UK Biobank, and Columbia University dataset. Among the 4 distinct datasets, only applying the filters to the All of Us dataset resulted in minimal deviation from the baseline, with most racial and ethnic groups following a similar pattern.</p><p><strong>Conclusions: </strong>Our findings underscore the importance of using only necessary filters, as they might disproportionally affect data availability of minoritized racial and ethnic populations. Researchers must consider these unintentional biases when performing data-driven research and explore techniques to minimize the impact of these filters, such as probabilistic methods or adjusted cohort selection methods. Additionally, we recommend disclosing sample sizes for racial and ethnic groups both before and after data filters are applied to aid the reader in understanding the generalizability of the results. 
Future work should focus on exploring the effects of filters on downstream analyses.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e67591"},"PeriodicalIF":3.1,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11967746/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
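The disclosure the authors recommend (group sizes before and after each filter) amounts to a small bookkeeping step; a sketch with pandas follows. Column names and the example observation-period filter are assumptions, not the study's actual filter definitions.

```python
# Report group sizes before and after a data-completeness filter and the retention rate per group.
import pandas as pd

def retention_by_group(cohort: pd.DataFrame, filtered: pd.DataFrame, group_col: str = "race_ethnicity") -> pd.DataFrame:
    before = cohort.groupby(group_col).size().rename("n_before")
    after = filtered.groupby(group_col).size().rename("n_after")
    table = pd.concat([before, after], axis=1).fillna(0)
    table["retention"] = table["n_after"] / table["n_before"]
    return table

cohort = pd.DataFrame({
    "race_ethnicity": ["White", "Black or African American", "Hispanic", "White", "Asian"],
    "observation_days": [900, 120, 400, 30, 800],
})
filtered = cohort[cohort["observation_days"] >= 365]   # example observation-period filter (assumed threshold)
print(retention_by_group(cohort, filtered))
```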
{"title":"Correction: Patients' Experienced Usability and Satisfaction With Digital Health Solutions in a Home Setting: Instrument Validation Study.","authors":"Susan J Oudbier, Ellen M A Smets, Pythia T Nieuwkerk, David P Neal, S Azam Nurmohamed, Hans J Meij, Linda W Dusseljee-Peute","doi":"10.2196/73416","DOIUrl":"10.2196/73416","url":null,"abstract":"","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e73416"},"PeriodicalIF":3.1,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11968001/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Systematic Review Updates With Natural Language Processing Through Abstract Component Classification and Selection: Algorithm Development and Validation.","authors":"Tatsuki Hasegawa, Hayato Kizaki, Keisho Ikegami, Shungo Imai, Yuki Yanagisawa, Shuntaro Yada, Eiji Aramaki, Satoko Hori","doi":"10.2196/65371","DOIUrl":"https://doi.org/10.2196/65371","url":null,"abstract":"<p><strong>Background: </strong>A challenge in updating systematic reviews is the workload in screening the articles. Many screening models using natural language processing technology have been implemented to scrutinize articles based on titles and abstracts. While these approaches show promise, traditional models typically treat abstracts as uniform text. We hypothesize that selective training on specific abstract components could enhance model performance for systematic review screening.</p><p><strong>Objective: </strong>We evaluated the efficacy of a novel screening model that selects specific components from abstracts to improve performance and developed an automatic systematic review update model using an abstract component classifier to categorize abstracts based on their components.</p><p><strong>Methods: </strong>A screening model was created based on the included and excluded articles in the existing systematic review and used as the scheme for the automatic update of the systematic review. A prior publication was selected for the systematic review, and articles included or excluded in the articles screening process were used as training data. The titles and abstracts were classified into 5 categories (Title, Introduction, Methods, Results, and Conclusion). Thirty-one component-composition datasets were created by combining 5 component datasets. We implemented 31 screening models using the component-composition datasets and compared their performances. Comparisons were conducted using 3 pretrained models: Bidirectional Encoder Representations from Transformer (BERT), BioLinkBERT, and BioM- Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Moreover, to automate the component selection of abstracts, we developed the Abstract Component Classifier Model and created component datasets using this classifier model classification. Using the component datasets classified using the Abstract Component Classifier Model, we created 10 component-composition datasets used by the top 10 screening models with the highest performance when implementing screening models using the component datasets that were classified manually. Ten screening models were implemented using these datasets, and their performances were compared with those of models developed using manually classified component-composition datasets. The primary evaluation metric was the F10-Score weighted by the recall.</p><p><strong>Results: </strong>A total of 256 included articles and 1261 excluded articles were extracted from the selected systematic review. In the screening models implemented using manually classified datasets, the performance of some surpassed that of models trained on all components (BERT: 9 models, BioLinkBERT: 6 models, and BioM-ELECTRA: 21 models). 
In models implemented using datasets classified by the Abstract Component Classifier Model, the performances of some models (BERT: 7 models and BioM-ELECTRA: 9 models) surpassed that of","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e65371"},"PeriodicalIF":3.1,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
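Two mechanics from the methods above are easy to make concrete: enumerating the 31 nonempty combinations of the 5 abstract components, and scoring with a recall-weighted F10 measure. The labels in the sketch are placeholders; it is not the authors' pipeline.

```python
# Enumerate component compositions and compute a recall-weighted F-beta score (beta=10).
from itertools import combinations
from sklearn.metrics import fbeta_score

components = ["Title", "Introduction", "Methods", "Results", "Conclusion"]
component_compositions = [
    combo
    for r in range(1, len(components) + 1)
    for combo in combinations(components, r)
]
print(len(component_compositions))  # 31 nonempty subsets of 5 components

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # included / excluded articles (placeholder)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # screening model predictions (placeholder)
print(fbeta_score(y_true, y_pred, beta=10))  # beta=10 weights recall far more than precision
```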
{"title":"An Interpretable Model With Probabilistic Integrated Scoring for Mental Health Treatment Prediction: Design Study.","authors":"Anthony Kelly, Esben Kjems Jensen, Eoin Martino Grua, Kim Mathiasen, Pepijn Van de Ven","doi":"10.2196/64617","DOIUrl":"https://doi.org/10.2196/64617","url":null,"abstract":"<p><strong>Background: </strong>Machine learning (ML) systems in health care have the potential to enhance decision-making but often fail to address critical issues such as prediction explainability, confidence, and robustness in a context-based and easily interpretable manner.</p><p><strong>Objective: </strong>This study aimed to design and evaluate an ML model for a future decision support system for clinical psychopathological treatment assessments. The novel ML model is inherently interpretable and transparent. It aims to enhance clinical explainability and trust through a transparent, hierarchical model structure that progresses from questions to scores to classification predictions. The model confidence and robustness were addressed by applying Monte Carlo dropout, a probabilistic method that reveals model uncertainty and confidence.</p><p><strong>Methods: </strong>A model for clinical psychopathological treatment assessments was developed, incorporating a novel ML model structure. The model aimed at enhancing the graphical interpretation of the model outputs and addressing issues of prediction explainability, confidence, and robustness. The proposed ML model was trained and validated using patient questionnaire answers and demographics from a web-based treatment service in Denmark (N=1088).</p><p><strong>Results: </strong>The balanced accuracy score on the test set was 0.79. The precision was ≥0.71 for all 4 prediction classes (depression, panic, social phobia, and specific phobia). The area under the curve for the 4 classes was 0.93, 0.92, 0.91, and 0.98, respectively.</p><p><strong>Conclusions: </strong>We have demonstrated a mental health treatment ML model that supported a graphical interpretation of prediction class probability distributions. Their spread and overlap can inform clinicians of competing treatment possibilities for patients and uncertainty in treatment predictions. With the ML model achieving 79% balanced accuracy, we expect that the model will be clinically useful in both screening new patients and informing clinical interviews.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e64617"},"PeriodicalIF":3.1,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}