Methods of Information in Medicine: latest articles, page 8

Breast Cancer Subtypes Classification with Hybrid Machine Learning Model.

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-09-01 Epub Date: 2022-09-12 DOI: 10.1055/s-0042-1751043

Suvobrata Sarkar, Kalyani Mali

Abstract

Background: Breast cancer is the most prevalent heterogeneous disease among women, characterized by distinct molecular subtypes and varied clinicopathological features. With the emergence of artificial intelligence techniques, especially machine learning, breast cancer research has advanced in both detection and prognosis.

Objective: Recent developments in computer-driven diagnostic systems have enabled clinicians to improve accuracy in detecting various types of breast tumors. This study aims to develop such a system.

Methods: We propose a breast cancer classification model based on the hybridization of machine learning approaches for classifying triple-negative and non-triple-negative breast cancer patients, using clinicopathological features collected from multiple tertiary care hospitals/centers.

Results: The genetic algorithm and support vector machine (GA-SVM) hybrid model was compared with classic feature-selection SVM hybrid models: support vector machine-recursive feature elimination (SVM-RFE), LASSO-SVM, Grid-SVM, and linear SVM. The GA-SVM hybrid model outperformed the other models when applied to two distinct hospital-based datasets of patients investigated for breast cancer in the North West of the African subcontinent. To validate predictive accuracy, 10-fold cross-validation was applied to all models with the same multicenter datasets. Model performance was evaluated with well-known metrics: mean squared error, logarithmic loss, F1-score, area under the ROC curve, and the precision-recall curve.

Conclusion: The hybrid machine learning model can be employed for breast cancer subtype classification, which could help medical practitioners with treatment planning and improve disease outcomes.

Pages: 68-83.
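
The 10-fold cross-validation mentioned in the Results can be sketched as follows. This is a minimal illustration of how samples are partitioned into folds, not the authors' pipeline: the function name `k_fold_indices` is hypothetical, no shuffling or stratification is applied, and the classifier itself is left out.

```python
def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k (train, test) pairs.

    Hypothetical helper for illustration: contiguous folds, no shuffling.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    for train, test in k_fold_indices(25, k=10):
        # A model would be fitted on `train` and scored on `test` here.
        print(len(train), len(test))
```

Each sample is used for testing exactly once, so every compared model can be scored on the same held-out partitions.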

Citations: 0

Reliability of cusp angulation using three-dimensional (3D) digital models: A Preliminary In Vitro Study.

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-06-03 DOI: 10.1055/a-1868-6555

Xinggang Liu, Xiao-xian Chen

Abstract

Background: Artificial intelligence (AI) is increasingly used in clinical data analysis and clinical decision-making. Dental cusp angulation provides valuable insight into chewing efficiency and prosthesis safety. AI-enabled computation of cusp angles is potentially valuable, but no reliable digital measurement method currently exists.

Objectives: To establish a digital method for measuring cusp angles and to investigate its inter-rater and intra-rater reliability.

Methods: Two cusp angles (angle α and angle β) of the first molar were measured on 21 plaster casts using a goniometer, and on the corresponding digital models using PicPick software after scanning with a CEREC Bluecam three-dimensional (3D) intraoral scanner. Means ± standard deviations, intraclass correlation coefficients (ICCs), and Pearson's correlation coefficients (PCCs) were calculated, and a paired-sample t-test was carried out.

Results: Angle α was 139.19° ± 13.86° and angle β was 19.25° ± 6.86°. A very strong positive correlation between the two methods was found when the examiner was experienced (r > 0.914; p < 0.05), and the paired-sample t-test showed no significant difference between the two methods (p > 0.20). For inter-rater and intra-rater assessments, 15 of 16 PCC and ICC values obtained with the digital method were higher than the corresponding values measured on traditional plaster casts. However, both measurement methods showed weak positive correlation (r < 0.501) and significant differences (p = 0.00) for repeated measurements of angle β at two different time points by an inexperienced examiner.

Conclusions: Cusp angulation using 3D digital models is a clinical option and appeared to improve the reliability of cusp angulation compared with measuring plaster casts using a goniometer. Intra-rater variability was still evident when measuring small cusp angles on the digital model.
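
Two of the statistics named in the Methods, Pearson's correlation coefficient and the paired-sample t statistic, can be sketched in a few lines. The function names and the example angle values are hypothetical and are not taken from the study; the ICC is omitted because its form depends on the chosen model.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long measurement series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t_statistic(x, y):
    """t statistic of the paired differences x - y (n - 1 degrees of freedom)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n)


if __name__ == "__main__":
    # Hypothetical angle readings (degrees) for the same casts, two methods.
    goniometer = [138.0, 141.5, 129.0, 150.2, 135.4]
    digital = [139.1, 140.8, 130.2, 149.5, 136.0]
    print(pearson_r(goniometer, digital))
    print(paired_t_statistic(goniometer, digital))
```

A high correlation alone does not rule out a systematic offset between methods, which is why the abstract reports the paired test alongside it.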

Citations: 0

Disambiguating Clinical Abbreviations Using a One-Fits-All Classifier Based on Deep Learning Techniques.

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-06-01 Epub Date: 2022-02-01 DOI: 10.1055/s-0042-1742388

Areej Jaber, Paloma Martínez

Abstract

Background: Abbreviations are considered an essential part of the clinical narrative; they are used not only to save time and space but also to hide serious or incurable illnesses. Misinterpretation of clinical abbreviations could affect patients themselves or other services such as clinical support systems. There is no consensus in the scientific community on how to create new abbreviations, which makes them difficult to understand. Clinical abbreviation disambiguation aims to predict the exact meaning of an abbreviation from its context, a crucial step in understanding clinical notes.

Objectives: Disambiguating clinical abbreviations is an essential task in information extraction from medical texts. Deep contextualized representation models have shown promising results in most word sense disambiguation tasks. In this work, we propose a one-fits-all classifier to disambiguate clinical abbreviations with deep contextualized representations from pretrained language models such as Bidirectional Encoder Representation from Transformers (BERT).

Methods: A set of experiments with different pretrained clinical BERT models was performed to investigate fine-tuning methods for the disambiguation of clinical abbreviations. One-fits-all classifiers were used to improve the disambiguation of rare clinical abbreviations.

Results: One-fits-all classifiers with deep contextualized representations from the Bioclinical, BlueBERT, and MS_BERT pretrained models improved accuracy on the University of Minnesota data set, achieving 98.99%, 98.75%, and 99.13%, respectively. All models outperformed the previous state of the art of around 98.39%, with MS_BERT giving the best accuracy.

Conclusion: Deep contextualized representations obtained by fine-tuning pretrained language models proved sufficient for disambiguating clinical abbreviations; the approach could be robust for rare and unseen abbreviations and avoids building a separate classifier for each abbreviation. Transfer learning can improve the development of practical abbreviation disambiguation systems.

Pages: e28-e34. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/79/7a/10-1055-s-0042-1742388.PMC9246508.pdf

Citations: 9

Privacy-Preserving Artificial Intelligence Techniques in Biomedicine.

IF 1.3, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-06-01 Epub Date: 2022-01-21 DOI: 10.1055/s-0041-1740630

Reihaneh Torkzadehmahani, Reza Nasirigerdeh, David B Blumenthal, Tim Kacprowski, Markus List, Julian Matschinske, Julian Spaeth, Nina Kerstin Wenke, Jan Baumbach

Abstract

Background: Artificial intelligence (AI) has been successfully applied in numerous scientific domains. In biomedicine, AI has already shown tremendous potential, e.g., in the interpretation of next-generation sequencing data and in the design of clinical decision support systems.

Objectives: However, training an AI model on sensitive data raises concerns about the privacy of individual participants. For example, summary statistics of a genome-wide association study can be used to determine the presence or absence of an individual in a given dataset. This considerable privacy risk has led to restrictions on access to genomic and other biomedical data, which is detrimental to collaborative research and impedes scientific progress. Hence, there has been a substantial effort to develop AI methods that can learn from sensitive data while protecting individuals' privacy.

Method: This paper provides a structured overview of recent advances in privacy-preserving AI techniques in biomedicine. It places the most important state-of-the-art approaches within a unified taxonomy and discusses their strengths, limitations, and open problems.

Conclusion: As the most promising direction, we suggest combining federated machine learning, as a more scalable approach, with additional privacy-preserving techniques. This would combine their advantages and provide privacy guarantees in a distributed way for biomedical applications. Nonetheless, more research is necessary, as hybrid approaches pose new challenges such as additional network or computation overhead.

Volume: 61 S 01; pages: e12-e27. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/dd/7f/10-1055-s-0041-1740630.PMC9246509.pdf

Citations: 0

A Privacy-Preserving Distributed Analytics Platform for Health Care Data.

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-06-01 DOI: 10.1055/s-0041-1740564

Sascha Welten, Yongli Mou, Laurenz Neumann, Mehrshad Jaberansary, Yeliz Yediel Ucer, Toralf Kirsten, Stefan Decker, Oya Beyan

Abstract

Background: In recent years, data-driven medicine has gained increasing importance in diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy risks such as the accidental disclosure of data to third parties. Therefore, alternative data usage policies that comply with present privacy guidelines are of particular interest.

Objective: We aim to enable analyses of sensitive patient data while complying with local data protection regulations, using an approach called the Personal Health Train (PHT), a paradigm that utilises distributed analytics (DA) methods. The main principle of the PHT is that the analytical task is brought to the data provider and the data instances remain in their original location.

Methods: In this work, we present our implementation of the PHT paradigm, which preserves the sovereignty and autonomy of the data providers and operates with a limited number of communication channels. We further conduct a DA use case on data stored at three different, distributed data providers.

Results: We show that our infrastructure enables the training of data models based on distributed data sources.

Conclusion: Our work presents the capabilities of DA infrastructures in the health care sector, which lower the regulatory obstacles to sharing patient data. We further demonstrate their ability to fuel medical science by making distributed data sets available to scientists and health care practitioners.

Volume: 61 S 01; pages: e1-e11. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/66/85/10-1055-s-0041-1740564.PMC9246511.pdf
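
The principle stated in the Objective, that the analysis travels to the data while records stay in place, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Train` class, the station names, and the values are hypothetical, and the sketch does not reflect the authors' platform or its security mechanisms.

```python
from dataclasses import dataclass


@dataclass
class Train:
    """Hypothetical analysis task that carries only aggregates between stations."""

    total: float = 0.0
    count: int = 0

    def visit(self, local_values):
        # Runs at the data provider; only the running sum and count leave the site.
        self.total += sum(local_values)
        self.count += len(local_values)

    def result(self):
        if self.count == 0:
            raise ValueError("train has not visited any data")
        return self.total / self.count


# Three hypothetical data providers; raw values never leave these lists.
stations = {
    "station_a": [5.1, 6.0],
    "station_b": [4.9],
    "station_c": [5.5, 5.5, 6.0],
}

train = Train()
for local_values in stations.values():
    train.visit(local_values)

if __name__ == "__main__":
    print(train.result())  # pooled mean computed without pooling the records
```

Note that plain aggregates can still reveal information when a station holds very few records, so a real deployment would need further safeguards.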

Citations: 16

Human Versus Machine: How Do We Know Who Is Winning? ROC Analysis for Comparing Human and Machine Performance under Varying Cost-Prevalence Assumptions.

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-06-01 Epub Date: 2021-12-31 DOI: 10.1055/s-0041-1740565

Michael Merry, Patricia Jean Riddle, Jim Warren

Abstract

Background: Receiver operating characteristic (ROC) analysis is commonly used for comparing models and humans; however, the exact analytical techniques vary and some are flawed.

Objectives: The aim of the study is to identify common flaws in ROC analysis of human versus model performance and to address them.

Methods: We review current use and identify common errors. We also review the ROC analysis literature for more appropriate techniques.

Results: We identify concerns with three techniques: (1) using mean human sensitivity and specificity; (2) assuming humans can be approximated by ROCs; and (3) matching sensitivity and specificity. We identify a technique from Provost et al. using dominance tables and cost-prevalence gradients that can be adapted to address these concerns.

Conclusion: Dominance tables and cost-prevalence gradients provide far greater detail when comparing the performance of models and humans, and address common failings of other approaches. This should be the standard method for such analyses moving forward.

Pages: e45-e49. Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/b0/e2/10-1055-s-0041-1740565.PMC9246510.pdf
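
The idea of a cost-prevalence gradient can be sketched with the iso-performance lines used in the ROC literature: for prevalence p and misclassification costs C_FP and C_FN, points of equal expected cost lie on lines of slope ((1 - p) * C_FP) / (p * C_FN) in ROC space, and the preferred operating point is the one maximizing TPR - slope * FPR. The sketch below assumes that formulation; it is not necessarily the exact procedure in the paper, and the operating points are hypothetical.

```python
def iso_performance_slope(prevalence, cost_fp, cost_fn):
    """Slope of equal-expected-cost lines in ROC space (FPR on x, TPR on y)."""
    return ((1.0 - prevalence) * cost_fp) / (prevalence * cost_fn)


def best_operating_point(points, slope):
    """Name of the (FPR, TPR) point that maximizes TPR - slope * FPR."""
    return max(points, key=lambda name: points[name][1] - slope * points[name][0])


# Hypothetical operating points as (false positive rate, true positive rate).
operating_points = {
    "human": (0.05, 0.60),
    "model_low": (0.10, 0.70),
    "model_high": (0.30, 0.95),
}

if __name__ == "__main__":
    for prevalence in (0.5, 0.1):
        slope = iso_performance_slope(prevalence, cost_fp=1.0, cost_fn=1.0)
        print(prevalence, best_operating_point(operating_points, slope))
```

With these made-up points the preferred performer changes as prevalence drops, which is the kind of dependence a single summary comparison hides.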

Citations: 0

Security and Privacy in Distributed Health Care Environments

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-05-01 DOI: 10.1055/s-0042-1744484

Stephen Flowerday, C. Xenakis

Abstract

There is an increasing demand for distributed health care systems. Nevertheless, distributed health care environments do not come without risks. As distributed health care systems grow, so do the cybersecurity threats targeting them. Additionally, the demand for compliance with new regulations increases because these distributed health care systems hold sensitive patient data. The use of data-driven technologies presents a promising opportunity for significant advances in the field toward improved health care for patients and the general public [1,2]. Several recent studies have highlighted the importance and necessity of developing a data-driven approach in which patient data are collected, analyzed, and leveraged for medical research purposes with the help of different types of artificial intelligence. To address the privacy-related challenges, novel methods are required, such as protection of personal health information, ensuring compliance, guaranteeing FAIR information processing, and building trust. In this issue, new paradigms and prominent applications are presented for secure, trustworthy, and privacy-preserving data sharing and knowledge representation to address the emerging needs.

Volume: 61 1; pages: 1-2.

Citations: 0

Predicting Hospital Readmissions from Health Insurance Claims Data: A Modeling Study Targeting Potentially Inappropriate Prescribing.

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-05-01 Epub Date: 2022-02-10 DOI: 10.1055/s-0042-1742671

Alexander Gerharz, Carmen Ruff, Lucas Wirbka, Felicitas Stoll, Walter E Haefeli, Andreas Groll, Andreas D Meid

Abstract

Background: Numerous prediction models for readmissions are developed from hospital data whose predictor variables are based on specific data fields that are often not transferable to other settings. In contrast, routine data from statutory health insurances (in Germany) are highly standardized, ubiquitously available, and would thus allow for automatic identification of readmission risks.

Objectives: To develop and internally validate prediction models for readmissions based on potentially inappropriate prescribing (PIP) in six diseases from routine data.

Methods: In a large database of German statutory health insurance claims, we detected disease-specific readmissions after index admissions for acute myocardial infarction (AMI), heart failure (HF), a composite of stroke, transient ischemic attack or atrial fibrillation (S/AF), chronic obstructive pulmonary disease (COPD), type-2 diabetes mellitus (DM), and osteoporosis (OS). PIP at the index admission was determined by the STOPP/START criteria (Screening Tool of Older Persons' Prescriptions/Screening Tool to Alert doctors to the Right Treatment), which were candidate variables in regularized prediction models for specific readmission within 90 days. The risks from disease-specific models were combined ("stacked") to predict all-cause readmission within 90 days. Validation performance was measured by the c-statistic.

Results: While the prevalence of START criteria was higher than that of STOPP criteria, more single STOPP criteria were selected into models for specific readmissions. Performance in validation samples was highest for DM (c-statistic: 0.68 [95% confidence interval (CI): 0.66-0.70]), followed by COPD (c-statistic: 0.65 [95% CI: 0.64-0.67]), S/AF (c-statistic: 0.65 [95% CI: 0.63-0.66]), HF (c-statistic: 0.61 [95% CI: 0.60-0.62]), AMI (c-statistic: 0.58 [95% CI: 0.56-0.60]), and OS (c-statistic: 0.51 [95% CI: 0.47-0.56]). Integrating risks from disease-specific models into a combined model for all-cause readmission yielded a c-statistic of 0.63 [95% CI: 0.63-0.64].

Conclusion: PIP successfully predicted readmissions for most diseases, opening the possibility for interventions to improve these modifiable risk factors. Machine-learning methods appear promising for future modeling of PIP predictors in complex older patients with many underlying diseases.

Pages: 55-60.
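
The c-statistic used as the validation measure can be computed as the share of (readmitted, not readmitted) patient pairs in which the readmitted patient received the higher predicted risk, with ties counted as one half. The sketch below is a plain pairwise implementation under that definition; the function name and the example values are hypothetical, and the confidence intervals reported in the abstract are not reproduced.

```python
def c_statistic(risks, outcomes):
    """Concordance (c-statistic) for binary outcomes coded 1 (event) and 0 (no event)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(events) * len(non_events))


if __name__ == "__main__":
    # Hypothetical predicted 90-day readmission risks and observed outcomes.
    predicted = [0.9, 0.8, 0.3, 0.2]
    observed = [1, 0, 1, 0]
    print(c_statistic(predicted, observed))  # 0.75
```

A value of 0.5 corresponds to chance-level ranking, which puts the reported 0.51 for osteoporosis in context.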

Citations: 1

Ambiguous and Incomplete: Natural Language Processing Reveals Problematic Reporting Styles in Thyroid Ultrasound Reports.

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-05-01 Epub Date: 2022-01-06 DOI: 10.1055/s-0041-1740493

Priya H Dedhia, Kallie Chen, Yiqiang Song, Eric LaRose, Joseph R Imbus, Peggy L Peissig, Eneida A Mendonca, David F Schneider

Abstract

Objective: Natural language processing (NLP) systems convert unstructured text into analyzable data. Here, we describe the performance of NLP in capturing granular details on nodules from thyroid ultrasound (US) reports and reveal critical issues with reporting language.

Methods: We iteratively developed NLP tools using the clinical Text Analysis and Knowledge Extraction System (cTAKES) and thyroid US reports from 2007 to 2013. We incorporated nine nodule features for NLP extraction. Next, we evaluated the precision, recall, and accuracy of our NLP tools using a separate set of US reports from an academic medical center (A) and a regional health care system (B) during the same period. Two physicians manually annotated each test-set report. A third physician then adjudicated discrepancies. The adjudicated "gold standard" was then used to evaluate NLP performance on the test set.

Results: A total of 243 thyroid US reports contained 6,405 data elements. Inter-annotator agreement for all elements was 91.3%. Compared with the gold standard, overall recall of the NLP tool was 90%. NLP recall for thyroid lobe or isthmus characteristics was: laterality 96% and size 95%. NLP accuracy for nodule characteristics was: laterality 92%, size 92%, calcifications 76%, vascularity 65%, echogenicity 62%, contents 76%, and borders 40%. NLP recall for presence or absence of lymphadenopathy was 61%. Reporting style accounted for 18% of errors. For example, the word "heterogeneous" referred interchangeably to nodule contents or echogenicity. While nodule dimensions and laterality were often described, US reports described contents, echogenicity, vascularity, calcifications, borders, and lymphadenopathy only 46, 41, 17, 15, 9, and 41% of the time, respectively. Most nodule characteristics were equally likely to be described at hospital A and hospital B.

Conclusions: NLP can automate extraction of critical information from thyroid US reports. However, ambiguous and incomplete reporting language hinders the performance of NLP systems regardless of institutional setting. Standardized or synoptic thyroid US reports could improve NLP performance.

Pages: 11-18.
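
The three evaluation measures named in the Methods follow directly from a comparison of extracted elements against the gold standard. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts.

    tp: elements extracted correctly; fp: extracted but wrong or absent in the
    gold standard; fn: present in the gold standard but missed; tn: correctly
    left unextracted.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("no observations")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total
    return precision, recall, accuracy


if __name__ == "__main__":
    # Hypothetical counts for one nodule feature.
    print(precision_recall_accuracy(tp=90, fp=10, fn=10, tn=90))  # (0.9, 0.9, 0.9)
```

Recall is the measure most affected when a feature is simply not mentioned in a consistent way, which matches the reporting-style problem the abstract highlights.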

Citations: 2

A Comparison of Methods to Detect Changes in Prediction Models.

IF 1.7, Zone 4 (Medicine)

Methods of Information in Medicine Pub Date : 2022-05-01 DOI: 10.1055/s-0042-1742672

Erin M Schnellinger, Wei Yang, Michael O Harhay, Stephen E Kimmel

Abstract

Background: Prediction models inform decisions in many areas of medicine. Most models are fitted once and then applied to new (future) patients, even though model coefficients can vary over time due to changes in patients' clinical characteristics and disease risk. However, the optimal method for detecting changes in model parameters has not been rigorously assessed.

Methods: We simulated data informed by post-lung transplant mortality data and tested two approaches for detecting model change: (1) the "Direct Approach," which compares coefficients of the model refit on recent data to those at baseline; and (2) "Calibration Regression," which fits a logistic regression model of the log-odds of the observed outcomes versus the linear predictor from the baseline model (i.e., the log-odds of the predicted probabilities obtained from the baseline model) and tests whether the intercept and slope differ from 0 and 1, respectively. Four scenarios were simulated using logistic regression for binary outcomes: (1) we fixed all model parameters, (2) we varied the outcome prevalence between 0.1 and 0.2, (3) we varied the coefficient of one of the ten predictors between 0.2 and 0.4, and (4) we varied the outcome prevalence and the coefficient of one predictor simultaneously.

Results: Calibration Regression tended to detect changes sooner than the Direct Approach, with better performance (e.g., a larger proportion of true claims). When the sample size was large, both methods performed well. When two parameters changed simultaneously, neither method performed well.

Conclusion: Neither change detection method examined here proved optimal under all circumstances. However, our results suggest that if one is interested in detecting a change in the overall incidence of an outcome (e.g., intercept), the Calibration Regression method may be superior to the Direct Approach. Conversely, if one is interested in detecting a change in other model covariates (e.g., slope), the Direct Approach may be superior.

Volume: 61 1-02; pages: 19-28. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10413959/pdf/nihms-1887521.pdf
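
The Calibration Regression described in the Methods can be written compactly. The notation below is introduced here for illustration and is not the authors': \hat{p}_i is the baseline model's predicted probability for patient i, \mathrm{LP}_i its log-odds, and \alpha and \beta the calibration intercept and slope.

```latex
\[
\operatorname{logit}\,\Pr(Y_i = 1) = \alpha + \beta\,\mathrm{LP}_i,
\qquad
\mathrm{LP}_i = \operatorname{logit}(\hat{p}_i) = \log\frac{\hat{p}_i}{1 - \hat{p}_i}
\]
\[
H_0:\ \alpha = 0 \ \text{and}\ \beta = 1
\]
```

If the baseline model still fits recent data, the estimates stay near \alpha = 0 and \beta = 1; a shift in overall outcome incidence shows up mainly in \alpha, which is consistent with the abstract's conclusion that this method is better suited to detecting intercept changes.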

Citations: 2