High-performing Multi-task Model of Urinary Tract Dilation (UTD) Classification for Neonatal Ultrasound Reports Through Natural Language Processing

medRxiv - Urology. Pub Date: 2024-01-24. DOI: 10.1101/2024.01.23.24301680

Yining Hua, Anudeep Mukkamala, Carlos Estrada, Michael Lingzhi Li, Hsin-Hsiao Wang



Abstract

Objective: The urinary tract dilation (UTD) classification system provides an objective assessment relevant to the management of hydronephrosis in children. However, the lack of uniform language regarding UTD in radiology reports creates significant difficulty in both clinical management and research. We sought to develop a unified multi-task, multi-class model that can effectively extract UTD components and classifications from early postnatal ultrasound (US) reports.

Methods: Radiology records from our institution were reviewed to identify infants aged 0-90 days undergoing early ultrasound for antenatal UTD. The study team reviewed the reports and images to create the ground truth for UTD classification and components (primary outcome). Bio_ClinicalBERT, a variant of the Bidirectional Encoder Representations from Transformers (BERT) model, was used as the embedding layers of the classification model. The model was fine-tuned with 11 linear classification layers, one per task. All but the last BERT layer were frozen during fine-tuning. Model performance was evaluated with five-fold cross-validation, giving an 80:20 train-test ratio in each fold.

Results: 2460 early (0-90 days) US reports were included. The five-fold cross-validated model performance was satisfactory (weighted F1 > 0.9 for all UTD components). We report the weighted F1 scores, accuracies, and standard deviations for all 11 tasks, along with their average performance.

Conclusions: By applying state-of-the-art deep NLP neural networks, we developed a high-performing, efficient, and scalable solution that extracts UTD components from unstructured ultrasound reports using a single multi-task model. This can potentially help standardize and facilitate large-scale computer vision research for pediatric hydronephrosis.

Key Words: machine learning, efficiency, ambulatory care, forecasting
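The evaluation protocol described in the Methods and Results (five-fold cross-validation with an 80:20 split per fold, scored by weighted F1 with a standard deviation across folds) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: all function names are hypothetical, the shuffling seed is arbitrary, and a majority-class baseline stands in for the fine-tuned Bio_ClinicalBERT model, which the abstract does not specify in enough detail to reproduce.

```python
import random
from collections import Counter
from statistics import mean, stdev


def five_fold_indices(n_samples, seed=0):
    """Yield (train_idx, test_idx) for 5 folds; each test fold holds ~20% of the data."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary assumption
    folds = [idx[i::5] for i in range(5)]
    for k in range(5):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


def majority_baseline(train_labels):
    """Hypothetical placeholder for the model: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda n: [most_common] * n


def cross_validate(labels, seed=0):
    """Return (mean, standard deviation) of weighted F1 across the five folds for one task."""
    scores = []
    for train, test in five_fold_indices(len(labels), seed):
        predict = majority_baseline([labels[i] for i in train])
        y_true = [labels[i] for i in test]
        scores.append(weighted_f1(y_true, predict(len(y_true))))
    return mean(scores), stdev(scores)
```

In a multi-task setting such as the one described, `cross_validate` would be run once per task (11 times here) on the same fold assignment, and the per-task means could then be averaged to give an overall score.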


