Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt.

Zhichao Yang, Sunjae Kwon, Zonghai Yao, Hong Yu
Proceedings of the AAAI Conference on Artificial Intelligence, 37(4), pp. 5366-5374
DOI: 10.1609/aaai.v37i4.25668
Published: 2023-06-26
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10457101/pdf/nihms-1875188.pdf

Abstract

Automatic International Classification of Diseases (ICD) coding aims to assign multiple ICD codes to a medical note, which averages 3,000+ tokens. The task is challenging due to the high-dimensional label space of multi-label assignment (155,000+ candidate ICD codes) and the long-tail distribution: many ICD codes are assigned infrequently, yet these infrequent codes are clinically important. This study addresses the long-tail challenge by transforming the multi-label classification task into an autoregressive generation task. Specifically, we first introduce a novel pretraining objective that generates free-text diagnoses and procedures following the SOAP structure, the medical logic physicians use to document notes. Second, instead of predicting directly in the high-dimensional space of ICD codes, our model generates lower-dimensional text descriptions, from which the ICD codes are then inferred. Third, we design a novel prompt template for multi-label classification. We evaluate our Generation with Prompt (GPsoap) model on the full code assignment benchmark (MIMIC-III-full) and the few-shot ICD code assignment benchmark (MIMIC-III-few). Experiments on MIMIC-III-few show that our model achieves a macro F1 of 30.2, substantially outperforming the previous MIMIC-III-full SOTA model (macro F1 4.3) and a model specifically designed for the few/zero-shot setting (macro F1 18.7). Finally, we design a novel ensemble learner, a cross-attention reranker with prompts, to integrate the previous SOTA predictions with our best few-shot coding predictions. Experiments on MIMIC-III-full show that the ensemble learner substantially improves both macro and micro F1, from 10.4 to 14.6 and from 58.2 to 59.1, respectively.
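The "generate descriptions, then infer codes" step can be pictured with a toy sketch. Everything here is illustrative, not the paper's method: the three-entry code table is hypothetical (real ICD tables have 155,000+ entries), and plain string similarity stands in for the learned matching between generated text and official code descriptions.

```python
from difflib import SequenceMatcher

# Toy ICD code table: code -> official description (illustrative only).
ICD_DESCRIPTIONS = {
    "I10": "essential (primary) hypertension",
    "E11.9": "type 2 diabetes mellitus without complications",
    "J45.909": "unspecified asthma, uncomplicated",
}

def infer_code(generated_text: str) -> str:
    """Map one generated free-text description to the ICD code whose
    official description it most closely matches."""
    def score(code: str) -> float:
        return SequenceMatcher(
            None, generated_text.lower(), ICD_DESCRIPTIONS[code]
        ).ratio()
    return max(ICD_DESCRIPTIONS, key=score)

print(infer_code("primary hypertension"))  # matches code I10
```

The point of the indirection is that the model only ever has to produce text in the (comparatively small) space of clinical language; the code assignment falls out of the lookup rather than a 155,000-way classifier head.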
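The macro/micro F1 gap quoted above is central to the long-tail argument: macro F1 averages per-label F1 so rare codes count as much as frequent ones, while micro F1 pools counts over all labels and is dominated by frequent codes. A minimal stdlib sketch of both metrics (the label sets are made-up examples):

```python
def micro_macro_f1(y_true, y_pred):
    """Micro and macro F1 for multi-label predictions.
    y_true, y_pred: lists of sets of label ids, one set per note."""
    labels = set().union(*y_true, *y_pred)
    tp = {l: 0 for l in labels}
    fp = dict(tp)
    fn = dict(tp)
    for t, p in zip(y_true, y_pred):
        for l in p & t:
            tp[l] += 1  # correctly assigned
        for l in p - t:
            fp[l] += 1  # assigned but wrong
        for l in t - p:
            fn[l] += 1  # missed
    def f1(tp_, fp_, fn_):
        return 2 * tp_ / (2 * tp_ + fp_ + fn_) if tp_ + fp_ + fn_ else 0.0
    # Macro: unweighted mean of per-label F1, so rare codes weigh equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool counts first, so frequent codes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro

# A model that always predicts the frequent code and misses the rare one
# still scores well on micro F1 (0.8) but poorly on macro F1 (0.5).
print(micro_macro_f1([{"I10"}, {"I10", "E11.9"}], [{"I10"}, {"I10"}]))
```

This is why a micro F1 in the high 50s can coexist with a macro F1 near 10 on MIMIC-III-full, and why few-shot gains show up most clearly in macro F1.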
