从病理报告中提取信息的癌症登记处的深度迁移学习。

Mohammed Alawad, Shang Gao, John Qiu, Noah Schaefferkoetter, Jacob D Hinkle, Hong-Jun Yoon, J Blair Christian, Xiao-Cheng Wu, Eric B Durbin, Jong Cheol Jeong, Isaac Hands, David Rust, Georgia Tourassi
{"title":"从病理报告中提取信息的癌症登记处的深度迁移学习。","authors":"Mohammed Alawad, Shang Gao, John Qiu, Noah Schaefferkoetter, Jacob D Hinkle, Hong-Jun Yoon, J Blair Christian, Xiao-Cheng Wu, Eric B Durbin, Jong Cheol Jeong, Isaac Hands, David Rust, Georgia Tourassi","doi":"10.1109/bhi.2019.8834586","DOIUrl":null,"url":null,"abstract":"Automated text information extraction from cancer pathology reports is an active area of research to support national cancer surveillance. A well-known challenge is how to develop information extraction tools with robust performance across cancer registries. In this study we investigated whether transfer learning (TL) with a convolutional neural network (CNN) can facilitate cross-registry knowledge sharing. Specifically, we performed a series of experiments to determine whether a CNN trained with single-registry data is capable of transferring knowledge to another registry or whether developing a cross-registry knowledge database produces a more effective and generalizable model. Using data from two cancer registries and primary tumor site and topography as the information extraction task of interest, our study showed that TL results in 6.90% and 17.22% improvement of classification macro F-score over the baseline single-registry models. Detailed analysis illustrated that the observed improvement is evident in the low prevalence classes.","PeriodicalId":72024,"journal":{"name":"... IEEE-EMBS International Conference on Biomedical and Health Informatics. IEEE-EMBS International Conference on Biomedical and Health Informatics","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/bhi.2019.8834586","citationCount":"13","resultStr":"{\"title\":\"Deep Transfer Learning Across Cancer Registries for Information Extraction from Pathology Reports.\",\"authors\":\"Mohammed Alawad, Shang Gao, John Qiu, Noah Schaefferkoetter, Jacob D Hinkle, Hong-Jun Yoon, J Blair Christian, Xiao-Cheng Wu, Eric B Durbin, Jong Cheol Jeong, Isaac Hands, David Rust, Georgia Tourassi\",\"doi\":\"10.1109/bhi.2019.8834586\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automated text information extraction from cancer pathology reports is an active area of research to support national cancer surveillance. A well-known challenge is how to develop information extraction tools with robust performance across cancer registries. In this study we investigated whether transfer learning (TL) with a convolutional neural network (CNN) can facilitate cross-registry knowledge sharing. Specifically, we performed a series of experiments to determine whether a CNN trained with single-registry data is capable of transferring knowledge to another registry or whether developing a cross-registry knowledge database produces a more effective and generalizable model. Using data from two cancer registries and primary tumor site and topography as the information extraction task of interest, our study showed that TL results in 6.90% and 17.22% improvement of classification macro F-score over the baseline single-registry models. Detailed analysis illustrated that the observed improvement is evident in the low prevalence classes.\",\"PeriodicalId\":72024,\"journal\":{\"name\":\"... IEEE-EMBS International Conference on Biomedical and Health Informatics. IEEE-EMBS International Conference on Biomedical and Health Informatics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/bhi.2019.8834586\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"... IEEE-EMBS International Conference on Biomedical and Health Informatics. IEEE-EMBS International Conference on Biomedical and Health Informatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/bhi.2019.8834586\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2019/9/12 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"... IEEE-EMBS International Conference on Biomedical and Health Informatics. IEEE-EMBS International Conference on Biomedical and Health Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/bhi.2019.8834586","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2019/9/12 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

摘要

从癌症病理报告中自动提取文本信息是支持国家癌症监测的一个活跃研究领域。一个众所周知的挑战是如何开发跨癌症登记处具有健壮性能的信息提取工具。在本研究中,我们研究了卷积神经网络(CNN)的迁移学习(TL)是否可以促进跨注册表的知识共享。具体来说,我们进行了一系列实验,以确定使用单一注册表数据训练的CNN是否能够将知识转移到另一个注册表,或者开发跨注册表知识数据库是否会产生更有效和可推广的模型。使用来自两个癌症登记处和原发肿瘤部位和地形的数据作为感兴趣的信息提取任务,我们的研究表明,与基线单登记处模型相比,TL的分类宏观f评分提高了6.90%和17.22%。详细的分析表明,观察到的改善在低患病率阶层是明显的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Deep Transfer Learning Across Cancer Registries for Information Extraction from Pathology Reports.

Deep Transfer Learning Across Cancer Registries for Information Extraction from Pathology Reports.

Deep Transfer Learning Across Cancer Registries for Information Extraction from Pathology Reports.
Automated text information extraction from cancer pathology reports is an active area of research to support national cancer surveillance. A well-known challenge is how to develop information extraction tools with robust performance across cancer registries. In this study we investigated whether transfer learning (TL) with a convolutional neural network (CNN) can facilitate cross-registry knowledge sharing. Specifically, we performed a series of experiments to determine whether a CNN trained with single-registry data is capable of transferring knowledge to another registry or whether developing a cross-registry knowledge database produces a more effective and generalizable model. Using data from two cancer registries and primary tumor site and topography as the information extraction task of interest, our study showed that TL results in 6.90% and 17.22% improvement of classification macro F-score over the baseline single-registry models. Detailed analysis illustrated that the observed improvement is evident in the low prevalence classes.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信