A Deep Learning Based Method for Structuring the Chinese Pathological Reports of Lung Specimen

Authors: Tianzhong Lan, Jingwei Li, Xiuyuan Xu, Chengdi Wang, Zhang Yi, Wei-min Li, Jixiang Guo
Venue: 2021 11th International Conference on Information Science and Technology (ICIST)
Published: 2021-05-21
DOI: 10.1109/ICIST52614.2021.9440626 (https://doi.org/10.1109/ICIST52614.2021.9440626)

Chinese pathology reports of lung specimens are electronic free-text reports that contain a large amount of information valuable to clinicians for further analysis and mining. However, varied expressions and the lack of a fixed format make this information difficult to extract and standardize. In this paper, we focus on extracting lung lesion locations and their corresponding diagnoses from these reports. To overcome these difficulties, we propose a structuring method based on deep learning and the idea of part-of-speech (POS) tagging. First, the lung pathology specimen reports are preprocessed to normalize medical terms. Second, a bidirectional Long Short-Term Memory (Bi-LSTM) neural network extracts lesion locations and pathological diagnoses from each report. Finally, the extracted information is screened by an information filter to generate the final structured results. Experimental results on self-constructed datasets indicate that the proposed method is effective for structuring pathology reports of lung specimens and achieves state-of-the-art results.
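
The three stages named in the abstract (term normalization, tagging, filtering) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the synonym table `TERM_MAP`, the BIO-style `LOC`/`DIAG` tag set, and the pairing rule in `pair_filter` are all hypothetical. The Bi-LSTM tagger itself is not implemented; a hand-written tag sequence stands in for its per-character output.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical synonym table; the paper's actual normalization rules are not given.
TERM_MAP: Dict[str, str] = {
    "右上肺": "右肺上叶",  # "right upper lung" -> "right lung, upper lobe"
    "腺Ca": "腺癌",        # abbreviated -> full form of "adenocarcinoma"
}


def normalize(text: str) -> str:
    """Replace each known variant term with its canonical form."""
    for variant, canonical in TERM_MAP.items():
        text = text.replace(variant, canonical)
    return text


def decode_spans(chars: str, tags: List[str]) -> List[Tuple[str, str]]:
    """Group per-character BIO tags into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    label: Optional[str] = None
    buf: List[str] = []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new span begins; close any span that is still open.
            if label is not None:
                spans.append((label, "".join(buf)))
            label, buf = tag[2:], [ch]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open span.
            buf.append(ch)
        else:
            # "O" or an inconsistent tag: close the open span, skip the character.
            if label is not None:
                spans.append((label, "".join(buf)))
            label, buf = None, []
    if label is not None:
        spans.append((label, "".join(buf)))
    return spans


def pair_filter(spans: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Assumed filter rule: pair each diagnosis with the most recent location
    and drop diagnoses that have no preceding location."""
    pairs: List[Tuple[str, str]] = []
    current_loc: Optional[str] = None
    for label, text in spans:
        if label == "LOC":
            current_loc = text
        elif label == "DIAG" and current_loc is not None:
            pairs.append((current_loc, text))
    return pairs


if __name__ == "__main__":
    report = normalize("右上肺腺Ca")
    # Stand-in for the tagger's per-character predictions on the normalized text.
    predicted = ["B-LOC", "I-LOC", "I-LOC", "I-LOC", "B-DIAG", "I-DIAG"]
    print(pair_filter(decode_spans(report, predicted)))
    # [('右肺上叶', '腺癌')]
```

Treating extraction as per-character tagging mirrors the POS-tagging idea mentioned in the abstract: the tagger only has to label each character, and the structured (location, diagnosis) pairs are produced by deterministic post-processing.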