A lattice-transformer-graph deep learning model for Chinese named entity recognition
Min Lin, Yanyan Xu, Chenghao Cai, Dengfeng Ke, Kaile Su
Journal of Intelligent Systems, published 2023-01-01. DOI: 10.1515/jisys-2022-2014 (https://doi.org/10.1515/jisys-2022-2014)
JCR: Q3, Computer Science, Artificial Intelligence. Impact factor: 2.1
Citations: 1
Abstract
Named entity recognition (NER) is the localization and classification of entities with specific meanings in text data, and is widely used in applications such as relation extraction and question answering. Chinese uses characters as its basic unit, but a Chinese named entity is normally a word consisting of several characters, so both the relationships between words and those between characters play an important role in Chinese NER. Many studies have demonstrated that well-chosen word information can effectively improve deep learning models for Chinese NER. In addition, graph convolution can help deep learning models perform better on sequence labeling. In this article, we therefore combine word information and graph convolution and propose the Lattice-Transformer-Graph (LTG) deep learning model for Chinese NER. The proposed model attends to additional word information through position-attention and can thus learn relationships between characters with the lattice transformer. Moreover, an adapted graph convolutional layer enables the model to learn both richer character relationships and word relationships, which helps it recognize Chinese named entities better. Our experiments show that, compared with 12 other state-of-the-art models, LTG achieves the best results on the public Microsoft Research Asia, Resume, and WeiboNER datasets, with F1 scores of 95.89%, 96.81%, and 72.32%, respectively.
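The idea of combining lattice word information with graph convolution can be illustrated with a minimal sketch: character nodes plus matched lexicon-word nodes form a graph, and one standard graph-convolution step lets each character mix in information from neighbouring characters and from the words covering it. The lexicon matches, feature sizes, and normalization below are illustrative assumptions, not the paper's actual LTG layer.

```python
import numpy as np

# Illustrative lattice for "南京市长江大桥": character nodes plus
# hypothetical lexicon-word nodes such as "南京" and "长江".
nodes = ["南", "京", "市", "长", "江", "大", "桥", "南京", "长江", "大桥"]
# Edges link adjacent characters, and each word node to its characters.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6),
         (7, 0), (7, 1),   # "南京" covers 南, 京
         (8, 3), (8, 4),   # "长江" covers 长, 江
         (9, 5), (9, 6)]   # "大桥" covers 大, 桥

n = len(nodes)
A = np.eye(n)                       # adjacency with self-loops
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Symmetric normalization D^{-1/2} A D^{-1/2}, standard for GCN layers.
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))

rng = np.random.default_rng(0)
X = rng.standard_normal((n, 8))     # toy features; a real model would use
W = rng.standard_normal((8, 8))     # character/word embeddings and learned weights

# One graph-convolution step: H = ReLU(A_hat @ X @ W). Word nodes pass
# lexicon information to the characters they span, and vice versa.
H = np.maximum(A_hat @ X @ W, 0.0)
print(H.shape)                      # one feature vector per lattice node
```

In the full model, the transformer operates over this character-word lattice first, and the adapted graph convolutional layer then refines the representations before sequence labeling.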
About the journal:
The Journal of Intelligent Systems aims to provide research and review papers, as well as brief communications, at an interdisciplinary level, with intelligent systems as the focal point. The field includes artificial intelligence; models and computational theories of human cognition, perception, and motivation; brain models; and artificial neural networks and neural computing. The journal covers contributions from the social, human, and computer sciences to the analysis and application of information technology.