{"title":"A Context-Enhanced Generate-then-Evaluate Framework for Chinese Abbreviation Prediction","authors":"Hanwen Tong, Chenhao Xie, Jiaqing Liang, Qi He, Zhiang Yue, Jingping Liu, Yanghua Xiao, Wenguang Wang","doi":"10.1145/3511808.3557219","DOIUrl":null,"url":null,"abstract":"As a popular form of lexicalization, abbreviation is widely used in both oral and written language and plays an important role in various Natural Language Processing applications. However, current approaches cannot ensure that the predicted abbreviation preserves the meaning of its full form and maintains fluency. In this paper, we introduce a fresh perspective to evaluate the quality of abbreviations within their textual contexts with pre-trained language model. To this end, we propose a novel two-stage generate-then-evaluate framework enhanced by context, which consists of a generation model to generate multiple candidate abbreviations and an evaluation model to evaluate their quality within their contexts. Experimental results show that our framework consistently outperforms all the existing approaches, achieving 53.2% Hit@1 performance with a 5.6 points improvement compared to its previous best result. Our code and data are publicly available at https://github.com/HavenTong/CEGE.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3511808.3557219","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
As a popular form of lexicalization, abbreviation is widely used in both oral and written language and plays an important role in various Natural Language Processing applications. However, current approaches cannot ensure that the predicted abbreviation preserves the meaning of its full form and maintains fluency. In this paper, we introduce a fresh perspective to evaluate the quality of abbreviations within their textual contexts with pre-trained language model. To this end, we propose a novel two-stage generate-then-evaluate framework enhanced by context, which consists of a generation model to generate multiple candidate abbreviations and an evaluation model to evaluate their quality within their contexts. Experimental results show that our framework consistently outperforms all the existing approaches, achieving 53.2% Hit@1 performance with a 5.6 points improvement compared to its previous best result. Our code and data are publicly available at https://github.com/HavenTong/CEGE.