A Language-Agnostic Framework with Bidirectional Syntactic Graph Convolutional Networks for Cross-Lingual Aspect Term Extraction

IF 0.9 Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Scalable Computing-Practice and Experience Pub Date : 2022-12-01 DOI:10.1109/SmartWorld-UIC-ATC-ScalCom-DigitalTwin-PriComp-Metaverse56740.2022.00215

Yaxin Cui, Baojie Tian, Junlin Wang, Yan Zhou, Songlin Hu

Journal Article. Volume 33 1, pages 1488-1495.

Citations: 0

Abstract

Aspect term extraction is a vital sub-task of sentiment analysis that aims to extract explicit product attributes from customer reviews. Many languages lack sufficient labeled data, so researchers have turned to Cross-lingual Aspect Term Extraction (XATE) to exploit the labeled data available in other languages. Most recent cross-lingual methods focus on semantic alignment and data augmentation but pay little attention to language structure, including syntax and lexicality. To this end, we propose a Language-Agnostic framework with Bidirectional Syntactic Graph Convolutional Networks (LA-BSGCN) for XATE. It rests on the idea that the topological structures of syntactic dependencies, and the lexical tags, are similar across languages. We design a multi-layer bidirectional GCN that encodes the syntactic tree more accurately. Furthermore, to reduce the lexical semantic gap between languages, we encode named entity recognition (NER) and part-of-speech (POS) information into our model. We conduct six pairs of cross-lingual experiments on the SemEval-2016 Task 5 datasets. The results show that LA-BSGCN significantly reduces the semantic gap and outperforms state-of-the-art methods. For reproducibility, our code is available on GitHub.
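The abstract does not spell out the layer equations, so the following is only a minimal sketch of what one bidirectional GCN layer over a dependency tree could look like. It is not the authors' released code. The example sentence, the tag inventories, the single layer, the row normalization, the random weights, and the choice to concatenate one-hot POS and NER tags with word vectors are all assumptions made for illustration.

```python
import random


def build_adjacency(heads, reverse=False):
    """Row-normalized adjacency with self-loops for a dependency tree.

    heads[i] is the index of token i's head, or -1 for the root.
    adj[i][j] > 0 means token i receives a message from token j.
    Forward: dependents receive from their head. Reverse: heads
    receive from their dependents.
    """
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for dep, head in enumerate(heads):
        if head < 0:
            continue  # the root has no incoming dependency arc
        if reverse:
            adj[head][dep] = 1.0
        else:
            adj[dep][head] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def bi_gcn_layer(h, heads, w_fwd, w_bwd):
    """One layer: separate weights per arc direction, outputs concatenated."""
    fwd = relu(matmul(matmul(build_adjacency(heads), h), w_fwd))
    bwd = relu(matmul(matmul(build_adjacency(heads, reverse=True), h), w_bwd))
    return [f + b for f, b in zip(fwd, bwd)]  # per-token concatenation


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy inventories; a real system would use a full tag set.
POS_TAGS = ["DET", "NOUN", "AUX", "ADJ"]
NER_TAGS = ["O", "ORG"]

random.seed(0)
tokens = ["The", "battery", "life", "is", "great"]  # made-up review sentence
heads = [2, 2, 4, 4, -1]                             # assumed dependency heads
pos = ["DET", "NOUN", "NOUN", "AUX", "ADJ"]
ner = ["O", "O", "O", "O", "O"]

word_dim, hidden = 4, 3
# Token features: stand-in word vector + one-hot POS tag + one-hot NER tag.
features = [
    [random.uniform(-1, 1) for _ in range(word_dim)]
    + one_hot(POS_TAGS.index(p), len(POS_TAGS))
    + one_hot(NER_TAGS.index(n), len(NER_TAGS))
    for p, n in zip(pos, ner)
]
in_dim = word_dim + len(POS_TAGS) + len(NER_TAGS)
w_fwd = rand_matrix(in_dim, hidden)
w_bwd = rand_matrix(in_dim, hidden)

out = bi_gcn_layer(features, heads, w_fwd, w_bwd)
print(len(out), len(out[0]))  # 5 tokens, 2 * hidden = 6 features each
```

Because the tag inventories and the dependency arcs are shared across languages in this sketch, the same weights could in principle be applied to a sentence in another language once it has been parsed and tagged, which is the intuition the abstract appeals to.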

Source journal

Scalable Computing-Practice and Experience COMPUTER SCIENCE, SOFTWARE ENGINEERING

CiteScore

2.00

Self-citation rate

0.00%

Articles published

Journal description: The area of scalable computing has matured and reached a point where new issues and trends require a professional forum. SCPE will provide this avenue by publishing original refereed papers that address the present as well as the future of parallel and distributed computing. The journal will focus on algorithm development, implementation and execution on real-world parallel architectures, and application of parallel and distributed computing to the solution of real-life problems.