Self-supervised Short-text Modeling through Auxiliary Context Generation

ACM Transactions on Intelligent Systems and Technology (TIST). Pub Date: 2022-04-12. DOI: 10.1145/3511712

Nurendra Choudhary, C. Aggarwal


Citations: 4

Abstract

Short text is ambiguous and often relies on its domain and surrounding context to attain semantic relevance. Existing classification models perform poorly on short text because of data sparsity and inadequate context. Auxiliary context, which can often provide sufficient background about the domain, is typically available in several application scenarios. While some existing works aim to leverage real-world knowledge to enhance short-text representations, they fail to place appropriate emphasis on the auxiliary context and therefore do not harness the full potential of the context available in auxiliary sources. To address this challenge, we reformulate short-text classification as a dual-channel self-supervised learning problem that leverages auxiliary context, with a generation network and a corresponding prediction model. We propose a self-supervised framework, Pseudo-Auxiliary Context generation network for Short-text Modeling (PACS), that comprehensively leverages auxiliary context and is learned jointly with a prediction network in an end-to-end manner. Our PACS model consists of two sub-networks: a Context Generation Network (CGN) that models the auxiliary context’s distribution and a Prediction Network (PN) that maps the short-text features and the auxiliary context distribution to the final class label. Our experimental results on diverse datasets demonstrate that PACS outperforms strong state-of-the-art baselines. We also demonstrate the performance of our model in cold-start scenarios, where contextual information is unavailable during prediction. Furthermore, we perform interpretability and ablation studies to analyze, respectively, the representational features captured by our model and the contribution of each module to the overall performance of PACS.
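The two-network layout described in the abstract can be pictured with a toy forward pass. The sketch below is not the authors' implementation: the class name `ToyPACS`, the single linear layers, the representation of auxiliary context as a probability vector, the cross-entropy losses, and the weighting parameter `alpha` are all assumptions made purely for illustration. It only shows how a generated context distribution might be concatenated with short-text features, and how a cold-start prediction could fall back on the generated context.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class ToyPACS:
    """Illustrative stand-in for the CGN + PN pair; not the published model."""

    def __init__(self, n_features, n_context, n_classes, seed=0):
        rng = random.Random(seed)
        # Assumed CGN: short-text features -> pseudo-context distribution.
        self.cgn = init_layer(rng, n_context, n_features)
        # Assumed PN: [features ; context distribution] -> class probabilities.
        self.pn = init_layer(rng, n_classes, n_features + n_context)

    def generate_context(self, text_features):
        return softmax(linear(*self.cgn, text_features))

    def predict(self, text_features, context=None):
        # Cold start (assumption): with no context supplied, use the generated one.
        if context is None:
            context = self.generate_context(text_features)
        return softmax(linear(*self.pn, list(text_features) + list(context)))


def cross_entropy(target, predicted, eps=1e-12):
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def joint_loss(model, text_features, observed_context, label, alpha=0.5):
    """Assumed joint objective: label loss plus weighted context-matching loss."""
    pseudo_context = model.generate_context(text_features)
    class_probs = model.predict(text_features, pseudo_context)
    label_loss = -math.log(class_probs[label] + 1e-12)
    context_loss = cross_entropy(observed_context, pseudo_context)
    return label_loss + alpha * context_loss


model = ToyPACS(n_features=4, n_context=3, n_classes=2)
features = [1.0, 0.0, 0.5, 0.2]
cold_start_probs = model.predict(features)  # no auxiliary context given
loss = joint_loss(model, features, [0.7, 0.2, 0.1], label=1)
```

In this toy setup, minimising `joint_loss` would push the generated context toward the observed auxiliary context while also fitting the class label, which is one plausible reading of "jointly learned in an end-to-end manner"; the paper's actual objective may differ.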



