Using Transformer-based Models for Taxonomy Enrichment and Sentence Classification

Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP) Pub Date : 1900-01-01 DOI:10.18653/v1/2022.finnlp-1.34

Parag Dakle, Shrikumar Patil, Sai Krishna Rallabandi, Chaitra V. Hegde, Preethi Raghavan


Citations: 1

Abstract



In this paper, we present a system that addresses the taxonomy enrichment problem for Environmental, Social and Governance (ESG) issues in the financial domain, and that classifies sentences as sustainable or unsustainable, for FinSim4-ESG, a shared task of the FinNLP workshop at IJCAI-2022. For taxonomy enrichment, we first created a derived dataset by using a sentence-BERT-based paraphrase detector (Reimers and Gurevych, 2019) on the train set to create positive and negative term-concept pairs. We then fine-tune the same paraphrase detector on this derived dataset and use it as the encoder, with a Logistic Regression classifier as the decoder, resulting in a test accuracy of 0.6 and an average rank of 1.97. For the sentence classification task, the best-performing classifier (accuracy: 0.92) consists of a pre-trained RoBERTa model (Liu et al., 2019a) as the encoder and a feed-forward neural network classifier as the decoder.
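To make the derived-dataset step and the two reported metrics more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each training term has one gold concept, that negatives are the non-gold concepts a similarity scorer rates highest, and that average rank is the mean 1-based position of the gold concept in a ranked prediction list. The function names, the toy `token_overlap` scorer (a stand-in for the sentence-BERT paraphrase detector), and the example terms and concepts are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def token_overlap(term: str, concept: str) -> float:
    """Toy stand-in scorer (assumption): Jaccard overlap of lowercased tokens.
    The abstract uses a sentence-BERT paraphrase detector instead."""
    a, b = set(term.lower().split()), set(concept.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def build_pairs(
    term_to_concept: Dict[str, str],
    concepts: List[str],
    score: Callable[[str, str], float] = token_overlap,
    negatives_per_term: int = 2,
) -> List[Tuple[str, str, int]]:
    """Return (term, concept, label) triples: label 1 for the gold concept,
    label 0 for sampled negative concepts."""
    pairs: List[Tuple[str, str, int]] = []
    for term, gold in term_to_concept.items():
        pairs.append((term, gold, 1))
        # Assumed negative-selection rule: keep the non-gold concepts
        # that the scorer considers most similar to the term.
        others = [c for c in concepts if c != gold]
        others.sort(key=lambda c: score(term, c), reverse=True)
        pairs.extend((term, c, 0) for c in others[:negatives_per_term])
    return pairs


def accuracy_and_mean_rank(
    ranked: Dict[str, List[str]], gold: Dict[str, str]
) -> Tuple[float, float]:
    """Accuracy: share of terms whose gold concept is ranked first.
    Mean rank: average 1-based position of the gold concept."""
    ranks = [ranked[term].index(gold[term]) + 1 for term in gold]
    accuracy = sum(r == 1 for r in ranks) / len(ranks)
    return accuracy, sum(ranks) / len(ranks)


if __name__ == "__main__":
    # Illustrative labels only, not taken from the FinSim4-ESG taxonomy.
    concepts = ["Carbon emissions", "Board diversity", "Water management"]
    train = {"carbon footprint": "Carbon emissions",
             "water stress": "Water management"}
    for triple in build_pairs(train, concepts, negatives_per_term=1):
        print(triple)
```

In a setup closer to the one the abstract describes, `score` would be replaced by the paraphrase detector's similarity, and the resulting labeled pairs would serve as fine-tuning data for the encoder.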
