Improving Scientific Literature Classification: A Parameter-Efficient Transformer-Based Approach

IF 0.8 Q4 ENGINEERING, ELECTRICAL & ELECTRONIC
Mohammad Munzir Ahanger, M. Arif Wani
{"title":"改进科学文献分类:基于参数高效变压器的方法","authors":"Mohammad Munzir Ahanger, M. Arif Wani","doi":"10.32985/ijeces.14.10.4","DOIUrl":null,"url":null,"abstract":"Transformer-based models have been utilized in natural language processing (NLP) for a wide variety of tasks like summarization, translation, and conversational agents. These models can capture long-term dependencies within the input, so they have significantly more representational capabilities than Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Nevertheless, these models require significant computational resources in terms of high memory usage, and extensive training time. In this paper, we propose a novel document categorization model, with improved parameter efficiency that encodes text using a single, lightweight, multiheaded attention encoder block. The model also uses a hybrid word and position embedding to represent input tokens. The proposed model is evaluated for the Scientific Literature Classification task (SLC) and is compared with state-of-the-art models that have previously been applied to the task. Ten datasets of varying sizes and class distributions have been employed in the experiments. The proposed model shows significant performance improvements, with a high level of efficiency in terms of parameter and computation resource requirements as compared to other transformer-based models, and outperforms previously used methods.","PeriodicalId":41912,"journal":{"name":"International Journal of Electrical and Computer Engineering Systems","volume":"3 4","pages":""},"PeriodicalIF":0.8000,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Improving Scientific Literature Classification: A Parameter-Efficient Transformer-Based Approach\",\"authors\":\"Mohammad Munzir Ahanger, M. Arif Wani\",\"doi\":\"10.32985/ijeces.14.10.4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Transformer-based models have been utilized in natural language processing (NLP) for a wide variety of tasks like summarization, translation, and conversational agents. These models can capture long-term dependencies within the input, so they have significantly more representational capabilities than Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Nevertheless, these models require significant computational resources in terms of high memory usage, and extensive training time. In this paper, we propose a novel document categorization model, with improved parameter efficiency that encodes text using a single, lightweight, multiheaded attention encoder block. The model also uses a hybrid word and position embedding to represent input tokens. The proposed model is evaluated for the Scientific Literature Classification task (SLC) and is compared with state-of-the-art models that have previously been applied to the task. Ten datasets of varying sizes and class distributions have been employed in the experiments. 
The proposed model shows significant performance improvements, with a high level of efficiency in terms of parameter and computation resource requirements as compared to other transformer-based models, and outperforms previously used methods.\",\"PeriodicalId\":41912,\"journal\":{\"name\":\"International Journal of Electrical and Computer Engineering Systems\",\"volume\":\"3 4\",\"pages\":\"\"},\"PeriodicalIF\":0.8000,\"publicationDate\":\"2023-12-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Electrical and Computer Engineering Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.32985/ijeces.14.10.4\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Electrical and Computer Engineering Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.32985/ijeces.14.10.4","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Transformer-based models have been utilized in natural language processing (NLP) for a wide variety of tasks such as summarization, translation, and conversational agents. These models can capture long-term dependencies within the input, so they have significantly more representational capability than Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Nevertheless, they require significant computational resources in terms of memory usage and training time. In this paper, we propose a novel, parameter-efficient document categorization model that encodes text using a single, lightweight, multi-headed attention encoder block. The model also uses a hybrid word and position embedding to represent input tokens. The proposed model is evaluated on the Scientific Literature Classification (SLC) task and compared with state-of-the-art models previously applied to it. Ten datasets of varying sizes and class distributions are employed in the experiments. The proposed model yields significant performance improvements and outperforms previously used methods, while requiring far fewer parameters and computational resources than other transformer-based models.
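
To make the described architecture concrete, below is a minimal PyTorch sketch of the kind of classifier the abstract outlines: a single multi-head attention encoder block over token representations formed by summing word and position embeddings. The abstract does not specify hyperparameters, the readout, or the exact block internals, so every size, name, and the mean-pooling choice here is an illustrative assumption, not the paper's design.

```python
# Minimal sketch of a single-encoder-block text classifier, assuming:
# learned word + position embeddings combined by addition, one standard
# encoder layer, mean pooling, and a linear classification head.
import torch
import torch.nn as nn

class SingleBlockClassifier(nn.Module):
    def __init__(self, vocab_size, max_len, d_model=128, n_heads=4,
                 ff_dim=256, n_classes=10, dropout=0.1):
        super().__init__()
        # "Hybrid" word and position embedding: each token vector is the
        # sum of a learned word embedding and a learned position embedding.
        self.word_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        # A single lightweight encoder block (multi-head self-attention
        # plus a small feed-forward sublayer) instead of a deep stack.
        self.encoder = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=ff_dim,
            dropout=dropout, batch_first=True)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, token_ids):  # token_ids: (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.word_emb(token_ids) + self.pos_emb(positions)
        x = self.encoder(x)        # (batch, seq_len, d_model)
        x = x.mean(dim=1)          # mean-pool over the token dimension
        return self.classifier(x)  # (batch, n_classes) logits

model = SingleBlockClassifier(vocab_size=30_000, max_len=512)
print(sum(p.numel() for p in model.parameters()))  # rough parameter count
```

With a single block, most of the parameter budget sits in the embedding table rather than in stacked attention layers, which illustrates why such a model can be far cheaper in parameters and training compute than a deep BERT-style encoder.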
Source Journal
CiteScore: 1.20
Self-citation rate: 11.80%
Articles published: 69
Journal overview: The International Journal of Electrical and Computer Engineering Systems publishes original research in the form of full papers, case studies, reviews, and surveys. It covers the theory and application of electrical and computer engineering, the synergy of computer systems and computational methods with electrical and electronic systems, and interdisciplinary research. Topics include: power systems; renewable electricity production; power electronics; electrical drives; industrial electronics; communication systems; advanced modulation techniques; RFID devices and systems; signal and data processing; image processing; multimedia systems; microelectronics; instrumentation and measurement; control systems; robotics; modeling and simulation; modern computer architectures; computer networks; embedded systems; high-performance computing; engineering education; parallel and distributed computer systems; human-computer systems; intelligent systems; multi-agent and holonic systems; real-time systems; software engineering; internet and web applications and systems; applications of computer systems in engineering and related disciplines; mathematical models of engineering systems; and engineering management.