Increased Performance of Transformers Language Models in Information Question and Response Systems

Proceedings of the Southwest State University. Pub Date: 2023-02-13. DOI: 10.21869/2223-1560-2022-26-2-159-171

D. T. Galeev, V. Panishchev, D. V. Titov

Abstract

Purpose of research. This work aims to increase the performance of question-answering information systems for Russian. Its scientific novelty lies in increasing the performance of the RuBERT model, which was trained to find the answer to a question in a text. Because a more efficient language model can process more requests in the same amount of time, the results can be used in question-answering systems for which response speed is important.

Methods. The work uses methods from natural language processing and machine learning, along with techniques for reducing the size of artificial neural networks. The language model was configured and trained using the Torch and Onnxruntime machine learning libraries. The original model and the training dataset were taken from the Huggingface Library.

Results. The performance of the RuBERT language model was increased by applying methods that reduce the size of neural networks, namely knowledge distillation and quantization, and by exporting the model to the ONNX format and running it in ONNX Runtime.

Conclusion. The model to which knowledge distillation, quantization, and ONNX optimization were applied simultaneously achieved a performance increase of ~4.6 times (from 66.57 to 404.46 requests per minute), while the model size decreased ~13 times (from 676.29 MB to 51.66 MB). The cost of this speedup was a deterioration in EM (exact match, from 61.3 to 56.87) and in F-measure (from 81.66 to 76.97).
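The trade-off reported in the conclusion can be recomputed from the quoted endpoints. The sketch below is only illustrative arithmetic: the dictionary keys and the `summarize` helper are hypothetical names, not part of the authors' code, and the input values are copied from the abstract. Dividing the endpoints gives a size reduction of about 13.1, in line with the stated ~13 times, but a throughput ratio of about 6.1 rather than the stated ~4.6. The abstract does not say how the ~4.6 figure was derived, so the two numbers are left unreconciled here.

```python
# Figures quoted in the abstract: baseline RuBERT vs. the model with
# knowledge distillation, quantization, and ONNX optimization applied.
# Key names are hypothetical and chosen only for this illustration.
BASELINE = {"requests_per_min": 66.57, "size_mb": 676.29, "em": 61.3, "f_measure": 81.66}
OPTIMIZED = {"requests_per_min": 404.46, "size_mb": 51.66, "em": 56.87, "f_measure": 76.97}


def summarize(baseline: dict, optimized: dict) -> dict:
    """Derive ratios and absolute quality drops from the two sets of figures."""
    return {
        # How many times more requests per minute the optimized model handles.
        "throughput_ratio": optimized["requests_per_min"] / baseline["requests_per_min"],
        # How many times smaller the optimized model is on disk.
        "size_reduction": baseline["size_mb"] / optimized["size_mb"],
        # Absolute loss in answer quality, in metric points.
        "em_drop": baseline["em"] - optimized["em"],
        "f_measure_drop": baseline["f_measure"] - optimized["f_measure"],
    }


summary = summarize(BASELINE, OPTIMIZED)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Read this way, the optimized model gives up roughly 4.4 EM points and 4.7 F-measure points in exchange for the speed and size gains.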
