SONGS CONTINUATION GENERATION TECHNOLOGY BASED ON TEXT GENERATION STRATEGIES, TEXT MINING AND LANGUAGE MODEL T5

O. Mediakov, V. Vysotska
Radio Electronics, Computer Science, Control
DOI: 10.15588/1607-3274-2023-4-15
Published: 2024-01-04 · Journal Article · Citations: 0

Abstract

Context. Pre-trained large language models are currently the driving force behind progress not only in NLP but in deep learning systems in general. Transformer models can address virtually every task class that currently exists, provided certain requirements and training practices are met. Words, sentences, and texts are, in turn, the basic and most important means of communication between intelligent beings: speech and text convey emotions, events, and more. One of the main ways of using language to describe experienced emotions is songs with lyrics. However, to preserve rhyme, rhythm, the length of verse lines, song structure, and so on, artists often have to repeat lines in the lyrics. In addition, the process of writing lyrics can be long.

Objective. The objective of the study is to develop an information technology for generating the continuation of song lyrics based on the T5 machine learning model, both with (SA, specific author) and without (NSA, non-specific author) consideration of the author's style.

Method. The choice of decoding strategy is important for the generation process. However, instead of favoring a particular strategy, the system supports multiple strategies, namely the following eight: contrastive search, top-p sampling, top-k sampling, multinomial sampling, beam search, diverse beam search, greedy search, and beam-search multinomial sampling.

Results. A machine learning model was developed to generate continuations of song lyrics using large language models, in particular T5, to accelerate, complement, and add flexibility to the songwriting process.

Conclusions. The created model shows excellent results in generating song-lyric continuations on test data. Analysis of the raw data showed that the NSA model degrades less, while the SA model needs a balanced amount of text per author.
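The eight decoding strategies listed under Method correspond to different keyword-argument combinations of the Hugging Face transformers `generate()` API. The mapping below is an illustrative sketch: the parameter values (e.g. `num_beams=5`, `top_p=0.92`) are examples, not the settings used in the study.

```python
# Hypothetical mapping of the eight decoding strategies onto keyword
# arguments accepted by transformers' model.generate(). Values are
# illustrative defaults, not the paper's configuration.
DECODING_STRATEGIES = {
    "greedy_search":           {"do_sample": False, "num_beams": 1},
    "contrastive_search":      {"do_sample": False, "penalty_alpha": 0.6, "top_k": 4},
    "multinomial_sampling":    {"do_sample": True, "num_beams": 1},
    "top_k_sampling":          {"do_sample": True, "top_k": 50},
    "top_p_sampling":          {"do_sample": True, "top_p": 0.92},
    "beam_search":             {"do_sample": False, "num_beams": 5},
    "diverse_beam_search":     {"do_sample": False, "num_beams": 5,
                                "num_beam_groups": 5, "diversity_penalty": 1.0},
    "beam_search_multinomial": {"do_sample": True, "num_beams": 5},
}

def generation_kwargs(strategy: str) -> dict:
    """Return a copy of the generate() keyword arguments for a named strategy."""
    return dict(DECODING_STRATEGIES[strategy])
```

In practice these dictionaries would be passed as `model.generate(input_ids, **generation_kwargs("beam_search"))`, which lets a single generation pipeline switch strategies without branching logic.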
Several text metrics, namely BLEU, ROUGE-L, and ROUGE-N, were calculated to quantitatively compare the results of the models and generation strategies. BLEU is the most variable metric: its value differs significantly across strategies. The ROUGE metrics show less variability and a smaller range of values. For comparison, the eight decoding methods for text generation supported by the transformers library were used. Across all text-comparison results, the metrically best methods of song-lyric generation are beam search and its variations, in particular beam-search multinomial sampling. Contrastive search usually outperformed the conventional greedy approach. The top-p and top-k methods are not clearly superior to each other and gave different results in different situations.
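For readers unfamiliar with the ROUGE-L metric used in the comparison: it scores a candidate against a reference by the length of their longest common subsequence (LCS). A minimal pure-Python sketch of the F1 variant follows; this is an illustration of the metric's definition, not the evaluation code used in the study.

```python
def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence of two token lists (DP table)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Because the LCS rewards in-order overlap rather than exact n-gram matches, ROUGE-L is less sensitive to local word reordering than BLEU, which is consistent with the smaller variance observed for the ROUGE metrics across decoding strategies.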