{"title":"Autoregressive Self-Evaluation: A Case Study of Music Generation Using Large Language Models","authors":"Berker Banar, S. Colton","doi":"10.1109/CAI54212.2023.00118","DOIUrl":null,"url":null,"abstract":"Autoregressive models have shown significant success in many tasks such as natural language generation and music composition. However, generic training mechanisms with off-the-shelf loss functions (e.g. cross-entropy), where not much attention is paid to the specifics of the task, do not necessarily guarantee success as different data modalities (e.g. text, visuals, music) exhibit different natures. In this study, we present a novel autoregressive self-evaluation framework to assess the performance of autoregressive models with both domain-agnostic and domain-specific metrics. We demonstrate this strategy with a case study of music generation using GPT-2 within a transfer learning paradigm. We contrast and compare the effects of fundamental parameters in autoregressive generation such as the temperature in sampling and the length of the generated sequence.","PeriodicalId":129324,"journal":{"name":"2023 IEEE Conference on Artificial Intelligence (CAI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE Conference on Artificial Intelligence (CAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CAI54212.2023.00118","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Autoregressive models have shown significant success in tasks such as natural language generation and music composition. However, generic training mechanisms with off-the-shelf loss functions (e.g., cross-entropy), which pay little attention to the specifics of the task, do not guarantee success, since different data modalities (e.g., text, visuals, music) differ in nature. In this study, we present a novel autoregressive self-evaluation framework that assesses the performance of autoregressive models with both domain-agnostic and domain-specific metrics. We demonstrate this strategy with a case study of music generation using GPT-2 within a transfer learning paradigm, and we compare and contrast the effects of fundamental parameters of autoregressive generation, such as the sampling temperature and the length of the generated sequence.
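
The two experimental knobs named in the abstract, sampling temperature and generated-sequence length, can be illustrated with a short sketch. The following is a minimal, hypothetical example (not the authors' implementation), assuming the Hugging Face transformers GPT-2 API and plain-text tokens in place of the paper's music representation; the mean negative log-likelihood stands in only for the domain-agnostic side of the self-evaluation metrics.

```python
# A minimal sketch (not the paper's released code) of sweeping the two
# generation parameters the abstract highlights: sampling temperature and
# generated-sequence length. The paper's music tokenization and
# domain-specific metrics are not reproduced here.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Hypothetical seed: in the paper this would be a tokenized musical prompt.
inputs = tokenizer("A short seed sequence", return_tensors="pt")

for temperature in (0.7, 1.0, 1.3):        # sampling-temperature sweep
    for max_new_tokens in (64, 256):       # sequence-length sweep
        output = model.generate(
            **inputs,
            do_sample=True,                # stochastic autoregressive sampling
            temperature=temperature,       # flattens/sharpens the next-token distribution
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
        # Domain-agnostic self-evaluation: score the model's own output by its
        # mean negative log-likelihood (cross-entropy over the generated tokens).
        with torch.no_grad():
            nll = model(output, labels=output).loss.item()
        print(f"T={temperature}, len={max_new_tokens}: NLL={nll:.3f}")
```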