Karaoke Generation from songs: recent trends and opportunities

2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). Pub Date: 2022-11-07. DOI: 10.23919/APSIPAASC55919.2022.9980133

Preet Patel, Ansh Ray, Khushboo Thakkar, Kahan Sheth, Sapan H. Mankad



Abstract

Music Information Retrieval is an important field with ample opportunities in the music industry. Currently, audio engineers have to create custom karaoke tracks for songs manually. Techniques for producing a high-quality karaoke track are not readily accessible to the public: generating karaoke requires Audacity or other specialised software. In this work, we review methods and approaches that yield a high-quality karaoke track through simple and quick separation of vocals from a song containing both vocal and instrumental components, without the need for any specific audio processing software. The techniques reviewed include Spleeter, Hybrid Demucs, D3Net, Open-Unmix, and Sams-Net. These approaches are based on current state-of-the-art machine learning and deep learning techniques. We believe this review will serve as a good resource for researchers working in this field.
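As a rough illustration of the final step in such a pipeline, the sketch below builds a karaoke track once a separation model has already produced per-source stems. It is not taken from the paper: the function name karaoke_from_stems, the stem names (vocals, drums, bass, other), and the toy sample values are all assumptions for illustration, and the separation model itself is not shown.

```python
from typing import Dict, List, Sequence


def karaoke_from_stems(
    stems: Dict[str, Sequence[float]], vocal_key: str = "vocals"
) -> List[float]:
    """Mix every non-vocal stem, sample by sample, into one backing track.

    `stems` maps a (hypothetical) stem name to its mono samples. A real
    separator would supply these; here they are plain Python sequences.
    """
    backing = [samples for name, samples in stems.items() if name != vocal_key]
    if not backing:
        raise ValueError("no instrumental stems to mix")
    length = len(backing[0])
    if any(len(samples) != length for samples in backing):
        raise ValueError("all stems must have the same number of samples")
    # zip(*backing) yields one tuple per time step, one value per stem.
    return [sum(frame) for frame in zip(*backing)]


# Toy four-sample stems standing in for a separator's output (assumed values).
stems = {
    "vocals": [0.5, -0.25, 0.0, 0.25],
    "drums": [0.25, 0.25, -0.25, 0.0],
    "bass": [0.0, 0.5, 0.25, -0.5],
    "other": [0.125, 0.0, 0.0, 0.25],
}
karaoke = karaoke_from_stems(stems)
print(karaoke)
```

Summing the non-vocal stems is only one option; subtracting the estimated vocal from the original mixture is another, and which works better depends on the separator's output quality.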

