An FPGA based parallel architecture for music melody matching

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI:10.1145/2435264.2435305

Hao Wang, Jyh-Charn S. Liu

Pages: 235-244

Citations: 1

Abstract

We propose an FPGA-based, high-performance parallel architecture for music retrieval through singing. The database consists of monophonic MIDI files, which are modeled as strings, and the user's sung query is modeled as a set of regular expressions (regexps) that account for possible key transpositions and tempo variations in order to tolerate imperfectly sung queries. An approximate regexp matching algorithm is developed to calculate the similarity between a regexp and a string, using edit distance as the metric. The algorithm supports sung queries that start anywhere in a database song, not necessarily at the beginning. Using the proposed formal models and algorithms, the similarity between the sung query and each song in the database is evaluated, and the 10 most similar results are reported. We designed the approximate regexp matching algorithm in such a way that all terms of the regexp can execute concurrently, which suits the massive parallelism provided by an FPGA. The FPGA-implemented melody matching engine (MME) is a parameterized modular architecture that can be reconfigured to implement different regexps simply by updating its parameter registers, which avoids time-consuming re-synthesis. The MME also includes on-board DDR2 memory that stores the database, so that songs can be read in and edit distances calculated locally on the board. In this way, each MME forms a self-contained system, and multiple MMEs can be clustered to increase parallel processing power with virtually no overhead. The MME is evaluated using the ThinkIT query corpus of 355 sung files and a database of 5563 MIDI files. It achieves a top-10 hit rate of 90.7% and a total runtime of 19.4 seconds, an average of 54.6 milliseconds per query. The MME achieves significant speedup over software-based systems while providing the same level of flexibility.
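The core idea of the abstract (edit-distance scoring of a query that may start anywhere in a song, followed by a top-10 ranking) can be illustrated with a small software sketch. This is not the authors' regexp algorithm or their hardware design: it matches plain strings rather than regexps, ignores tempo variation, and handles key transposition by comparing pitch intervals, which is an assumption made here for illustration. The function names `to_intervals`, `best_substring_distance`, and `top_k` are hypothetical.

```python
import heapq


def to_intervals(pitches):
    """Turn MIDI pitch numbers into successive semitone intervals.

    Assumption for this sketch: intervals make the comparison independent
    of the key in which the query was sung.
    """
    return [b - a for a, b in zip(pitches, pitches[1:])]


def best_substring_distance(query, song):
    """Minimum edit distance between `query` and any substring of `song`.

    Standard approximate substring matching: the first row of the dynamic
    programming table is all zeros, so a match may start at any position.
    """
    # prev[i] = cost of aligning query[:i] with a substring ending here
    prev = list(range(len(query) + 1))
    best = prev[-1]
    for s in song:
        cur = [0]  # starting a match at this song position is free
        for i, q in enumerate(query, start=1):
            cur.append(min(
                prev[i - 1] + (q != s),  # match or substitution
                prev[i] + 1,             # extra symbol in the song
                cur[i - 1] + 1,          # query symbol missing from the song
            ))
        best = min(best, cur[-1])
        prev = cur
    return best


def top_k(query_pitches, database, k=10):
    """Return the k (distance, name) pairs with the smallest distance."""
    q = to_intervals(query_pitches)
    scored = (
        (best_substring_distance(q, to_intervals(pitches)), name)
        for name, pitches in database.items()
    )
    return heapq.nsmallest(k, scored)


# Toy database of two monophonic melodies (hypothetical data).
database = {
    "scale": [60, 62, 64, 65, 67, 69, 71, 72],
    "arpeggio": [60, 64, 67, 72, 67, 64, 60],
}
# Query sung a whole tone higher than the opening of "scale".
ranking = top_k([62, 64, 66], database)
```

In this sketch each song is scored independently, which mirrors the abstract's point that matching work can be spread across parallel units; the inner loop over query positions is the part a hardware design could evaluate concurrently.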

