A VLSI hardware accelerator for dynamic time warping

Q4 Computer Science

模式识别与人工智能 · Pub Date: 1992-08-30 · DOI: 10.1109/ICPR.1992.202121

V. Sundaresan, S. Nichani, N. Ranganathan, R. Sankar

Volume: 23 1 · Pages: 27-30 · Publication type: Journal Article

Citations: 6

Abstract



This paper describes an area- and time-efficient systolic array architecture for the computations in Dynamic Time Warping (DTW). The special-purpose architecture performs band matrix multiplication to compute the local distance metric, which is based on Itakura's log likelihood distance. The time complexity of the algorithm is O(nk), where n and k are the numbers of elements in a row of the first and second input matrices, respectively. The number of processors equals the bandwidth w of the output band matrix. The speedup of the parallel algorithm over the sequential algorithm is wz, where z is the number of multiplier stages within a processing element (PE). The parallel algorithm can be implemented as a single VLSI chip.
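To make the role of the output band concrete, here is a minimal sequential sketch in Python. It is not the paper's systolic design: the function names `band_product` and `banded_dtw`, the `lower`/`upper` band parameters, and the choice of the classic three-neighbour DTW recurrence are all assumptions made for illustration. The Itakura distance itself is not implemented; the local-distance matrix is taken as given.

```python
import math


def band_product(a, b, lower, upper):
    """Compute C = A x B only inside the band -lower <= j - i <= upper.

    Entries outside the band are left at 0.0. The band has
    w = lower + upper + 1 diagonals (hypothetical parameter names).
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        # Visit only the columns of row i that fall inside the band.
        for j in range(max(0, i - lower), min(cols, i + upper + 1)):
            c[i][j] = sum(a[i][t] * b[t][j] for t in range(inner))
    return c


def banded_dtw(dist, lower, upper):
    """Accumulate a local-distance matrix with the classic DTW recurrence,
    restricted to the same band. Assumes the last cell lies inside the band.
    """
    rows, cols = len(dist), len(dist[0])
    inf = math.inf
    acc = [[inf] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(max(0, i - lower), min(cols, i + upper + 1)):
            if i == 0 and j == 0:
                best = 0.0
            else:
                # Cheapest predecessor: above, left, or diagonal.
                best = min(
                    acc[i - 1][j] if i > 0 else inf,
                    acc[i][j - 1] if j > 0 else inf,
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            acc[i][j] = dist[i][j] + best
    return acc[rows - 1][cols - 1]
```

The sequential loop visits each of the w output diagonals in turn. The abstract states that the number of processors equals w, so a parallel version would presumably spread that band across the array, but the mapping of diagonals to processors is not specified in the abstract and is not modelled here.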

