Orbital Algorithms and Unified Array Processor for Computing 2D Separable Transforms

S. Sedukhin, A. Zekri, T. Miyazaki
Published in: 2010 39th International Conference on Parallel Processing Workshops, 2010-09-13
DOI: 10.1109/ICPPW.2010.29 (https://doi.org/10.1109/ICPPW.2010.29)
Citations: 18

Abstract

The two-dimensional (2D) forward and inverse discrete Fourier transform (DFT), discrete cosine transform (DCT), discrete sine transform (DST), discrete Hartley transform (DHT), and discrete Walsh-Hadamard transform (DWHT) play a fundamental role in many practical applications. Because these transforms are separable, each of them can be uniquely defined as a triple matrix product with one matrix transposition. Based on a systematic approach to representing and scheduling different forms of the $n\times n$ matrix-matrix multiply-add (MMA) operation in 3D index space, we design new orbital, highly parallel and scalable algorithms and present an efficient $n\times n$ unified array processor for computing any $n\times n$ forward or inverse discrete separable transform in the minimal $2n$ time-steps. Unlike traditional 2D systolic array processing, all $n^2$ register-stored elements of the initial and intermediate matrices are processed simultaneously by all $n^2$ processing elements of the unified array processor at each time-step. Hence the proposed array processor is appropriate for applications with naturally arranged multidimensional data such as still images, video frames, and 2D data from a matrix sensor. Finally, we introduce a novel formulation and a highly parallel implementation of the frequently required matrix data alignment and manipulation, using MMA operations on the same array processor so that no additional circuitry is needed.
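
The separability property mentioned in the abstract can be illustrated with a small sketch. It is not the paper's orbital algorithm or its array processor; it is a plain sequential check, under our own assumptions, that a 2D DFT written as the triple product $Y = C X C^T$ agrees with the direct double-sum definition. The function names (`dft_matrix`, `matmul`, `separable_2d`, `direct_2d_dft`) and the unnormalized DFT convention are illustrative choices, not taken from the paper.

```python
import cmath


def dft_matrix(n):
    """Assumed convention: unnormalized DFT, C[k][j] = exp(-2*pi*i*k*j/n)."""
    return [[cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n)]
            for k in range(n)]


def matmul(a, b):
    """Plain matrix product; each term of the inner sum is one multiply-add."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def separable_2d(x, c):
    """Triple matrix product Y = C X C^T, computed as two successive products."""
    return matmul(matmul(c, x), transpose(c))


def direct_2d_dft(x):
    """Reference 2D DFT from the double-sum definition (O(n^4) operations)."""
    n = len(x)
    return [[sum(x[r][s] * cmath.exp(-2j * cmath.pi * (u * r + v * s) / n)
                 for r in range(n) for s in range(n))
             for v in range(n)]
            for u in range(n)]


if __name__ == "__main__":
    # Hypothetical 4 x 4 input, chosen only for illustration.
    x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    y = separable_2d(x, dft_matrix(4))
    ref = direct_2d_dft(x)
    err = max(abs(y[u][v] - ref[u][v]) for u in range(4) for v in range(4))
    print("max abs difference:", err)
```

Swapping `dft_matrix` for another coefficient matrix (for example a DCT or Walsh-Hadamard matrix) leaves `separable_2d` unchanged, which is the sense in which a single MMA-based structure can serve the whole family of separable transforms. The two successive products, each with an inner sum of length $n$, also give some intuition for the $2n$ time-step figure stated in the abstract, although the actual scheduling is specific to the paper.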