{"title":"A new compression ratio prediction algorithm for hardware implementations of LZW data compression","authors":"Alireza Yazdanpanah, M. Hashemi","doi":"10.1109/CADS.2010.5623592","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623592","url":null,"abstract":"As the demand for more data storage space continues to grow at an unprecedented pace, the need for real time data compression systems becomes more prominent. One of the well-known methods among lossless data compression algorithms is LZW. In this paper, a new prediction algorithm has been introduced that is able to predict whether or not a data block is compressible with the LZW method. Furthermore, the prediction algorithm provides a reasonably good estimation of the final compression ratio, hence helping the storage system decide early on whether or not to continue with the compression. Simulation results performed on several data compression corpuses indicate that the proposed prediction method is able to reduce run time by 17.79%, in average for a flash sector size of 8 KB. This is achieved at the expense of an average 1.59% decrease in compression performance. The difference between the predicted and actual compression ratio is 11.16% in average, in terms of mean absolute error. Considering the low computational complexity of the proposed method, it can be implemented with a relatively simple hardware hence making it suitable for real time, low-cost, and low-power hardware implementations.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134298078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Separating conflict misses to reduce miss-rate of caches through an unified replacement policy","authors":"Hamed Azimi, A. Vafaei","doi":"10.1109/CADS.2010.5623586","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623586","url":null,"abstract":"The limited memory space of caches are used in unbalanced way, which generate more conflict misses in some sets while other sets are underutilized. We propose a technique using the tag of address through Hash Functions to address the most underutilized sets in the cache to replace. Therefore, conflict misses in the cache can be resolved by distributing accesses along cache sets evenly. In comparison with the previous techniques that aim to balance cache, this technique doesn't need to consume much energy per access; and less than 2% extra energy will be consumed, in each miss.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134026597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Remaining-energy based routing protocol for wireless sensor network","authors":"Millad Ghane, Amir Rajabzadeh","doi":"10.1109/CADS.2010.5623532","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623532","url":null,"abstract":"Energy consumption in sensor network is one of the most important goals in designing a routing protocol. An energy efficient routing protocol will decrease number of transmitted and received packets, because main consumption part in routing is using antennas and signals.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123637080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring a low-cost inter-layer communication scheme for 3D networks-on-chip","authors":"A. Rahmani, P. Liljeberg, J. Plosila, H. Tenhunen","doi":"10.1109/CADS.2010.5623588","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623588","url":null,"abstract":"In this paper, a low-cost 3D NoC architecture based on Bidirectional Bisynchronous Vertical Channels (BBVC) is proposed as a solution to mitigate high area footprints of vertical interconnects. Dynamically self-configurable BBVCs, which can transmit flits in either direction, enable a system to benefit from a high-speed bidirectional channel instead of a pair of unidirectional channels for inter-layer communication. In this architecture, low-latency attribute of the interconnect TSVs enables the system to support a higher frequency for vertical channels, better bandwidth utilization, lower area footprint, and improved routability. In addition, an enhanced BBVC-based communication scheme, called Direct Vertical Channel Access, is presented to enable an express inter-layer communication. Experimental results verify that the proposed architecture can reduce up to 47% TSV area footprint with a negligible performance degradation.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117304251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CCDA: Correcting control-flow and data errors automatically","authors":"M. Maghsoudloo, N. Khoshavi, H. Zarandi","doi":"10.1109/CADS.2010.5623537","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623537","url":null,"abstract":"This paper presents an efficient software technique to detect and correct control-flow errors through addition of redundant codes in a given program. The key innovation performed in the proposed technique is detection and correction of the control-flow errors using both control-flow graph and data-flow graph. Using this technique, most of control-flow errors in the program are detected first, and next corrected, automatically; so, both errors in the control-flow and program data which is caused by control-flow errors can be corrected. In order to evaluate the proposed technique, a post compiler is used, so that the technique can be applied to every 80×86 binaries, transparently. Three benchmarks quick sort, matrix multiplication and linked list are used, and a total of 5000 transient faults are injected on several executable points in each program. The experimental results demonstrate that at least 93% of the control-flow errors can be detected and corrected by the proposed technique automatically without any data error generation. Moreover, the performance and memory overheads of the technique are noticeably less than traditional techniques.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125595890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OTRU: A non-associative and high speed public key cryptosystem","authors":"Ehsan Malekian, A. Zakerolhosseini","doi":"10.1109/CADS.2010.5623536","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623536","url":null,"abstract":"In this paper, we propose OTRU, a high speed probabilistic multi-dimensional public key cryptosystem that encrypts eight data vectors in each encryption round. The underlying algebraic structure of the proposed scheme is the power-associative and alternative octonions algebra which can be defined over any Dedekind domain such as convolution polynomial ring.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126583333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance evaluation of unicast and multicast communication in three-dimensional mesh architectures","authors":"M. Ebrahimi, M. Daneshtalab, P. Liljeberg, H. Tenhunen","doi":"10.1109/CADS.2010.5623591","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623591","url":null,"abstract":"As the multicast communication is utilized commonly in various parallel applications, the performance can be significantly improved by supporting multicast operations at the hardware level. In this paper, we define several factors of efficiency for unicast/multicast communication such as average of unicast latency, average of maximum multicast latency and level of parallelism in 3D mesh NoCs. Then, we propose analytical models for measuring the efficiency factors of a method in unicast/multicast communication called vertical block partitioning.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125145282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimization and evaluation of the reconfigurable Grid Alu Processor","authors":"Basher Shehan, Ralf Jahr, S. Uhrig, T. Ungerer","doi":"10.1109/CADS.2010.5623647","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623647","url":null,"abstract":"Currently few architectural approaches propose new paths to raise the performance of conventional sequential instruction streams in the time of the billions transistor era. Many application programs could profit from processors that are able to speed up the execution of sequential applications beyond the performance of current superscalar processors. The Grid Alu Processor (GAP) is a runtime reconfigurable processor designed for the acceleration of a conventional sequential instruction stream without the need of recompilation. It comprises a superscalar processor front-end, a configuration unit, and an array of reconfigurable functional units (FUs), which is fully integrated into the pipeline. The configuration unit maps data dependent and independent instructions simultaneously at runtime into the array of FUs. This paper evaluates the GAP architecture and optimizes the hardware, the number of FUs, and the configuration layers implemented in the array. The simulations show a significant speed up for sequential applications on GAP in comparison to an out-of-order superscalar simulator (SimpleScalar). GAP outperforms SimpleScalar in average by about 50% on the basic architecture and about 100% with an extended version including configuration layers.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127025382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the design of new low-power CMOS standard ternary logic gates","authors":"Akbar Doostaregan, M. H. Moaiyeri, K. Navi, O. Hashemipour","doi":"10.1109/CADS.2010.5623544","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623544","url":null,"abstract":"A novel low-power and high-performance Standard Ternary Inverter (STI) for CMOS technology is proposed in this paper. This inverter could be used as a fundamental block for designing other ternary basic logic gates. This circuit consists of only MOS transistors and capacitors without any area consuming resistors in its structure. Another great advantage of this design in comparison with the other designs, introduced before, is the elimination of the static power dissipation, which is very important in nano scale CMOS and leads to less power consumption. The proposed design has been simulated, using Synopsys HSPICE tool with 90nm CMOS technology. The simulation results demonstrate the superiority of the presented design with respect to other conventional designs in terms of power consumption and performance.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133758698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic voltage scaling for fully asynchronous NoCs using FIFO threshold levels","authors":"A. Rahimi, M. Salehi, S. Mohammadi, S. M. Fakhraie","doi":"10.1109/CADS.2010.5623526","DOIUrl":"https://doi.org/10.1109/CADS.2010.5623526","url":null,"abstract":"In this paper, we propose a dynamic voltage scaling (DVS) policy for a fully asynchronous NoC suitable for low-power yet high-performance architectures. The DVS policy is a FIFO-adaptive DVS, which uses two FIFO threshold levels for decision. It judiciously adjusts switch voltage among only three voltage modes. The introduced architecture is simulated in 90nm CMOS technology with accurate Spice simulations. Experimental results show that the FIFO-adaptive DVS not only lowers the implementation cost, but also achieves another 31% energy-delay saving compared to the DVS policy based on link utilization, in a 90% saturated network.","PeriodicalId":145317,"journal":{"name":"2010 15th CSI International Symposium on Computer Architecture and Digital Systems","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114068087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}