IEEE Transactions on Multi-Scale Computing Systems: Latest Articles, Page 4

Body Bias Control for Renewable Energy Source with a High Inner Resistance

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-04-17 DOI: 10.1109/TMSCS.2018.2827980

Keita Azegami;Hayate Okuhara;Hideharu Amano

Citations: 1

An Adjacent-Line-Merging Writeback Scheme for STT-RAM-Based Last-Level Caches

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-04-17 DOI: 10.1109/TMSCS.2018.2827955

Masayuki Sato;Yoshiki Shoji;Zentaro Sakai;Ryusuke Egawa;Hiroaki Kobayashi

Abstract: Spin-Transfer Torque RAM (STT-RAM) has attracted attention as a key element for the Last-Level Cache (LLC) of a future microprocessor. Since STT-RAM has a higher density than SRAM and non-volatility, STT-RAM can contribute to building the cache memory with a larger capacity and a less static energy. However, since STT-RAM changes its magnetization state in the case when storing data, the energy cost of write access requests for an STT-RAM LLC is more expensive than that of an SRAM LLC. As a result, the total energy consumption of the STT-RAM LLC for write-intensive applications may increase. To solve this problem, this paper proposes an Adjacent-Line-Merging Writeback Scheme. Since a larger cache line of an STT-RAM cache can contribute to the reduction in the write energy cost per byte, the upper-level cache merges two adjacent small lines to one large line, and then writes the merged line back to the STT-RAM LLC. Moreover, the larger line size for the LLC leads to a reduction in the static energy cost. The evaluation results show that the proposed scheme can reduce the energy consumption of the STT-RAM LLC by up to 26, and 9.3 percent on average.

Volume 4, Issue 4, pp. 593-604

Citations: 1

A 120 fps High Frame Rate Real-time HEVC Video Encoder with Parallel Configuration Scalable to 4K

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-04-10 DOI: 10.1109/TMSCS.2018.2825320

Yuya Omori;Takayuki Onishi;Hiroe Iwasaki;Atsushi Shimizu

Abstract: This paper describes a new 120 fps (frames per second) real-time HEVC (High Efficiency Video Coding) encoder for HFR (high frame rate) video encoding and transmission. HFR provides more immersive viewing experience features by solving the problems created by fast moving scenes. Temporally scalable encoding with backward compatibility for legacy non-HFR systems is suitable for the rapid spread of HFR content delivery, avoiding the need to distribute multiple bitstreams of the same video with different frame rates. Such temporal scalability requires flexible encoder control functionalities to support newly-customized reference picture structures and dual-stream bitrate control. In this paper, modification in the customizable software architecture of encoder LSIs makes it possible to achieve 120 fps temporally scalable HEVC encoding for existing 60 fps-based systems. The encoder also achieves 4K/120 fps video encoding in real time through the synchronized operation of multiple 2K/120 fps encoders working in parallel. Our evaluations show that the bitrate increase rate from 60 fps to 120 fps under the same objective image quality condition are within the range of less than 57.2 percent in all video sequences and its average value is 53.8 percent. Both values are lower than that of the HM (HEVC reference software). The proposed encoder systems will open the door to the next generation high frame rate UHDTV (ultra high definition television) services.

Volume 4, Issue 4, pp. 491-499

Citations: 3

Module Placement under Completion-Time Uncertainty in Micro-Electrode-Dot-Array Digital Microfluidic Biochips

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-04-04 DOI: 10.1109/TMSCS.2018.2822799

Wen-Chun Chung;Pei-Yi Cheng;Zipeng Li;Tsung-Yi Ho

Abstract: Digital microfluidic biochips (DMFBs) are an emerging technology that are replacing traditional laboratory procedures. With the integrated functions which are necessary for biochemical experiments, DMFBs are able to achieve automatic experiments. Recently, DMFBs based on a new architecture called micro-electrode-dot-array (MEDA) have been demonstrated. Compared with conventional DMFBs which sensors are specifically located, each microelectrode is integrated with a sensor on MEDA-based biochips. Benefiting from the advantage of MEDA-based biochips, real-time reaction-outcome detection is attainable. However, to the best of our knowledge, synthesis algorithms proposed in the literature for MEDA-based biochips do not fully utilize the real-time detection since completion-time uncertainties have not yet been considered. During the execution of a biochemical experiment, operations may finish earlier or delay due to variability and randomness in biochemical reactions. Such uncertainties also have effects when allocating modules for each fluidic operation and placing them on a biochip since a biochip with a fixed size area restricts the number and the size of these modules. Thus, in this paper, we proposed the first operation-variation-aware placement algorithm that fully utilizes the real-time detection since completion-time uncertainties have been considered. Simulation results demonstrate that with the proposed approach, it leads to reduced time-to-result and minimizes the chip size while not exceeding completion time compared to the benchmarks.

Volume 4, Issue 4, pp. 811-821

Citations: 11

A Hierarchical Inference Model for Internet-of-Things

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-03-30 DOI: 10.1109/TMSCS.2018.2821154

Hongxu Yin;Zeyu Wang;Niraj K. Jha

Abstract: Internet-of-Things (IoT) has connected billions of devices to the Internet. These devices are already collecting zettabytes ($10^{21}$) of data. However, the current IoT framework suffers from limited sensor energy, communication bandwidth, and server storage. These limitations impede the ability to send all the sensor data to the server all the time. Compact smart sensors provide a way to address this challenge. As opposed to the conventional sense-and-transmit sensors, emerging smart sensors can collect data, extract features, derive local inferences, and transmit only inference outcomes and possibly some raw data associated with rare events instead of all the raw data. This can dramatically cut down on the amount of sensor data transmitted, and hence its communication energy and network traffic. However, edge or server inference models trained with conventional machine learning approaches do not account for the fact that the smart sensors in the system have already performed a local inference. These approaches need all the sensor data and hence only cater to the traditional sense-and-transmit paradigm. This undoes the energy benefits brought about by smart sensors. In this paper, we propose a hierarchical inference model for IoT applications based on hierarchical learning and local inferences. Our model is able to take advantage of inference already performed on smart sensors, while at the same time accommodating conventional sense-and-transmit sensors in the IoT system. It also generalizes sensor-level inference to inference at other edge nodes by exploiting the intrinsically sensor/edge-grouped IoT data structure. We train classifiers hierarchically, aligned with the sensor-edge-server IoT paradigm. We verify our approach with seven IoT applications, demonstrating that the model is accurate, efficient, and generally applicable. We derive four edge-level inference models and four server-level inference models for these applications. For the four edge-level inference models, we reduce the number of bits transmitted from the sensor by $3.2\times$-$42.7\times$ while at the same time also improving the classification accuracy by 0.3-6.7 percent. For the four server-level inference models, we reduce the number of edge-to-server bits transmitted by $17\times$-$60\times$, with classification accuracy change in the $-0.4$ to $+0.1$ percent range.

Volume 4, Issue 3, pp. 260-271

Citations: 21

Design, Evaluation and Application of Approximate High-Radix Dividers

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-03-22 DOI: 10.1109/TMSCS.2018.2817608

Linbin Chen;Jie Han;Weiqiang Liu;Paolo Montuschi;Fabrizio Lombardi

Abstract: Approximate high radix dividers (HR-AXDs) are proposed and investigated in this paper. High-radix division is reviewed and inexact computing is introduced at different levels. Design parameters such as number of bits (N) and radix (r) are considered in the analysis; the replacement of exact cells with inexact cells in a binary signed-digit adder is introduced by utilizing different replacement schemes. Cell truncation and error compensation are also proposed to further extend inexact computation. Circuit-level performance and the error characteristics of the inexact high radix dividers are analyzed for the proposed designs. The combined assessment of the normal error distance, power dissipation, and delay is investigated and applications of approximate high-radix dividers are treated in detail. The simulation results show that the proposed approximate dividers offer extensive saving in terms of power dissipation, circuit complexity, and delay, while only incurring in a small degradation in accuracy thus making them possibly suitable and interesting to some applications and domains such as low power/mobile computing.

Volume 4, Issue 3, pp. 299-312

Citations: 23

Guest Editorial: Special Issue on Accelerated Computing

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-03-20 DOI: 10.1109/TMSCS.2018.2807058

Aviral Shrivastava;Fadi J. Kurdahi

Citations: 0

2017 Index IEEE Transactions on Multi-Scale Computing Systems Vol. 3

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-03-20 DOI: 10.1109/TMSCS.2017.2788365

Citations: 0

2018 Reviewers List

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-03-20 DOI: 10.1109/TMSCS.2018.2810358

Citations: 0

Exploring a SOT-MRAM Based In-Memory Computing for Data Processing

IEEE Transactions on Multi-Scale Computing Systems Pub Date : 2018-03-17 DOI: 10.1109/TMSCS.2018.2836967

Zhezhi He;Yang Zhang;Shaahin Angizi;Boqing Gong;Deliang Fan

Abstract: In this paper, we propose a Spin-Orbit Torque Magnetic Random-Access Memory (SOT-MRAM) array design that can simultaneously work as non-volatile memory and implement a reconfigurable in-memory logic operation without add-on logic circuits. The computed output can be simply read out like a typical MRAM bit-cell through the modified peripheral circuit. Such intrinsic in-memory computation can be used to process data locally and transfer the “cooked” data to the primary processing unit (i.e., CPU or GPU) for complex computations with high precision requirement. It greatly reduces the power-hungry and long-distance data communication, and further leads to extreme parallel computation within memory. In this work, we further propose an in-memory edge extraction algorithm as a case study to demonstrate the efficiency of the in-memory pre-processing methodology. The simulation results show that our edge extraction method reduces data communication as much as 8x for grayscale image, thus greatly reducing system energy consumption. Meanwhile, the F-measure result shows only $\sim$10 percent degradation compared to conventional edge detection operator, such as Prewitt, Sobel, and Roberts. Moreover, the edges extracted from the memory show comparable good quality with Canny edges in the context of edge-based motion detection and cross-modality object recognition.

Volume 4, Issue 4, pp. 676-685

Citations: 23