Mohammadmehdi Shammasi, M. Baharloo, Meisam Abdollahi, A. Baniasadi
{"title":"Turn-aware Application Mapping using Reinforcement Learning in Power Gating-enabled Network on Chip","authors":"Mohammadmehdi Shammasi, M. Baharloo, Meisam Abdollahi, A. Baniasadi","doi":"10.1109/MCSoC57363.2022.00061","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00061","url":null,"abstract":"As the backbone for many-core chips, Network-on-chips (NoCs) consume a significant share of total chip power. As a result, decreasing the power consumption in these components can reduce the total chip's power significantly. NoC's routers can be powered down using power-gating, a promising technique for reducing static power consumption. In some advanced methods, routers are put in sleep mode and only wake up when they are needed to turn/inject packets. Since waking up the router takes several cycles to complete, packets will experience high latency. In this regard, application mapping significantly impacts the number of turns. This article proposes a reinforcement learning (RL) framework based on Actor-Critic architecture to optimize the application mapping problem to minimize the number of turn packets as well as communication cost. Our RL framework learns the heuristic of the mapping problem and outputs a near-optimal mapping. A 2-opt local search algorithm fine-tunes this strategy and provides an improved mapping. 
Our simulations show that the proposed RL framework can achieve better cost and algorithm run-time performance compared to other heuristic algorithms such as Simulated Annealing (SA) and Genetic Algorithm (GA).","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133171099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalability of Post-Silicon Test Generation for Multi-core RISC-V SOC Validation","authors":"Sih Pin Tan, Yung It Ho","doi":"10.1109/MCSoC57363.2022.00012","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00012","url":null,"abstract":"Instruction Stream Generators (ISG) are an important tool in post-silicon validation of modern CPUs, including RISC-V CPU designs. ISG test generation needs to accurately model the target instruction set and is a compute-intensive process typically performed on off-platform generator machines. With the rising CPU core count and ISA complexity on modern SOC designs, ISG requires a corresponding increase in compute capacity with every generation. The trendline shows that simply throwing more compute at the problem is untenable from a cost perspective. This paper studies several alternative approaches to address this scalability problem. First, it is shown that test generation throughput increases in a nonlinear fashion to an increase in target core count and test length. A correlation curve can be plotted by characterizing test generation throughput against target configurations to identify a sweet spot for test footprint. The study then explores how selective replacement of randomized instruction sequences with fixed routines can improve test generation performance, with careful consideration needed to mitigate any potential loss of validation quality. 
It then discusses how such test optimizations can even be strategized to improve validation coverage.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132596173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Raman Maurya, T. Teo, Shi Hui Chua, Hwang-Cherng Chow, I-Chyn Wey
{"title":"Complex Human Activities Recognition Based on High Performance 1D CNN Model","authors":"Raman Maurya, T. Teo, Shi Hui Chua, Hwang-Cherng Chow, I-Chyn Wey","doi":"10.1109/MCSoC57363.2022.00059","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00059","url":null,"abstract":"Human activity recognition (HAR) is an emerging scientific research field that has wide area of applications in different fields such as healthcare, social-sciences and human-computer interaction etc. In many cases, humans perform very complex physical activities that needs to be tracked in order to improve well-being, quality of life and health. In this study, a method for complex HAR based on One dimensional (1D) CNN model using tri-axis accelerometer sensor data was proposed. The sensor data was collected from a smartwatch for three complex human activities which are studying, playing games and mobile scrolling. 1D CNNs provides high accuracy as well as less computational complexity in performing HAR. The proposed 1D CNN model was trained and optimized on a self-prepared dataset in Python. The adapted model provides an accuracy of 98.28 %. A preliminary study shows that the proposed model could effectively recognize the intended activities as a baseline for extending future work in the HAR area.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133800435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. K. Maurya, Anubhav Shivhare, Aadil Ali, Satakshi, Ashutosh Mishra, Manish Kumar
{"title":"Cluster Based Smart Random Walk for Data Aggregation in Wireless Sensor Network","authors":"M. K. Maurya, Anubhav Shivhare, Aadil Ali, Satakshi, Ashutosh Mishra, Manish Kumar","doi":"10.1109/MCSoC57363.2022.00025","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00025","url":null,"abstract":"Wireless Sensor Network (WSN) is a collection of sensor devices having limited communication range and battery power. It has unreliable wireless medium for data transmission hence reliable message delivery is a challenging task for routing protocols designed for WSN. In multiple scenarios a Random Walk (RW) in WSNs is proved to be energy efficient for routing and load balancing. Further clustering methodologies provide a novel idea of multi-hop data gathering and aggregation which significantly reduces redundant data transmission. The present work explores the integration of RW and clustering schemes with an objective to improve upon existing routing protocols in terms reliable data transmission and network lifetime. Hence, a hybrid Cluster-Based Smart Random Walk (CBSRW) routing technique for data collection and aggregation is proposed in this paper. It has two phases: Clustering & data aggregation and CBSRW. 
Experimental results and comparative analysis show that the proposed work achieves an enhanced lifetime and efficient packet delivery ratio compared to the conventional state of the art schemes designed for the domain.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115754784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Md. Faizul Ibne Amin, Md. Mostafizer Rahman, Y. Watanobe, Muepu Mukendi Daniel
{"title":"Impact of Programming Language Skills in Programming Learning","authors":"Md. Faizul Ibne Amin, Md. Mostafizer Rahman, Y. Watanobe, Muepu Mukendi Daniel","doi":"10.1109/MCSoC57363.2022.00050","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00050","url":null,"abstract":"In this modern era of the internet and information technology, a mentionable amount of data is generated from different sources consistently which refers to big data. This huge amount of data not only draws great attention for further research but also helps to extract different knowledge and information in various areas. The Information and Communication Technology (ICT) area is not apart from that, as the huge amount of data in this area is enhancing opportunities for further research and development. In the ICT, most of the courses especially programming-related courses are designed to improve practical skills. With the increasing demand for software engineering and other related fields, programming education or learning plays a vital role. However, in programming learning, the impact of programming language is also important to enrich the programming and technical skill. This paper aims to analyze the impact of programming language skills in programming learning by collecting real-world data from a programming course. In this paper, we used a dataset from submission logs of a programming course in an Online Judge (OJ) system. We selected the users randomly and considered single and multiple languages used for acceptance. Finally, we have presented the analysis of the overall acceptance rate, single and multiple languages used acceptance rate, and compared them. 
Moreover, the analytical result of this paper can help students and programmers as well as the improvement programming learning.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114640144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Charge-Digital Hybrid Compute-In-Memory Macro with full precision 8-bit Multiply-Accumulation for Edge Computing Devices","authors":"Jinwu Chen, Tianzhu Xiong, Xin Si","doi":"10.1109/MCSoC57363.2022.00033","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00033","url":null,"abstract":"Compute-in-memory (CIM) is emerging as a new computing architecture to overcome the high energy consumption of edge-side AI and IoT devices. When performing high-precision neural network calculations, analog CIM and digital CIM have their own advantages and disadvantages. In this paper, we combine the advantages of high energy efficiency of analog CIM and high accuracy of digital CIM to propose a charge-digital hybrid CIM (CDH-CIM) macro. By placing the high bits in the digital domain and the low bits in the charge domain, the multiply-accumulation (MAC) operation of 8b input activations (IAs) and 8b weights is achieved with no precision loss. The proposed CDH-CIM macro is designed using 22nm FDSOI CMOS process. Simulation shows that the macro achieves 6.98~11.0 TOPS/W at 0.8V and 71.92% inference accuracy when performing CIFAR-100 dataset.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131051259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhenbin Lv, X. Wang, Haifeng Zhi, Bo Lin, Yitao Shen, Yanyan Wang, Chenxu Wang
{"title":"Design and implementation of vehicle oil online information monitoring system","authors":"Zhenbin Lv, X. Wang, Haifeng Zhi, Bo Lin, Yitao Shen, Yanyan Wang, Chenxu Wang","doi":"10.1109/MCSoC57363.2022.00032","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00032","url":null,"abstract":"With the popularization and application of vehicles, the safety of automobiles has gradually attracted widespread attention, among which the quality of oil is an important factor affecting vehicle safety. Nowadays, the monitoring and evaluation of oil is divided into offline and online ways. This paper is aimed at the application of obtaining and analyzing oil quality in real time for vehicles. And based on the online monitoring method, this paper designed and implemented a vehicle oil information monitoring system. In the system, hardware acquisition part consists of oil sensors and self-made STM32 board, and host computer monitoring software as an analysis display part. After experimental testing, the system can conduct online monitoring, early warning and evaluation of the dielectric constant, viscosity, water content, water activity, density and temperature indicators of oil, and can predict the soot content and diesel content with the given model, and can calculate the $100^{\\circ}\\mathrm{C}$ kinematic viscosity using the least squares fitting algorithm. The accuracy and measurement range of the system depend on the indicators of the oil sensors. 
This vehicle oil information monitoring system is conducive to collecting the actual parameters of vehicle oil operation and providing a reliable basis for vehicle safety maintenance.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122971975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cognitive Bus Coding Scheme for Inter-Chip Communications of Deep Learning Accelerator Chiplet on Low-cost Si and Glass Interposer","authors":"Yu-Hong Chang, Tourangbam Harishore Singh, Po-Tsang Huang","doi":"10.1109/MCSoC57363.2022.00044","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00044","url":null,"abstract":"In the present Artificial Intelligence (AI) hardware research, interposer based multi-chip Deep Learning Accelerator (DLA) system is one of the main technology. Silicon (Si) interposer is the main key in the emerging 2.5D integration process. However, signal integrity is limited by the capacitive crosstalk and signal reflection can lead to notch attack in some frequency bands. In this paper, two new bus coding schemes are proposed to improve signal integrity, reducing the crosstalk to increase bandwidth for on-silicon-interposer and on-glass-interposer inter-chip data communications. For silicon interposer, a joint code division multiple access and crosstalk avoidance coding (Joint CDMA/CAC) scheme is proposed to reduce the capacitive crosstalk effect for fine-pitch interconnects. The eye diagram and bit error rate are both improved, and the average crosstalk effect is reduced by half. Also, a cognitive bus coding scheme is proposed by spread spectrum and channel learning for glass interposer. 
The proposed cognitive bus coding increases the total data bandwidth under frequency notches based on the channel condition for modulation.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120995187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Buffer Allocation for Exposed Datapath Architectures","authors":"Anoop Bhagyanath, K. Schneider","doi":"10.1109/MCSoC57363.2022.00013","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00013","url":null,"abstract":"Concurrent access to a given number of registers limits the instruction-level parallelism (ILP) used by conventional processors despite the use of many processing units (PUs). Many recent architectures expose their internal datapaths to compilers, allowing the compiler to move intermediate values from program execution directly between PUs, thus bypassing the use of registers. Buffered exposed datapath (BED) architectures additionally implement these inter-PU communication paths with scalable first-in-first-out (FIFO) buffers to avoid the use of registers and to prevent unnecessary synchronization between PUs. However, the BED compiler must ensure that the creation order of intermediate values in a buffer matches their consumption order so that the next executing instructions always find their operands at the heads of the corresponding buffers. In this paper, we present a novel buffer interference analysis that determines a criterion for allocating multiple program variables to the same buffer based on a given instruction schedule that specifies an access order for those variables. We then use the well-known dataflow analysis framework to compute a buffer interference graph whose coloring yields a valid buffer allocation for programs by considering the instructions in the given order. Preliminary experimental results show the effectiveness of our code generation approach compared to traditional register-based compilation. 
More importantly, the buffer interference graph should serve as the basis for future buffer allocation schemes that maximize ILP usage.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121658071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey of main dataflow MoCCs for CPS design and verification","authors":"Guillaume Roumage, S. Azaiez, Stéphane Louise","doi":"10.1109/MCSoC57363.2022.00010","DOIUrl":"https://doi.org/10.1109/MCSoC57363.2022.00010","url":null,"abstract":"The automotive industry has recently emphasized reducing the number of Electronic Control Units (ECUs) installed in vehicles for economic and ecological reasons. This reduction means that the design and verification must be independent of the vehicle's final choice of (MC)SoCs, knowing they will evolve as time passes. To that end, dataflow Models of Computation and Communication (MoCCs) are powerful tools for maintaining this independence. A subclass of dataflow MoCCs -deterministic dataflow MoCCs- is of particular interest since it allows designers to derive safety and security properties at compile-time. This work proposes a short survey of the existing deterministic dataflow MoCCs. We describe the properties of each dataflow MoCC and present an expressiveness hierarchy of dataflow MoCCs adjustable to designers' needs.","PeriodicalId":150801,"journal":{"name":"2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131647050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}