Latest papers from the 2016 Fourth International Symposium on Computing and Networking (CANDAR)

A Cost and Performance Analytical Model for Large-Scale On-Chip Interconnection Networks
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0083
Takanori Kurihara, Yamin Li
Abstract: As an interconnection topology, the two-dimensional mesh is widely used in network-on-chip (NoC) designs that integrate dozens of cores on a VLSI chip because of its very simple structure and ease of on-chip implementation. However, with the progress of IC technology, it has become possible to integrate a large-scale system on a chip that contains more than one thousand processing elements or cores. In such a case, the mesh topology degrades performance because communication time among cores increases. This paper investigates topologies and IC layout schemes of mesh, torus, hypercube, and metacube for achieving good cost-performance tradeoffs. We propose an analytical model for evaluating the cost-performance ratio by considering the NoC's topology and layout. The model is parameterized with node degree, graph diameter, the number of routers, the router complexity, the bandwidth of the router connections, the number of processing cores, the total length of links, and the cost ratios of the link section and the router section. This model helps find the optimal topology and layout for an NoC with a given network size. It was found that when the network size is small, mesh has better cost-performance than the others; as the network size increases, torus and hypercube outperform mesh; and metacube has the best cost-performance among them.
Citations: 4
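Two of the parameters named in the abstract above, node degree and graph diameter, have standard closed forms for three of the four topologies. The following is a minimal sketch of those textbook formulas only; it is not the paper's cost model, the function names are hypothetical, and metacube is left out because its parameters are not given in the abstract.

```python
def mesh_2d(k):
    """k x k mesh without wraparound links (k >= 3 so interior nodes exist)."""
    return {"nodes": k * k, "max_degree": 4,
            "diameter": 2 * (k - 1), "links": 2 * k * (k - 1)}


def torus_2d(k):
    """k x k torus, i.e. a mesh with wraparound links (k >= 3)."""
    return {"nodes": k * k, "max_degree": 4,
            "diameter": 2 * (k // 2), "links": 2 * k * k}


def hypercube(n):
    """n-dimensional binary hypercube with 2**n nodes."""
    return {"nodes": 2 ** n, "max_degree": n,
            "diameter": n, "links": n * 2 ** (n - 1)}


# Compare the three topologies at the same network size (1024 nodes).
for name, metrics in [("mesh", mesh_2d(32)),
                      ("torus", torus_2d(32)),
                      ("hypercube", hypercube(10))]:
    print(name, metrics)
```

At 1024 nodes the mesh diameter is roughly twice that of the torus and six times that of the hypercube, while the hypercube needs far more links; this is the kind of tradeoff a cost-performance model has to weigh.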
A Semantic Dataflow Logger Connecting Java Objects and Database Rows and Columns
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0027
Toshio Ito, Y. Kaneko
Abstract: As computer systems become more complicated, monitoring dataflows in a system becomes important for maintaining its performance. However, because conventional methods of dataflow monitoring are either too fine-grained or too coarse-grained, it is difficult to analyze application-specific performance metrics. In this paper, we propose a dataflow logger with suitable granularity for performance analysis. Our logger is implemented as a Java library, which tracks two types of dataflows: dataflows between objects inside a Java program, and dataflows between a Java object and a row and column in a relational database. That way, our logger can produce dataflow logs with rich semantics about the application's data model. We conduct an experiment with an example system and demonstrate that we can obtain dataflow logs useful for performance analysis. We also conduct a detailed overhead analysis of our logger. Although our logger slows down the example system by a factor of 13, we identify the major sources of the overhead and discuss possible solutions.
Citations: 0
Polling-Based P2P File Sharing with High Success Rate and Low Communication Cost
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0060
Kouhei Ootani, S. Fujita
Abstract: This paper proposes a polling-based consistency maintenance scheme for Peer-to-Peer (P2P) file sharing of editable contents. The proposed scheme achieves a high success rate in acquiring the latest copy of shared files with low communication cost. We first show that when several peers acquire a copy of shared files from the same replica peer, the minimum success rate is achieved by the peer with the maximum query rate, regardless of the polling and update intervals. We then design a distributed algorithm to maintain the correspondence between client and replica peers so as to minimize the average polling rate while keeping the average success rate at a designated value. The performance of the proposed algorithm is evaluated by simulation.
Citations: 0
Topology-Aware Data Aggregation for High Performance Collective MPI-IO on a Multi-core Cluster System
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0022
Y. Tsujita, A. Hori, Toyohisa Kameyama, Y. Ishikawa
Abstract: Parallel I/O such as MPI-IO is one of the performance improvement solutions in parallel computing using MPI. ROMIO is a widely used MPI-IO implementation that aims to improve collective I/O performance through an optimization called two-phase I/O. The file I/O task is given to a subset of, or all of, the MPI processes, which are called aggregators. Multiple CPUs or CPU cores offer a chance to increase computing power by deploying multiple MPI processes per compute node, but such deployment leads to poor I/O performance due to ROMIO's topology-unaware aggregator layout. In our previous work, an optimized aggregator layout suited to striping accesses on a Lustre file system improved I/O performance; however, its unbalanced communication load, caused by unawareness of the MPI rank layout among compute nodes, led to ineffective data aggregation. To minimize data aggregation time for further I/O performance improvements, we introduce a topology-aware data aggregation scheme that takes the MPI rank layout across compute nodes into account. The proposal arranges the data collection sequence of the aggregators in order to mitigate network contention. The optimization achieved up to 67% improvement in I/O performance compared with the original ROMIO in HPIO benchmark runs using 768 processes on 64 compute nodes of the TSUBAME2.5 supercomputer at the Tokyo Institute of Technology. Even when the number of aggregators was half or one third of the total number of processes, the optimization still kept I/O performance comparable to the maximum performance.
Citations: 2
Last Path Caching: A Simple Way to Remove Redundant Memory Accesses of Path ORAM
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0068
Naoki Fujieda, Ryoichi Yamauchi, S. Ichikawa
Abstract: Oblivious RAM (ORAM) is a technique to hide the access pattern of data to untrusted memory along with their contents. Path ORAM is a recent lightweight ORAM protocol, whose derived access pattern involves some redundancy that can be removed without loss of security. In this paper, we introduce last path caching, which removes the redundancy of Path ORAM with a simpler protocol than an existing scheme. By combining two caching strategies, our technique showed only a 0.2% performance loss relative to the existing one, while keeping the determinacy of the derived access pattern.
Citations: 3
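The abstract above does not spell out where the redundancy comes from. One plausible reading, stated here as an assumption rather than as the paper's protocol, is that two consecutive Path ORAM accesses touch root-to-leaf paths that overlap near the root, so the overlapping buckets are written back and then immediately read again. The sketch below only computes that overlap in a complete binary tree; the function names and the (level, index) bucket numbering are hypothetical.

```python
def path_buckets(leaf, height):
    """Buckets on the root-to-leaf path, as (level, index-within-level) pairs.

    Assumes a complete binary tree with leaves numbered 0 .. 2**height - 1,
    so the ancestor of `leaf` at level d has index leaf >> (height - d).
    """
    return [(d, leaf >> (height - d)) for d in range(height + 1)]


def shared_buckets(prev_leaf, next_leaf, height):
    """Buckets of the next path that were already on the previous path."""
    previous = set(path_buckets(prev_leaf, height))
    return [b for b in path_buckets(next_leaf, height) if b in previous]


# In a tree of height 3, leaves 5 and 4 share every bucket except the leaf.
print(shared_buckets(5, 4, 3))
# Leaves 0 and 7 sit in opposite subtrees and share only the root.
print(shared_buckets(0, 7, 3))
```

Under this reading, a cache holding the last accessed path could serve the shared buckets without another memory access; whether the paper does exactly this is not confirmed by the abstract.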
The Firing Squad Synchronization Problem on Higher-Dimensional CA with Multiple Updating Cycles
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0053
L. Manzoni, A. Porreca, H. Umeo
Abstract: Traditional cellular automata (CA) assume the presence of a single global clock regulating the update of all their cells. When this assumption is dropped, cells can update at different speeds, which increases the difficulty of solving synchronization problems. Here we solve the traditional and the generalized Firing Squad Synchronization Problem in dimension two and higher on multiple updating cycle CA.
Citations: 5
Communication Link Switching Method Based on Destination IP Address for Power Savings
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0067
Masato Nishiguchi, S. Kimura
Abstract: As the number of Internet users increases, network devices are required to achieve power savings. For this purpose, the authors previously proposed a Gigabit Ethernet link rate switching method based on the destination IP address for typical networks in small offices or at home. However, this method has a problem in that communication is interrupted for a few seconds when the link rate is switched. To solve the problem, this paper proposes a communication link switching method. In this method, a client is assumed to connect to the user's subnet via multiple network interfaces, such as Gigabit Ethernet and a wireless LAN. When a user starts communicating, the method selects one of the interfaces based on the destination IP address. The communication experiments demonstrate that the proposed method reduces power consumption and avoids the communication interruption of our previous method.
Citations: 2
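The selection step described in the abstract above, choosing an interface from the destination IP address, can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch under stated assumptions: the rule that certain destination networks need the wired link, the network list, and the interface names `eth0` and `wlan0` are all hypothetical and do not come from the paper.

```python
import ipaddress

# Hypothetical policy: destinations in these networks get the wired link.
# 192.0.2.0/24 is a documentation-only range used purely for illustration.
HIGH_RATE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def select_interface(destination, high_rate_networks=HIGH_RATE_NETWORKS,
                     wired="eth0", wireless="wlan0"):
    """Return the interface name to use for a destination IP address."""
    address = ipaddress.ip_address(destination)
    if any(address in network for network in high_rate_networks):
        return wired
    return wireless


print(select_interface("192.0.2.10"))
print(select_interface("198.51.100.7"))
```

Because both interfaces stay attached to the subnet, a choice like this changes which link carries a new flow without renegotiating a link rate, which is the interruption the abstract says the earlier method suffered from.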
CPRtree: A Tree-Based Checkpointing Architecture for Heterogeneous FPGA Computing
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0024
H. Vu, S. Kajkamhaeng, Shinya Takamaeda-Yamazaki, Y. Nakashima
Abstract: FPGAs provide reconfigurability and high performance for parallel applications. Modern FPGAs can be integrated into computing systems as accelerators so that they work with the host CPU to execute offloaded applications. This integration puts more pressure on the fault tolerance of computing systems, and the question of how to improve dependability becomes crucial. As in CPU-based systems, checkpoint/restart techniques are expected to be developed and applied to FPGA-based computing systems. Two issues arise in this situation: how to checkpoint and restart an FPGA, and how to make this checkpoint/restart model work well with the checkpoint/restart model of the whole computing system. In this paper, first we propose a new checkpoint/restart architecture along with a checkpointing mechanism on FPGA. Second, we propose "fine-grain" management for checkpointing to reduce performance degradation. Third, we propose a technique to capture consistent snapshots of the FPGA and the rest of the computing system. For host software, we also provide the CPRtree stack, including API functions to manage checkpoint/restart procedures on FPGA. Our experimental results show that the checkpointing architecture causes up to 9.73% maximum clock frequency degradation, small breakdown, and small data footprint, while the LUT overhead varies from 17.98% (Dijkstra) to 160.67% (Matrix Multiplication).
Citations: 8
Service Identification by Packet Inspection Based on N-grams in Multiple Connections
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0123
Masaki Hara, Shinnosuke Nirasawa, A. Nakao, M. Oguchi, Shu Yamamoto, Saneyasu Yamaguchi
Abstract: Identifying the service that given IP network flows belong to is essential for various purposes, such as QoS management and avoiding security issues. Typical methods identify the service from IP addresses and port numbers. However, the accuracy of these methods is not sufficient, so they need to be improved. Deep Packet Inspection (DPI) is one of the most effective methods for improving identification accuracy. In this paper, we explore a method for identifying the service of a flow. We propose a DPI-based identification method that covers multiple connections in a service. Then, we present a performance evaluation and demonstrate that our method can suitably identify the service from given network flows.
Citations: 5
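The title of the entry above refers to N-grams over packet contents across multiple connections. As a rough illustration of the general idea only, the sketch below counts byte N-grams in payloads and merges the counts of several connections into one profile. The function names, the choice of byte-level N-grams, and the sample payloads are assumptions; the paper's actual features and classifier are not described in the abstract.

```python
from collections import Counter


def byte_ngrams(payload, n):
    """Count every run of n consecutive bytes in one payload."""
    return Counter(payload[i:i + n] for i in range(len(payload) - n + 1))


def service_profile(payloads, n):
    """Merge the N-gram counts of all connections belonging to one service."""
    profile = Counter()
    for payload in payloads:
        profile.update(byte_ngrams(payload, n))
    return profile


# Two made-up connection payloads treated as one service.
profile = service_profile([b"GET /index", b"GET /img/a.png"], 3)
print(profile.most_common(3))
```

A profile built this way could be compared against profiles of known services; how that comparison is done is left open here because the source does not specify it.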
The Importance of Dynamic Load Balancing among OpenMP Thread Teams for Irregular Workloads
2016 Fourth International Symposium on Computing and Networking (CANDAR) Pub Date : 2016-11-01 DOI: 10.1109/CANDAR.2016.0097
Xiong Xiao, S. Hirasawa, H. Takizawa, Hiroaki Kobayashi
Abstract: Recently, massively parallel many-core processors such as Intel Xeon Phi coprocessors have attracted researchers' attention because various applications are significantly accelerated with those processors. In the field of high-performance computing, OpenMP is a standard programming model commonly used to parallelize a kernel loop for many-core processors. For hierarchical parallel processing, OpenMP version 4.0 or later allows programmers to group threads into multiple thread teams. In this paper, we first show the performance gain of using multiple thread teams even for one many-core processor. Then, we demonstrate that dynamic load balancing among those thread teams has the potential to significantly improve the performance of irregular workloads on a many-core processor. Although the current OpenMP specification does not offer such a dynamic load balancing mechanism, we discuss possible benefits of dynamic load balancing among thread teams through experiments using the Intel Xeon Phi coprocessor.
Citations: 3
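The claim in the abstract above, that dynamic load balancing among teams helps irregular workloads, can be illustrated with a small simulation. This is a sketch in plain Python, not OpenMP code and not the paper's experiment: each "team" is modeled as a single worker, iteration costs are made-up numbers, and the two scheduling policies are simplified assumptions.

```python
import heapq


def static_makespan(costs, teams):
    """Finish time when iterations are split into equal-sized contiguous chunks."""
    chunk = -(-len(costs) // teams)  # ceiling division
    return max(sum(costs[i:i + chunk]) for i in range(0, len(costs), chunk))


def dynamic_makespan(costs, teams):
    """Finish time when each idle team takes the next iteration from a shared queue."""
    finish_times = [0] * teams
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)  # team that becomes idle first
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


# Irregular workload: cheap iterations first, expensive ones last.
costs = [1, 1, 1, 1, 8, 8, 8, 8]
print(static_makespan(costs, 2), dynamic_makespan(costs, 2))
```

With the static split one team receives all the expensive iterations and the other sits idle, whereas pulling work on demand spreads the expensive iterations across both teams.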