{"title":"Multiprocessor Architectures Specialized for Multi-agent Simulation","authors":"Christian Schäck, W. Heenes, R. Hoffmann","doi":"10.1109/IC-NC.2010.34","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.34","url":null,"abstract":"Two new multiprocessor architectures to accelerate the simulation of multi-agent worlds based on the massively parallel GCA (Global Cellular Automata) model are presented. The GCA model is suited to describe and simulate different multi-agent worlds. The designed and implemented architectures mainly consist of a set of processors (NIOS II) and a network. The multiprocessor systems allow the implementation in a flexible way through programming, thus simulating different behaviors on the same architecture. Two architectures with up to 16 processors were implemented on an FPGA. The first architecture uses hardware hash functions to reduce the overall simulation time, but lacks scalability. The second architecture uses an agent memory and a cell field memory. This improves the scalability and further increases the performance.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121021988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Adaptive Timeout Strategy for Profiling UDP Flows","authors":"Jing Cai, Zhibin Zhang, P. Zhang, Xinbo Song","doi":"10.1109/IC-NC.2010.15","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.15","url":null,"abstract":"With the increase of network bandwidth, more and more new applications such as audio, video and online games have become the main body of network traffic. Based on real-time considerations, these new applications mostly use UDP as the transport layer protocol, which directly increases UDP traffic. However, traditional studies assume that TCP dominates Internet traffic, and previous traffic measurements were generally based on it while UDP was ignored. In view of this, we mainly discuss the adaptive timeout strategy of UDP flows in this paper. Firstly, due to the significant differences in flow characteristics between TCP flows and UDP flows, we expound and prove that the existing adaptive timeout strategies are not appropriate for UDP flows. Secondly, we present our adaptive strategy using Support Vector Machine techniques. We build six classifiers to accurately predict each flow's maximum packet inter-arrival time and adapt its timeout value within the flow duration. Because the accuracy of the classifiers is limited, we also introduce the concept of an adjusted accuracy rating, which gives a probabilistic guarantee (90%, 95%, 98%) that long flows are not cut into short flows. The experimental results reveal that our adaptive strategy has the potential to achieve significant performance advantages over widely used fixed and other adaptive timeout schemes.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"697 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114896501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Evaluation on Sensor Network Technologies for AMI Associated Mudslide Warning System","authors":"Cheng-Jen Tang, Miau-Ru Dai","doi":"10.1109/IC-NC.2010.10","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.10","url":null,"abstract":"In order to detect occurrences of mudslides, a mudslide warning system has to address three major issues: sensor sensitivity, coverage area, and deployment cost. With the emergence of AMI, which is considered as the fundamental step towards Smart Grid, a mudslide warning system is able to utilize AMI communication network to provide a broad coverage at a relatively low deployment cost. However, realization of the sensor networks for this AMI associated mudslide warning system needs to satisfy many constraints. In addition to the design factors identified by previous studies, such as fault tolerance, scalability, cost, hardware, topology change, environment, and power consumption, AMI brings limitations that come along with the electricity grid infrastructure. This paper studies the state of the art of current communication technologies in sensor networks, and identifies which of them meet the requirements of AMI associated mudslide warning system.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114083963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation of SIVARM: A Simple VMM for the ARM Architecture","authors":"A. Suzuki, S. Oikawa","doi":"10.1109/IC-NC.2010.23","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.23","url":null,"abstract":"By using Virtual Machine Monitors (VMMs), we can overcome many issues in embedded systems. The performance gain of recent hardware enables the use of VMMs even in small embedded systems. We implemented a VMM for the ARM architecture, which is the most widely used CPU architecture for embedded systems. We name it SIVARM: a simple VMM for the ARM architecture. Since the VMM executes in privileged mode and its guest OS executes in non-privileged mode, the VMM can catch the execution of sensitive instructions as exceptions and emulate them appropriately. The guest OS can execute in non-privileged mode thanks to the virtual banked registers and the virtual processor mode provided by the VMM. Domains are used to control the access between the guest OS and the VMM. The VMM was implemented for the ARM926EJ-S processor, and can successfully boot Linux on it.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124010025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Design of On-the-Fly Virtual Channel Allocation for Low Cost High Performance On-Chip Routers","authors":"S. Nguyen, S. Oyanagi","doi":"10.1109/IC-NC.2010.25","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.25","url":null,"abstract":"Network-on-Chip (NoC) is an important communication infrastructure for System-on-Chips (SoCs). Designing high performance NoCs with minimized area overhead is becoming a major technical challenge. In this paper, we propose on-the-fly virtual channel (VC) allocation for low cost high performance on-chip routers. By performing the VC allocation based on the result of switch allocation, the dependency between VC allocation and switch traversal is removed and these stages can be performed in parallel. In this manner, the pipeline of a packet transfer can be shortened in a non-speculative fashion. We have implemented the proposed router on an FPGA and evaluated it in terms of communication latency, throughput and hardware amount. The experimental results show that the proposed router with on-the-fly VC allocation reduces the communication latency by 27.3% and improves throughput by 21.4% as compared to the conventional VC router. In comparison with the look-ahead speculative router, it improves the throughput by 6.2% with a 17.6% reduction of area for control logic.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131036995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of a Uniform Platform to Support Multigenerational GPU Architectures for High Performance Stream-Based Computing","authors":"S. Yamagiwa, Masahiro Arai, K. Wada","doi":"10.1109/IC-NC.2010.35","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.35","url":null,"abstract":"GPU-based computing has become one of the popular high performance computing fields. The field is called GPGPU. This paper focuses on the design and implementation of a uniform GPGPU application that is optimized for both legacy and recent GPU architectures. As a typical example of such a GPGPU application, this paper discusses the uniform implementation of the Caravela platform. In particular, the flow-model execution mechanism is considered with reference to recent GPU architectures. To verify the design and the implementation, this paper evaluates the compatibility among the architectures and also measures performance.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128200204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Loop Performance Improvement for Min-cut Program Decomposition Method","authors":"K. Ootsu, Takeshi Abe, T. Yokota, T. Baba","doi":"10.1109/IC-NC.2010.47","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.47","url":null,"abstract":"In recent years, speedup by thread level parallel processing has become more and more important with the spread of multi-core processors, and various techniques have been developed for parallelizing single-threaded code into multithreaded code that can achieve efficient thread level parallel processing. Speculative multithreading is an important technology for achieving high performance by thread level parallel processing. To improve the execution performance of speculative multithreading, it is necessary to decompose the program code appropriately. Against this background, T. A. Johnson, et al. proposed a program decomposition technique (hereafter, we refer to their method as the min-cut method) for finding the decomposition pattern that minimizes the effects of the performance degradation factors, by finding the minimum cut set in the weighted control flow graph (CFG) of the program. The min-cut method has wide coverage and is a very promising technique, since the whole program can be decomposed without being restricted to logical structures such as loops. However, the min-cut method has the problem that loop level parallelism cannot be sufficiently exploited. In this paper, we propose an improvement to the min-cut method that enhances loop execution performance. Our method is based on the min-cut method and tries to apply loop unrolling to the loops in the target program code during the decomposition process. We apply our method to practical program codes selected from the SPEC CINT2000 benchmarks. The results show that loops that are not decomposed with the min-cut method are decomposed, and that the opportunities to make use of loop level parallelism increase. In addition, the results of the performance evaluation using a cycle-based simulator show that our method can improve the loop execution performance as compared to the min-cut method.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128797178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design an Implementation of Bee Hive in a Mult-agent Based Resource Discovery Method in P2P Systems","authors":"Junpei Yamasaki, Y. Kambayashi","doi":"10.1109/IC-NC.2010.20","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.20","url":null,"abstract":"We have proposed and implemented an efficient resource locating method in a pure P2P system based on a multiple agent system. All the resources as well as resource information are managed by cooperative multiple agents. In order to optimize the behaviors of the cooperative multiple agents, we now utilize a honey bee algorithm that guides mobile agents to migrate toward the nodes that are likely to have the requested resources. In this paper, we report on our implementation.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122759998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Path Setup for a Photonic Network-on-Chip","authors":"Cisse Ahmadou Dit Adi, Hiroki Matsutani, M. Koibuchi, H. Irie, T. Miyoshi, T. Yoshinaga","doi":"10.1109/IC-NC.2010.31","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.31","url":null,"abstract":"Electrical Network-on-Chip (NoC) faces critical challenges in meeting the high performance and low power consumption requirements for future multicore processor interconnection. Recent tremendous advances in CMOS compatible optical components give photonics the potential to deliver efficient NoC performance at an acceptable energy cost. However, the lack of in-flight processing and buffering of optical data makes the realization of a fully optical NoC complicated. A hybrid architecture which uses optical high bandwidth transfer and a tiny electrical control network can take advantage of both interconnection methods to offer an efficient performance-per-watt infrastructure to connect multicore processors and System-on-Chip (SoC). In this paper, we propose a hybrid photonic torus NoC (HPNoC) that uses predictive switching to improve the performance of a hybrid architecture. By using prediction techniques, we can reduce the path setup latency for the electrical control network, hence improving the overall end-to-end delay for communication in the HPNoC. Simulation results using a cycle accurate simulator under uniform, neighbor and bit reversal traffic patterns for 64 nodes show that predictive switching considerably improves the HPNoC overall performance.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"135 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123133443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimization Vector Quantization by Adaptive Associative-Memory-Based Codebook Learning in Combination with Huffman Coding","authors":"A. Kawabata, T. Koide, H. Mattausch","doi":"10.1109/IC-NC.2010.38","DOIUrl":"https://doi.org/10.1109/IC-NC.2010.38","url":null,"abstract":"In the presented research on codebook optimization for vector quantization, an associative memory architecture is applied, which searches for the most similar data among previously stored reference data. To realize the learning of new codebook data, a learning algorithm is implemented, which is based on this associative memory and which imitates the concept of the human short/long-term memory. The quality improvement of the codebook for vector quantization, created with the proposed learning algorithm, and the learning-parameter dependence of the improvement are evaluated with the Peak Signal-to-Noise Ratio (PSNR), which is an index of the image quality. A quantitative PSNR improvement of 2.5 – 3.0 dB could be verified. Since the learning algorithm orders the codebook elements according to their usage frequency for the vector-quantization process, Huffman coding is additionally applied, and is verified to further improve the compression ratio from 12.8 to 14.1.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133740065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}