Muhammad Mukaram Khan, J. Navaridas, Alexander D. Rast, Xin Jin, L. Plana, M. Luján, J. V. Woods, J. Miguel-Alonso, S. Furber
Title: Event-Driven Configuration of a Neural Network CMP System over a Homogeneous Interconnect Fabric
Published in: 2009 Eighth International Symposium on Parallel and Distributed Computing, 2009-06-30
DOI: 10.1109/ISPDC.2009.25 (https://doi.org/10.1109/ISPDC.2009.25)
Citations: 14
Abstract
Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised hardware support for the configuration process nor a preconfigured default state that puts it in operating condition. SpiNNaker is a parallel Chip Multiprocessor (CMP) system for neural network (NN) simulation. Where most large CMP systems feature a sideband network to complete the boot process, SpiNNaker has a single homogeneous network interconnect for both application inter-processor communications and system control functions such as boot load and run-time user-system interaction. This network improves fault tolerance and makes it easier to support dynamic run-time reconfiguration; however, it requires a boot process that is transaction-level compatible with the application's communications model. Since SpiNNaker uses event-driven asynchronous communications throughout, the loader operates with purely local control: there is no global synchronisation, state information, or transition sequence. A novel two-stage "unfolding" boot-up process efficiently configures the SpiNNaker hardware and loads the application using a high-speed flood-fill technique with support for run-time re-configuration. SystemC simulation of a multi-CMP SpiNNaker system indicates an error-free CMP configuration time of 1.3 ms, while a high-level simulation of a full-scale system (64K CMPs) indicates a mean application-loading time of ~20 ms (for a 100 KB application), which is virtually independent of the size of the system. We verified the CMP configuration process with hardware-level Verilog simulation.
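
The flood-fill idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the SpiNNaker loader: the 2D torus with four neighbours per node, the function name `flood_fill_hops`, and the single-origin start are all assumptions made for illustration, and the queue exists only to simulate the network in one process. The rule each node applies is purely local: on first receipt of a block, forward it to every neighbour; ignore later copies.

```python
from collections import deque


def flood_fill_hops(width, height, origin=(0, 0)):
    """Simulate flood-fill on a hypothetical width x height torus.

    Returns a dict mapping each node to the hop count at which it
    first receives the block injected at `origin`.
    """
    first_seen = {origin: 0}
    queue = deque([origin])  # simulation artefact, not shared state in a real system
    while queue:
        x, y = queue.popleft()
        hops = first_seen[(x, y)]
        # Local rule: forward to the four neighbours, wrapping at the edges.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbour = ((x + dx) % width, (y + dy) % height)
            if neighbour not in first_seen:  # duplicate copies are dropped
                first_seen[neighbour] = hops + 1
                queue.append(neighbour)
    return first_seen


if __name__ == "__main__":
    reached = flood_fill_hops(8, 8)
    print(len(reached), "nodes reached; farthest node is",
          max(reached.values()), "hops away")
```

In this toy model every node is reached without any node knowing the global state, and the worst-case hop count grows with the grid's diameter rather than with its node count.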