{"title":"一个快速公正评估自动机处理硬件的框架","authors":"Xiaodong Yu, Kaixi Hou, Hao Wang, Wu-chun Feng","doi":"10.1109/IISWC.2017.8167767","DOIUrl":null,"url":null,"abstract":"Programming Micron's Automata Processor (AP) requires expertise in both automata theory and the AP architecture, as programmers have to manually manipulate state transition elements (STEs) and their transitions with a low-level Automata Network Markup Language (ANML). When the required STEs of an application exceed the hardware capacity, multiple reconfigurations are needed. However, most previous AP-based designs limit the dataset size to fit into a single AP board and simply neglect the costly overhead of reconfiguration. This results in unfair performance comparisons between the AP and other processors. To address this issue, we propose a framework for the fast and fair evaluation of AP devices. Our framework provides a hierarchical approach that automatically generates automata for large datasets through user-defined paradigms and allows the use of cascadable macros to achieve highly optimized reconfigurations. We highlight the importance of counting the configuration time in the overall AP performance, which in turn, can provide better insight into identifying essential hardware features, specifically for large-scale problem sizes. Our framework shows that the AP can achieve up to 461x overall speedup fairly compared to CPU counterparts.","PeriodicalId":110094,"journal":{"name":"2017 IEEE International Symposium on Workload Characterization (IISWC)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"A framework for fast and fair evaluation of automata processing hardware\",\"authors\":\"Xiaodong Yu, Kaixi Hou, Hao Wang, Wu-chun Feng\",\"doi\":\"10.1109/IISWC.2017.8167767\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Programming Micron's Automata Processor (AP) requires expertise in both automata theory and the AP architecture, as programmers have to manually manipulate state transition elements (STEs) and their transitions with a low-level Automata Network Markup Language (ANML). When the required STEs of an application exceed the hardware capacity, multiple reconfigurations are needed. However, most previous AP-based designs limit the dataset size to fit into a single AP board and simply neglect the costly overhead of reconfiguration. This results in unfair performance comparisons between the AP and other processors. To address this issue, we propose a framework for the fast and fair evaluation of AP devices. Our framework provides a hierarchical approach that automatically generates automata for large datasets through user-defined paradigms and allows the use of cascadable macros to achieve highly optimized reconfigurations. We highlight the importance of counting the configuration time in the overall AP performance, which in turn, can provide better insight into identifying essential hardware features, specifically for large-scale problem sizes. Our framework shows that the AP can achieve up to 461x overall speedup fairly compared to CPU counterparts.\",\"PeriodicalId\":110094,\"journal\":{\"name\":\"2017 IEEE International Symposium on Workload Characterization (IISWC)\",\"volume\":\"57 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Symposium on Workload Characterization (IISWC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IISWC.2017.8167767\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Symposium on Workload Characterization (IISWC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IISWC.2017.8167767","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A framework for fast and fair evaluation of automata processing hardware
Programming Micron's Automata Processor (AP) requires expertise in both automata theory and the AP architecture, as programmers have to manually manipulate state transition elements (STEs) and their transitions with a low-level Automata Network Markup Language (ANML). When the required STEs of an application exceed the hardware capacity, multiple reconfigurations are needed. However, most previous AP-based designs limit the dataset size to fit into a single AP board and simply neglect the costly overhead of reconfiguration. This results in unfair performance comparisons between the AP and other processors. To address this issue, we propose a framework for the fast and fair evaluation of AP devices. Our framework provides a hierarchical approach that automatically generates automata for large datasets through user-defined paradigms and allows the use of cascadable macros to achieve highly optimized reconfigurations. We highlight the importance of counting the configuration time in the overall AP performance, which in turn, can provide better insight into identifying essential hardware features, specifically for large-scale problem sizes. Our framework shows that the AP can achieve up to 461x overall speedup fairly compared to CPU counterparts.