Tianyuan Chen, Lei Chang, Jianqing Ma, Wei Zhang, Feng Gao
{"title":"HOCT: A Highly Scalable Algorithm for Training Linear CRF on Modern Hardware","authors":"Tianyuan Chen, Lei Chang, Jianqing Ma, Wei Zhang, Feng Gao","doi":"10.1109/ICDMW.2009.69","DOIUrl":null,"url":null,"abstract":"This paper proposes an efficient algorithm, HOCT, for CRF training on modern computer architectures. First, software prefetching techniques are utilized to hide cache miss latency. Second, we exploit SIMD to process data in parallel. Third, when dealing with large data sets, we let HOCT instead of operating system to manage swapping operations. Our experiments on various real data sets show that HOCT yields a fourfold speedup when the data can fit in memory, and over a 30-fold speedup when the memory requirement exceeds the physical memory.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"188 5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE International Conference on Data Mining Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW.2009.69","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
This paper proposes an efficient algorithm, HOCT, for CRF training on modern computer architectures. First, software prefetching techniques are utilized to hide cache miss latency. Second, we exploit SIMD to process data in parallel. Third, when dealing with large data sets, we let HOCT instead of operating system to manage swapping operations. Our experiments on various real data sets show that HOCT yields a fourfold speedup when the data can fit in memory, and over a 30-fold speedup when the memory requirement exceeds the physical memory.