Tianyuan Chen, Lei Chang, Jianqing Ma, Wei Zhang, Feng Gao
{"title":"HOCT:一种在现代硬件上训练线性CRF的高度可扩展算法","authors":"Tianyuan Chen, Lei Chang, Jianqing Ma, Wei Zhang, Feng Gao","doi":"10.1109/ICDMW.2009.69","DOIUrl":null,"url":null,"abstract":"This paper proposes an efficient algorithm, HOCT, for CRF training on modern computer architectures. First, software prefetching techniques are utilized to hide cache miss latency. Second, we exploit SIMD to process data in parallel. Third, when dealing with large data sets, we let HOCT instead of operating system to manage swapping operations. Our experiments on various real data sets show that HOCT yields a fourfold speedup when the data can fit in memory, and over a 30-fold speedup when the memory requirement exceeds the physical memory.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"188 5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"HOCT: A Highly Scalable Algorithm for Training Linear CRF on Modern Hardware\",\"authors\":\"Tianyuan Chen, Lei Chang, Jianqing Ma, Wei Zhang, Feng Gao\",\"doi\":\"10.1109/ICDMW.2009.69\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper proposes an efficient algorithm, HOCT, for CRF training on modern computer architectures. First, software prefetching techniques are utilized to hide cache miss latency. Second, we exploit SIMD to process data in parallel. Third, when dealing with large data sets, we let HOCT instead of operating system to manage swapping operations. Our experiments on various real data sets show that HOCT yields a fourfold speedup when the data can fit in memory, and over a 30-fold speedup when the memory requirement exceeds the physical memory.\",\"PeriodicalId\":351078,\"journal\":{\"name\":\"2009 IEEE International Conference on Data Mining Workshops\",\"volume\":\"188 5 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-12-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 IEEE International Conference on Data Mining Workshops\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDMW.2009.69\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE International Conference on Data Mining Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW.2009.69","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
HOCT: A Highly Scalable Algorithm for Training Linear CRF on Modern Hardware
This paper proposes an efficient algorithm, HOCT, for CRF training on modern computer architectures. First, software prefetching techniques are utilized to hide cache miss latency. Second, we exploit SIMD to process data in parallel. Third, when dealing with large data sets, we let HOCT instead of operating system to manage swapping operations. Our experiments on various real data sets show that HOCT yields a fourfold speedup when the data can fit in memory, and over a 30-fold speedup when the memory requirement exceeds the physical memory.