{"title":"大数据计算的多核系统","authors":"S. Ainsworth, Timothy M. Jones","doi":"10.1049/pbpc022e_ch21","DOIUrl":null,"url":null,"abstract":"In many ways, big data should be the poster-child of many-core computing. By necessity, such applications typically scale extremely well across machines, featuring high levels of thread-level parallelism. Programming techniques, such as Google's MapReduce, have allowed many applications running in the data centre to be programmed with parallelism directly in mind and have enabled extremely high throughput across machines. We explore the state-of-the-art in terms of techniques used to make many-core architectures work for big-data workloads. We explore how tail-latency concerns mean that even though workloads are parallel, high performance is still necessary in at least some parts of the system. We take a look at how memory-system issues can cause some big-data applications to scale less favourably than we would like for many-core architectures. We examine the programming models used for big-data workloads and consider how these both help and hinder the typically complex mapping seen elsewhere for many-core architectures. And we also take a look at the alternatives to traditional many-core systems in exploiting parallelism for efficiency in the big-data space.","PeriodicalId":254920,"journal":{"name":"Many-Core Computing: Hardware and Software","volume":"02 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Many-core systems for big-data computing\",\"authors\":\"S. Ainsworth, Timothy M. Jones\",\"doi\":\"10.1049/pbpc022e_ch21\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In many ways, big data should be the poster-child of many-core computing. By necessity, such applications typically scale extremely well across machines, featuring high levels of thread-level parallelism. Programming techniques, such as Google's MapReduce, have allowed many applications running in the data centre to be programmed with parallelism directly in mind and have enabled extremely high throughput across machines. We explore the state-of-the-art in terms of techniques used to make many-core architectures work for big-data workloads. We explore how tail-latency concerns mean that even though workloads are parallel, high performance is still necessary in at least some parts of the system. We take a look at how memory-system issues can cause some big-data applications to scale less favourably than we would like for many-core architectures. We examine the programming models used for big-data workloads and consider how these both help and hinder the typically complex mapping seen elsewhere for many-core architectures. 
Abstract: In many ways, big data should be the poster child of many-core computing. By necessity, such applications typically scale extremely well across machines, featuring high levels of thread-level parallelism. Programming techniques such as Google's MapReduce have allowed many applications running in the data centre to be written with parallelism directly in mind, enabling extremely high throughput across machines. We explore the state of the art in techniques used to make many-core architectures work for big-data workloads. We show how tail-latency concerns mean that, even though workloads are parallel, high performance is still necessary in at least some parts of the system. We look at how memory-system issues can cause some big-data applications to scale less favourably on many-core architectures than we would like. We examine the programming models used for big-data workloads and consider how they both help and hinder the typically complex mapping seen elsewhere for many-core architectures. Finally, we consider alternatives to traditional many-core systems for exploiting parallelism efficiently in the big-data space.
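To make the MapReduce model mentioned in the abstract concrete, below is a minimal word-count sketch in Python. It is an illustrative toy, not code from the chapter: the function names (word_count_map, word_count_reduce, map_reduce) and the multiprocessing-based driver are our own assumptions, and production frameworks such as Hadoop or Google's original implementation distribute the same map, shuffle and reduce phases across whole clusters rather than local worker processes.

# Minimal sketch of the MapReduce programming model: a toy word count.
# Illustrative only; names and the local-process driver are assumptions.
from collections import defaultdict
from multiprocessing import Pool

def word_count_map(document):
    # Map phase: emit (key, value) pairs independently for each input.
    return [(word, 1) for word in document.split()]

def word_count_reduce(item):
    # Reduce phase: combine all values that share a key.
    word, counts = item
    return (word, sum(counts))

def map_reduce(documents, mapper, reducer, workers=4):
    # Both phases operate on independent items, so each parallelises
    # trivially across worker processes (or, at scale, across machines).
    with Pool(workers) as pool:
        # Shuffle: group intermediate values by key between the phases.
        grouped = defaultdict(list)
        for pairs in pool.map(mapper, documents):
            for key, value in pairs:
                grouped[key].append(value)
        return dict(pool.map(reducer, grouped.items()))

if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "the fox"]
    print(map_reduce(docs, word_count_map, word_count_reduce))
    # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}

Because the map and reduce functions are applied to independent items with no shared mutable state, each phase is embarrassingly parallel; only the shuffle step requires coordination. That property is precisely what makes the model such a natural fit for many-core hardware and multi-machine data centres.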