{"title":"大数据计算的多核系统","authors":"S. Ainsworth, Timothy M. Jones","doi":"10.1049/pbpc022e_ch21","DOIUrl":null,"url":null,"abstract":"In many ways, big data should be the poster-child of many-core computing. By necessity, such applications typically scale extremely well across machines, featuring high levels of thread-level parallelism. Programming techniques, such as Google's MapReduce, have allowed many applications running in the data centre to be programmed with parallelism directly in mind and have enabled extremely high throughput across machines. We explore the state-of-the-art in terms of techniques used to make many-core architectures work for big-data workloads. We explore how tail-latency concerns mean that even though workloads are parallel, high performance is still necessary in at least some parts of the system. We take a look at how memory-system issues can cause some big-data applications to scale less favourably than we would like for many-core architectures. We examine the programming models used for big-data workloads and consider how these both help and hinder the typically complex mapping seen elsewhere for many-core architectures. And we also take a look at the alternatives to traditional many-core systems in exploiting parallelism for efficiency in the big-data space.","PeriodicalId":254920,"journal":{"name":"Many-Core Computing: Hardware and Software","volume":"02 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Many-core systems for big-data computing\",\"authors\":\"S. Ainsworth, Timothy M. Jones\",\"doi\":\"10.1049/pbpc022e_ch21\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In many ways, big data should be the poster-child of many-core computing. By necessity, such applications typically scale extremely well across machines, featuring high levels of thread-level parallelism. Programming techniques, such as Google's MapReduce, have allowed many applications running in the data centre to be programmed with parallelism directly in mind and have enabled extremely high throughput across machines. We explore the state-of-the-art in terms of techniques used to make many-core architectures work for big-data workloads. We explore how tail-latency concerns mean that even though workloads are parallel, high performance is still necessary in at least some parts of the system. We take a look at how memory-system issues can cause some big-data applications to scale less favourably than we would like for many-core architectures. We examine the programming models used for big-data workloads and consider how these both help and hinder the typically complex mapping seen elsewhere for many-core architectures. 
Abstract: In many ways, big data should be the poster child of many-core computing. By necessity, such applications typically scale extremely well across machines, featuring high levels of thread-level parallelism. Programming techniques such as Google's MapReduce have allowed many applications running in the data centre to be written with parallelism directly in mind, enabling extremely high throughput across machines. We explore the state of the art in techniques used to make many-core architectures work for big-data workloads. We show how tail-latency concerns mean that, even though workloads are parallel, high performance is still necessary in at least some parts of the system. We look at how memory-system issues can cause some big-data applications to scale less favourably on many-core architectures than we would like. We examine the programming models used for big-data workloads and consider how they both help and hinder the typically complex mapping seen elsewhere for many-core architectures. Finally, we consider alternatives to traditional many-core systems for exploiting parallelism efficiently in the big-data space.
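To make the MapReduce model mentioned in the abstract concrete, below is a minimal word-count sketch in Python. It is an illustrative toy, not code from the chapter: the function names (word_count_map, word_count_reduce, map_reduce) and the multiprocessing-based driver are our own assumptions, and production frameworks such as Hadoop or Google's original implementation distribute the same map, shuffle and reduce phases across whole clusters rather than local worker processes.

# Minimal sketch of the MapReduce programming model: a toy word count.
# Illustrative only; names and the local-process driver are assumptions.
from collections import defaultdict
from multiprocessing import Pool

def word_count_map(document):
    # Map phase: emit (key, value) pairs independently for each input.
    return [(word, 1) for word in document.split()]

def word_count_reduce(item):
    # Reduce phase: combine all values that share a key.
    word, counts = item
    return (word, sum(counts))

def map_reduce(documents, mapper, reducer, workers=4):
    # Both phases operate on independent items, so each parallelises
    # trivially across worker processes (or, at scale, across machines).
    with Pool(workers) as pool:
        # Shuffle: group intermediate values by key between the phases.
        grouped = defaultdict(list)
        for pairs in pool.map(mapper, documents):
            for key, value in pairs:
                grouped[key].append(value)
        return dict(pool.map(reducer, grouped.items()))

if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "the fox"]
    print(map_reduce(docs, word_count_map, word_count_reduce))
    # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}

Because the map and reduce functions are applied to independent items with no shared mutable state, each phase is embarrassingly parallel; only the shuffle step requires coordination. That property is precisely what makes the model such a natural fit for many-core hardware and multi-machine data centres.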