Data and model convergence: a case for software defined architectures

Proceedings of the 16th ACM International Conference on Computing Frontiers. Pub Date: 2019-04-30. DOI: 10.1145/3310273.3323438

Antonino Tumeo



Abstract

High Performance Computing, data analytics, and machine learning are often considered three separate and different approaches. Applications, software, and now hardware stacks are typically designed to address only one of these areas at a time. This creates a false distinction among the three areas. In reality, domain scientists need to exercise all three approaches in an integrated way. For example, large-scale simulations generate enormous amounts of data, to which Big Data Analytics techniques can be applied. Or, as scientists seek to use data analytics as well as simulation for discovery, machine learning can play an important role in making sense of information from disparate sources. Pacific Northwest National Laboratory is launching a new Laboratory Directed Research and Development (LDRD) initiative, the Data-Model Convergence (DMC) Initiative, to investigate the integration of the three techniques at all levels of the high-performance computing stack. The DMC Initiative aims to increase scientist productivity by enabling purpose-built software and hardware and domain-aware ML techniques. In this talk, I will present the objectives of PNNL's DMC Initiative, highlighting the research that will be performed to enable the integration of vastly different programming paradigms and mental models. I will then make the case that reconfigurable architectures could represent a great opportunity to address the challenges of DMC. In principle, the ability to dynamically modify the architecture at runtime could provide a way to address the requirements of workloads whose behavior varies significantly across phases, without losing too much flexibility or programmer productivity compared with highly heterogeneous architectures composed of a sea of fixed application-specific accelerators. Reconfigurable architectures have been explored for a long time, and arguably new software breakthroughs are required to make them successful. I will thus present the efforts that the DMC Initiative is launching to design a productive toolchain for upcoming novel reconfigurable systems.


