Towards a software product line of trie-based collections

Proceedings of the 2016 ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences
Pub Date: 2016-10-20
DOI: 10.1145/2993236.2993251

M. Steindorfer, J. Vinju


Citations: 12

Abstract

Collection data structures in standard libraries of programming languages are designed to excel for the average case by carefully balancing memory footprint and runtime performance. These implicit design decisions and hard-coded trade-offs do constrain users from using an optimal variant for a given problem. Although a wide range of specialized collections is available for the Java Virtual Machine (JVM), they introduce yet another dependency and complicate user adoption by requiring specific Application Program Interfaces (APIs) incompatible with the standard library. A product line for collection data structures would relieve library designers from optimizing for the general case. Furthermore, a product line allows evolving the potentially large code base of a collection family efficiently. The challenge is to find a small core framework for collection data structures which covers all variations without exhaustively listing them, while supporting good performance at the same time. We claim that the concept of Array Mapped Tries (AMTs) embodies a high degree of commonality in the sub-domain of immutable collection data structures. AMTs are flexible enough to cover most of the variability, while minimizing code bloat in the generator and the generated code. We implemented a Data Structure Code Generator (DSCG) that emits immutable collections based on an AMT skeleton foundation. The generated data structures outperform competitive hand-optimized implementations, and the generator still allows for customization towards specific workloads.
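To make the notion of an array mapped trie more concrete, the following is a minimal sketch of an immutable set of `int` keys built on a bitmap-compressed trie node. It is not the output of the DSCG described above and not the authors' implementation; the class name `AmtIntSet`, the 5-bit branching factor, and the use of the key itself as its hash are all assumptions chosen for illustration.

```java
/** Illustrative persistent set of ints on an array mapped trie (hypothetical, not DSCG output). */
final class AmtIntSet {
    private static final int BITS = 5;               // assumed: 5 key bits per level, up to 32 slots
    private static final int MASK = (1 << BITS) - 1;

    static final AmtIntSet EMPTY = new AmtIntSet(0, new Object[0]);

    private final int bitmap;      // bit i set <=> logical slot i is occupied
    private final Object[] slots;  // compact array holding an Integer key or an AmtIntSet sub-node

    private AmtIntSet(int bitmap, Object[] slots) {
        this.bitmap = bitmap;
        this.slots = slots;
    }

    boolean contains(int key) { return contains(key, 0); }

    /** Returns a set that also contains key; the receiver is never modified. */
    AmtIntSet insert(int key) { return insert(key, 0); }

    // Bit in the bitmap that corresponds to the key's 5-bit chunk at this level.
    private static int bitFor(int key, int shift) {
        return 1 << ((key >>> shift) & MASK);
    }

    // Position in the compact array: number of occupied slots below the given bit.
    private int indexOf(int bit) {
        return Integer.bitCount(bitmap & (bit - 1));
    }

    private boolean contains(int key, int shift) {
        int bit = bitFor(key, shift);
        if ((bitmap & bit) == 0) return false;
        Object slot = slots[indexOf(bit)];
        if (slot instanceof AmtIntSet) return ((AmtIntSet) slot).contains(key, shift + BITS);
        return ((Integer) slot) == key;
    }

    private AmtIntSet insert(int key, int shift) {
        int bit = bitFor(key, shift);
        int idx = indexOf(bit);
        if ((bitmap & bit) == 0) {
            // Free slot: copy the array with the key spliced in at idx.
            Object[] copy = new Object[slots.length + 1];
            System.arraycopy(slots, 0, copy, 0, idx);
            copy[idx] = key;
            System.arraycopy(slots, idx, copy, idx + 1, slots.length - idx);
            return new AmtIntSet(bitmap | bit, copy);
        }
        Object slot = slots[idx];
        Object replacement;
        if (slot instanceof AmtIntSet) {
            replacement = ((AmtIntSet) slot).insert(key, shift + BITS);
            if (replacement == slot) return this;    // key was already present below
        } else if (((Integer) slot) == key) {
            return this;                             // key already stored in this node
        } else {
            // Two distinct keys share this slot: move both into a new sub-node.
            // Distinct ints differ in some bit, so they separate within 7 levels.
            replacement = EMPTY.insert((Integer) slot, shift + BITS).insert(key, shift + BITS);
        }
        Object[] copy = slots.clone();               // path copying keeps old versions intact
        copy[idx] = replacement;
        return new AmtIntSet(bitmap, copy);
    }
}
```

For example, after `AmtIntSet a = AmtIntSet.EMPTY.insert(1).insert(33);` and `AmtIntSet b = a.insert(65);`, the call `b.contains(65)` is true while `a.contains(65)` stays false, because an insert copies only the nodes along one path. The variation points in this sketch (branching factor, key type, what a slot may hold) are the kind of choices a generator could plausibly parameterize, though the abstract does not spell out which dimensions the DSCG actually covers.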




Source journal

Proceedings of the 2016 ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences

Self-citation rate

0.00%

Number of publications