Brian Chin, D. V. Dincklage, V. Ercegovac, Peter Hawkins, Mark S. Miller, F. Och, Christopher Olston, Fernando C Pereira
{"title":"Yedalog: Exploring Knowledge at Scale","authors":"Brian Chin, D. V. Dincklage, V. Ercegovac, Peter Hawkins, Mark S. Miller, F. Och, Christopher Olston, Fernando C Pereira","doi":"10.4230/LIPIcs.SNAPL.2015.63","DOIUrl":null,"url":null,"abstract":"With huge progress on data processing frameworks, human programmers are frequently the bottleneck when analyzing large repositories of data. We introduce Yedalog, a declarative programming language that allows programmers to mix data-parallel pipelines and computation seamlessly in a single language. By contrast, most existing tools for data-parallel computation embed a sublanguage of data-parallel pipelines in a general-purpose language, or vice versa. Yedalog extends Datalog, incorporating not only computational features from logic programming, but also features for working with data structured as nested records. Yedalog programs can run both on a single machine, and distributed across a cluster in batch and interactive modes, allowing programmers to mix dierent modes of execution easily. 1998 ACM Subject Classification D.3.2 Data-flow languages, Constraint and Logic Languages","PeriodicalId":231548,"journal":{"name":"Summit on Advances in Programming Languages","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Summit on Advances in Programming Languages","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4230/LIPIcs.SNAPL.2015.63","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 33
Abstract
With huge progress on data processing frameworks, human programmers are frequently the bottleneck when analyzing large repositories of data. We introduce Yedalog, a declarative programming language that allows programmers to mix data-parallel pipelines and computation seamlessly in a single language. By contrast, most existing tools for data-parallel computation embed a sublanguage of data-parallel pipelines in a general-purpose language, or vice versa. Yedalog extends Datalog, incorporating not only computational features from logic programming, but also features for working with data structured as nested records. Yedalog programs can run both on a single machine, and distributed across a cluster in batch and interactive modes, allowing programmers to mix dierent modes of execution easily. 1998 ACM Subject Classification D.3.2 Data-flow languages, Constraint and Logic Languages