Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines

European Conference on Software Architecture. Pub Date: 2022-08-11. DOI: 10.48550/arXiv.2208.05596

Patrick Flynn, T. Vanderbruggen, C. Liao, Pei-Hung Lin, M. Emani, Xipeng Shen

Citations: 4

Abstract

Programming Language Processing (PLP) using machine learning has made vast improvements in the past few years, and a growing number of people are interested in exploring this promising field. However, it is challenging for new researchers and developers to find the right components to construct their own machine learning pipelines, given the diverse PLP tasks to be solved, the large number of datasets and models being released, and the set of complex compilers or tools involved. To improve the findability, accessibility, interoperability, and reusability (FAIRness) of machine learning components, we collect and analyze a set of representative papers in the domain of machine learning-based PLP. We then identify and characterize key concepts, including PLP tasks, model architectures, and supportive tools. Finally, we show example use cases that leverage the reusable components to construct machine learning pipelines for a set of PLP tasks.
