Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines

Patrick Flynn, T. Vanderbruggen, C. Liao, Pei-Hung Lin, M. Emani, Xipeng Shen
DOI: 10.48550/arXiv.2208.05596 (https://doi.org/10.48550/arXiv.2208.05596)
Venue: European Conference on Software Architecture
Published: 2022-08-11
Citations: 4

Abstract

Programming Language Processing (PLP) using machine learning has improved greatly in the past few years, and a growing number of people are interested in exploring this promising field. However, it is challenging for new researchers and developers to find the right components to construct their own machine learning pipelines, given the diverse PLP tasks to be solved, the large number of datasets and models being released, and the complex compilers and tools involved. To improve the findability, accessibility, interoperability and reusability (FAIRness) of machine learning components, we collect and analyze a set of representative papers in the domain of machine learning-based PLP. We then identify and characterize key concepts, including PLP tasks, model architectures and supportive tools. Finally, we show example use cases that leverage the reusable components to construct machine learning pipelines for a set of PLP tasks.