An interoperable service for the provenance of machine learning experiments

J. C. Duarte, M. C. Cavalcanti, Igor de Souza Costa, Diego Esteves
Published in: Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics
Publication date: 2017-08-23
DOI: 10.1145/3106426.3106496 (https://doi.org/10.1145/3106426.3106496)
Citations: 2

Abstract

Although Machine Learning (ML) experiments can be built easily with several ML frameworks, and demand for practical solutions to many kinds of scientific problems keeps increasing, organizing their results and the different algorithm setups used, so that they can be reproduced, is a long-known problem without an easy solution. Motivated by the need for a high level of interoperability and data provenance in ML experiments, this work presents a generic solution: a web-service application that interacts with the MEX vocabulary, a lightweight solution for archiving and querying ML experiments. With this solution, researchers can share their setups and results in an interoperable format that describes all the steps needed to reproduce their research. Although the solution could be implemented in any programming language, we chose Java to build the web service, and we present experiments with Python's Scikit-learn ML framework, using decorators and code reflection, which demonstrate how simply data provenance can be incorporated at such a high level, simplifying the experiment logging process.
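The abstract mentions decorators and code reflection as the way provenance is captured on the Python side. The following is only a minimal sketch of that general idea, not the authors' implementation: the names `log_provenance`, `PROVENANCE_LOG`, `ToyModel`, and `evaluate` are hypothetical, the record layout is invented for illustration and is not the MEX vocabulary, and records are kept in an in-memory list rather than sent to the Java web service, whose interface the abstract does not describe. `ToyModel` stands in for a Scikit-learn estimator so the example has no dependencies; it mimics the `get_params()` method that Scikit-learn estimators expose.

```python
import functools
import inspect
import json

# Hypothetical in-memory log. In the setting the abstract describes, records
# would go to a web service instead; that interface is not given here.
PROVENANCE_LOG = []


def log_provenance(func):
    """Record the name, arguments, and result of each call to func."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Reflection: map positional and keyword arguments to parameter names.
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        params = {}
        for name, value in bound.arguments.items():
            # Objects exposing get_params() (as scikit-learn estimators do)
            # are expanded into their class name and settings.
            if hasattr(value, "get_params"):
                params[name] = {
                    "class": type(value).__name__,
                    "hyperparameters": value.get_params(),
                }
            else:
                params[name] = repr(value)
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "step": func.__name__,
            "parameters": params,
            "result": repr(result),
        })
        return result

    return wrapper


class ToyModel:
    """Stand-in for an estimator (assumption: keeps the sketch dependency-free)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def get_params(self):
        return {"threshold": self.threshold}


@log_provenance
def evaluate(model, scores, labels):
    """Compute accuracy of thresholded scores against labels."""
    predictions = [int(s >= model.threshold) for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


accuracy = evaluate(ToyModel(threshold=0.6), [0.2, 0.7, 0.9, 0.4], [0, 1, 1, 1])
print(json.dumps(PROVENANCE_LOG, indent=2))
```

The experiment code itself (`evaluate`) contains no logging calls; adding the decorator is the only change, which is the kind of low-effort integration the abstract points to.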