An interoperable service for the provenance of machine learning experiments

Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics. Pub Date: 2017-08-23. DOI: 10.1145/3106426.3106496

J. C. Duarte, M. C. Cavalcanti, Igor de Souza Costa, Diego Esteves

Citations: 2

Abstract

Machine Learning (ML) experiments are easy to build with any of several ML frameworks, and demand for practical solutions to many kinds of scientific problems keeps growing. Even so, organizing experiment results and the algorithm setups that produced them, so that the experiments can be reproduced, is a long-known problem without an easy solution. Motivated by the need for a high level of interoperability and data provenance in ML experiments, this work presents a generic solution: a web-service application that interacts with the MEX vocabulary, a lightweight solution for archiving and querying ML experiments. With this solution, researchers can share their setups and results in an interoperable format that describes all the steps needed to reproduce their research. Although the solution could be implemented in any programming language, we chose Java to build the web service, and we present experiments with Python's Scikit-learn ML framework, using decorators and code reflection, which demonstrate how simply data provenance can be incorporated at such a high level, simplifying the experiment logging process.
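The abstract names decorators and code reflection as the way provenance is attached to Python experiments, but it gives no code. The sketch below only illustrates that general technique with the standard library; it is not the authors' implementation and does not use the MEX vocabulary or their web service. The names `log_provenance`, `PROVENANCE_LOG`, and `run_experiment` are hypothetical, and the "score" returned is a made-up value, not a measured result.

```python
import functools
import inspect
import json
import time

# Hypothetical in-memory log; a real client would instead send each record
# to a provenance service (an assumption, not described in the source).
PROVENANCE_LOG = []


def log_provenance(func):
    """Record the name, bound arguments, result, and duration of each call."""
    signature = inspect.signature(func)  # reflection: read declared parameters

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # also capture defaults the caller did not pass
        started = time.time()
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "step": func.__name__,
            "module": func.__module__,
            "parameters": {name: repr(value)
                           for name, value in bound.arguments.items()},
            "result": repr(result),
            "duration_seconds": time.time() - started,
        })
        return result

    return wrapper


@log_provenance
def run_experiment(algorithm, learning_rate=0.1, folds=5):
    """Stand-in for a training run; returns a made-up score."""
    return round(1.0 - learning_rate / folds, 3)


if __name__ == "__main__":
    run_experiment("toy-classifier", learning_rate=0.5)
    print(json.dumps(PROVENANCE_LOG, indent=2))
```

Because the decorator reads the function signature, the experiment code itself needs no logging calls: adding `@log_provenance` to a step is enough to capture its configuration, including default values the caller never passed.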

Source journal

Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics

Self-citation rate

0.00%

Number of publications