J. C. Duarte, M. C. Cavalcanti, Igor de Souza Costa, Diego Esteves
Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics. Published 2017-08-23. DOI: 10.1145/3106426.3106496 (https://doi.org/10.1145/3106426.3106496)
An interoperable service for the provenance of machine learning experiments
Although Machine Learning (ML) experiments can be easily built with several ML frameworks, and the demand for practical solutions to scientific problems keeps increasing, organizing their results and the algorithm setups used so that they can be reproduced is a long-known problem without an easy solution. Motivated by the need for a high level of interoperability and data provenance in ML experiments, this work presents a generic solution: a web-service application that interacts with the MEX vocabulary, a lightweight solution for archiving and querying ML experiments. With this solution, researchers can share their setups and results in an interoperable format that describes all the steps needed to reproduce their research. Although the solution could be implemented in any programming language, we chose Java to build the web service, and we present experiments with Python's Scikit-learn ML framework, using decorators and code reflection, which demonstrate how simply data provenance can be incorporated at such a high level, simplifying the experiment logging process.
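To make the decorator-and-reflection idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical decorator named log_provenance and an in-memory list, PROVENANCE_LOG, standing in for the web service; neither name comes from the paper, and no MEX vocabulary terms or Scikit-learn calls are used. The toy train function is likewise only an illustration.

```python
import functools
import inspect
import json

# Hypothetical in-memory store standing in for the provenance web service.
PROVENANCE_LOG = []


def log_provenance(func):
    """Hypothetical decorator: record a call's name, arguments, and result."""
    # Reflection: read the wrapped function's parameter names and defaults.
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # also capture parameters left at their defaults
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "step": func.__name__,
            "parameters": {k: repr(v) for k, v in bound.arguments.items()},
            "result": repr(result),
        })
        return result

    return wrapper


@log_provenance
def train(train_size, learning_rate=0.1, epochs=10):
    """Toy stand-in for a training run; returns the number of update steps."""
    return train_size * epochs


steps = train(200, epochs=5)
print(json.dumps(PROVENANCE_LOG, indent=2))
```

The experiment code itself is unchanged apart from the decorator line, which is the kind of low-effort logging the abstract describes. A real client would presumably send each record to the service rather than append it to a list.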