MEX Interfaces: Automating Machine Learning Metadata Generation

Proceedings of the 12th International Conference on Semantic Systems. Pub Date: 2016-09-12. DOI: 10.1145/2993318.2993320

Diego Esteves, Pablo N. Mendes, Diego Moussallem, J. C. Duarte, A. Zaveri, Jens Lehmann


Citations: 5

Abstract


MEX Interfaces: Automating Machine Learning Metadata Generation

Despite recent efforts to achieve a high level of interoperability among Machine Learning (ML) experiments, in support of Reproducible Research, we still run into problems caused by the existence of different ML platforms: each has its own conceptualization or schema for representing data and metadata. This leads to extra coding effort to achieve the desired interoperability, a better level of provenance, and a more automated environment for obtaining the generated results. Hence, when using ML libraries, it is common to redesign specific data models (schemata) and develop wrappers to manage the produced outputs. In this article, we discuss this gap, focusing on the question: "What is the cleanest and lowest-impact solution, i.e., the minimal effort needed to achieve both higher interoperability and higher provenance metadata levels in the Integrated Development Environment (IDE) context, and how can the inherent data querying task be facilitated?" We introduce a novel, low-impact methodology designed specifically for code built in that context. It combines Semantic Web concepts and reflection to narrow the gap in exporting ML metadata in a structured manner: annotations embedded in the code are converted at run time into one of the state-of-the-art ML schemas for the Semantic Web, the MEX Vocabulary.
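The idea of annotating experiment code and turning those annotations into structured metadata at run time can be illustrated with a small sketch. This is not the authors' implementation. It is a Python analogue that assumes a decorator can stand in for an embedded code annotation and that attribute lookup can stand in for reflection. The names `ml_metadata`, `export_triples`, and `run_experiment`, the namespace `http://example.org/mex-sketch#`, and all field names and values are hypothetical and do not reproduce actual MEX Vocabulary terms.

```python
import inspect

# Placeholder namespace for illustration only; NOT the real MEX vocabulary.
EX = "http://example.org/mex-sketch#"


def ml_metadata(**fields):
    """Hypothetical annotation: attach metadata without changing behaviour."""
    def decorate(obj):
        obj.__ml_metadata__ = dict(fields)
        return obj
    return decorate


@ml_metadata(algorithm="NaiveBayes", dataset="iris.csv")
def run_experiment():
    # Stand-in for a real training run; the score is a dummy value.
    return {"accuracy": 0.93}


def export_triples(namespace):
    """Find annotated functions by introspection, run them, and emit one
    N-Triples-like line per annotation field and per returned measure."""
    lines = []
    for name, obj in sorted(namespace.items()):
        meta = getattr(obj, "__ml_metadata__", None)
        if meta is None or not inspect.isfunction(obj):
            continue
        measures = obj()  # run-time values are merged with static annotations
        for key, value in sorted({**meta, **measures}.items()):
            lines.append(f'<{EX}{name}> <{EX}{key}> "{value}" .')
    return lines


triples = export_triples({"run_experiment": run_experiment})
print("\n".join(triples))
```

The experiment function itself contains no export logic. The metadata lives in the annotation and a separate exporter discovers it, which is the low-impact property the abstract describes.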
