Multi-domain and Explainable Prediction of Changes in Web Vocabularies

Albert Meroño-Peñuela, Romana Pernisch, Christophe Guéret, S. Schlobach
Published in: Proceedings of the 11th on Knowledge Capture Conference, 2021-12-02. DOI: 10.1145/3460210.3493583 (https://doi.org/10.1145/3460210.3493583)
Citations: 2

Abstract

Web vocabularies (WV) have become a fundamental tool for structuring Web data: over 10 million sites use structured data formats and ontologies to mark up content. Maintaining these vocabularies and keeping up with their changes are manual tasks with very limited automated support, which affects both publishers and users. Existing work shows that machine learning can reliably predict vocabulary changes, but only on specific domains (e.g., biomedicine) and with limited explanations of the impact of changes (e.g., their type and frequency). In this paper, we describe a framework that uses various supervised learning models to learn and predict changes in versioned vocabularies, independent of their domain. Using well-established results in ontology evolution, we extract domain-agnostic and human-interpretable features and explain their influence on change predictability. Applying our method to 139 WV from 9 different domains, we find that ontology structural and instance data, the number of versions, and the release frequency highly correlate with the predictability of change. These results can pave the way towards integrating predictive models into knowledge engineering practices and methods.
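
The abstract does not spell out the feature set or the data model, so the following is only a minimal sketch of how a version history could be turned into a supervised-learning dataset. Everything in it is an assumption made for illustration: the `Version` representation (class name mapped to its direct subclasses), the two features (number of direct subclasses, number of past changes), and the helper names `changed` and `build_dataset` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: one vocabulary version maps each class name
# to the set of its direct subclasses. This is an illustrative assumption.
Version = Dict[str, Set[str]]


def changed(cls: str, old: Version, new: Version) -> bool:
    """A class counts as changed if it was removed or its subclasses differ."""
    return cls not in new or old[cls] != new[cls]


def build_dataset(versions: List[Version]) -> List[Tuple[str, List[int], int]]:
    """Build (class, features, label) rows from a version history.

    Features are computed from all versions except the last one; the label
    says whether the class changed between the last two versions.
    """
    if len(versions) < 3:
        raise ValueError("need at least three versions")
    history, current, target = versions[:-2], versions[-2], versions[-1]
    past = history + [current]
    rows = []
    for cls in sorted(current):
        # Feature 1 (assumed): number of direct subclasses in the current version.
        n_children = len(current[cls])
        # Feature 2 (assumed): number of earlier transitions that changed this class.
        n_past_changes = sum(
            1
            for old, new in zip(past, past[1:])
            if cls in old and changed(cls, old, new)
        )
        label = int(changed(cls, current, target))
        rows.append((cls, [n_children, n_past_changes], label))
    return rows


# Toy version history, invented for this example.
v1 = {"Agent": {"Person"}, "Person": set(), "Place": set()}
v2 = {"Agent": {"Person", "Organization"}, "Person": set(),
      "Organization": set(), "Place": set()}
v3 = {"Agent": {"Person", "Organization", "Group"}, "Person": set(),
      "Organization": set(), "Group": set(), "Place": set()}
rows = build_dataset([v1, v2, v3])
```

The feature vectors and labels in `rows` could then be passed to any supervised classifier. Keeping the features as plain counts is one way to stay domain-agnostic and human-interpretable, in the spirit of what the abstract describes.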