Automated Semantic Annotation Deploying Machine Learning Approaches: A Systematic Review

Mendel, Pub Date: 2023-12-20, DOI: 10.13164/mendel.2023.2.111

Wee Chea Chang, A. Sangodiah


Citations: 1

Abstract



The Semantic Web is a vision of making Internet data machine-readable so that information retrieval can achieve higher granularity and personalisation. Semantic annotation is the process of binding machine-understandable descriptions to Web resources such as text and images. Hence, the success of the Semantic Web depends on the wide availability of semantically annotated Web resources. However, a large number of Web resources remain unannotated because the available annotation capability is limited. To address this, machine learning approaches have been used to improve the automation of annotation. This Systematic Review summarises the existing state-of-the-art literature to answer five Research Questions focusing on machine-learning-driven automation of semantic annotation. The analysis of 40 selected primary studies reveals that both single (unitary) machine learning algorithms and combinations of algorithms are current directions. Support Vector Machine (SVM) is the most-used algorithm, and supervised learning is the predominant machine learning type. Both semi-automated and fully automated annotation are nearly achieved. Meanwhile, text is the most annotated Web resource, and the availability of third-party annotation tools is in line with this. While Precision, Recall, F-Measure and Accuracy are the most deployed quality metrics, not all the studies measured the quality of the annotated results. In the future, standardising quality measures is the direction for research.
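The four quality metrics named in the abstract have standard definitions in terms of true/false positives and negatives. The sketch below illustrates those definitions for a set of predicted annotations compared against a gold-standard set. The abstract does not say how the reviewed studies computed their scores, so the function name `score_annotations`, the `AnnotationScores` type, and the toy data are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationScores:
    precision: float
    recall: float
    f_measure: float
    accuracy: float


def score_annotations(predicted: set, gold: set, candidates: set) -> AnnotationScores:
    """Score predicted annotations against gold-standard annotations.

    `candidates` is the full set of items that could have been annotated;
    it is needed only for accuracy, which also counts true negatives.
    All names here are illustrative assumptions, not taken from the review.
    """
    tp = len(predicted & gold)               # annotated and correct
    fp = len(predicted - gold)               # annotated but wrong
    fn = len(gold - predicted)               # should have been annotated, was not
    tn = len(candidates - predicted - gold)  # correctly left unannotated

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure here is the balanced F1: harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return AnnotationScores(precision, recall, f_measure, accuracy)


# Toy example: ten candidate tokens, four of which carry a gold annotation.
candidates = {f"t{i}" for i in range(10)}
gold = {"t0", "t1", "t2", "t3"}
predicted = {"t0", "t1", "t2", "t4", "t5"}
scores = score_annotations(predicted, gold, candidates)
```

In this toy case there are 3 true positives, 2 false positives, 1 false negative and 4 true negatives, so precision is 3/5, recall is 3/4 and accuracy is 7/10. Reporting all four figures from the same counts is one simple way to make results comparable across studies.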


Source journal

Mendel (subject area: Decision Sciences, Decision Sciences (miscellaneous))

CiteScore

2.20

Self-citation rate

0.00%

Number of publications