Building knowledge graph from public data for predictive analysis: a case study on predicting technology future in space and time

Weiwei Duan, Yao-Yi Chiang
Published in: International Workshop on Analytics for Big Geospatial Data, 2016-10-31
DOI: 10.1145/3006386.3006388 (https://doi.org/10.1145/3006386.3006388)
Citations: 9

Abstract

A domain expert can process heterogeneous data to make meaningful interpretations or predictions from the data. For example, by looking at research papers and patent records, an expert can determine the maturity of an emerging technology and predict the geographic location(s) and time (e.g., in a certain year) where and when the technology will be a success. However, this is an expert- and manual-intensive task. This paper presents an end-to-end system that integrates heterogeneous data sources into a knowledge graph in the RDF (Resource Description Framework) format using an ontology. Then the user can easily query the knowledge graph to prepare the required data for different types of predictive analysis tools. We show a case study of predicting the (geographic) center(s) of fuel cell technologies using data collected from public sources to demonstrate the feasibility of our system. The system extracts, cleanses, and augments data from public sources including research papers and patent records. Next, the system uses an ontology-based data integration method to generate knowledge graphs in the RDF format to enable users to switch quickly between machine learning models for predictive analytic tasks. We tested the system using the Support Vector Machine and Multiple Hidden Markov Models and achieved 66.7% and 83.3% accuracy on the city and year levels of spatial and temporal resolutions, respectively.
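
The integrate-then-query step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: plain Python tuples stand in for an RDF store, the namespace and the predicate names (locatedIn, publishedIn) are hypothetical rather than taken from the paper's ontology, and the three records are made-up examples, not the authors' data.

```python
from collections import Counter

# Hypothetical namespace; the paper's actual ontology is not given in the abstract.
EX = "http://example.org/"


def record_to_triples(record_id, kind, city, year):
    """Map one cleaned source record to subject-predicate-object triples."""
    subject = EX + record_id
    return {
        (subject, "rdf:type", EX + kind),
        (subject, EX + "locatedIn", city),     # assumed predicate name
        (subject, EX + "publishedIn", year),   # assumed predicate name
    }


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def counts_by_city_year(graph):
    """Count records per (city, year): a simple feature table for a downstream model."""
    counts = Counter()
    for subject, _, city in match(graph, p=EX + "locatedIn"):
        for _, _, year in match(graph, s=subject, p=EX + "publishedIn"):
            counts[(city, year)] += 1
    return counts


# Made-up records standing in for extracted papers and patents.
graph = set()
graph |= record_to_triples("paper1", "Paper", "Los Angeles", 2010)
graph |= record_to_triples("patent1", "Patent", "Los Angeles", 2010)
graph |= record_to_triples("paper2", "Paper", "Tokyo", 2011)

features = counts_by_city_year(graph)
print(sorted(features.items()))
```

Because papers and patents share the same predicates once they are mapped into the graph, one query yields a feature table that could be handed to either kind of model mentioned in the abstract. A real system would more likely use an RDF library and SPARQL queries than hand-rolled pattern matching.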