A Text and Data Analytics Approach to Enrich the Quality of Unstructured Research Information

Otmane Azeroual
DOI: 10.5539/cis.v12n4p84 (https://doi.org/10.5539/cis.v12n4p84)
Journal: J. Chem. Inf. Comput. Sci.
Pages: 84-95
Published: 2019-10-30
Publication type: Journal Article
Citations: 1

Abstract

As research information becomes more accessible, demands grow on research information systems (RIS), which are expected to generate and process knowledge automatically. At the same time, the quality of the data entries that individual information sources contribute to a RIS causes problems. When data in a RIS is structured, users can read and filter it to meet their information and knowledge needs without difficulty. Unstructured text calls for a different technique: text mining (or text data mining), which builds on the principles of data mining to analyze text databases and text sources and to extract knowledge from previously unseen texts. Text mining makes it possible to classify large, heterogeneous sources of research information automatically and to assign them to specific topics. Research information has always played a major role in higher education and academic institutions, yet in RIS it has usually been available in unstructured form, and unstructured data grows faster than structured data. As a result, RIS staff at universities can lose time searching, and poor decisions can follow. For this reason, the present paper proposes a new approach to obtaining structured research information from heterogeneous information systems. It forms part of a broader approach to the semantic integration of unstructured data, using a RIS as the example. The purpose of this paper is to investigate text and data mining methods in the context of RIS and to develop a quality improvement model that helps universities and academic institutions using RIS to enrich unstructured research information.
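
To make the idea of assigning unstructured research information to topics concrete, the following is a minimal sketch and not the paper's method. It assumes a simple keyword-count heuristic; the topic names, keyword sets, sample records, and the functions tokenize and assign_topic are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical topic vocabularies (an assumption for this sketch);
# a real RIS would more likely learn these from labeled records.
TOPIC_KEYWORDS = {
    "publications": {"journal", "article", "doi", "citation", "published"},
    "projects": {"project", "grant", "funding", "funded", "budget"},
    "patents": {"patent", "invention", "inventor", "filing"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def assign_topic(text, topics=TOPIC_KEYWORDS):
    """Return the topic whose keywords occur most often in text, or None."""
    counts = Counter(tokenize(text))
    # Score each topic by the total number of keyword occurrences.
    scores = {name: sum(counts[word] for word in words)
              for name, words in topics.items()}
    best = max(scores, key=scores.get)
    # A score of zero means no keyword matched, so no topic is assigned.
    return best if scores[best] > 0 else None


# Hypothetical free-text RIS entries.
records = [
    "Article published in a peer-reviewed journal, DOI pending.",
    "Grant funding approved for a three-year project.",
    "Meeting notes without any recognizable terms.",
]
labels = [assign_topic(record) for record in records]
print(labels)  # ['publications', 'projects', None]
```

A keyword count is only the simplest possible stand-in for text mining. It does show the shape of the task: free text goes in, a structured topic label comes out, and entries that match no topic stay unlabeled for manual review.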