Methods for Intelligent Data Analysis Based on Keywords and Implicit Relations: The Case of "ISTINA" Data Analysis System

V. Vasenin, K. Lunev, S. Afonin, D. Shachnev
Published in: 2019 Actual Problems of Systems and Software Engineering (APSSE), 2019-11-01
DOI: 10.1109/APSSE47353.2019.00027 (https://doi.org/10.1109/APSSE47353.2019.00027)
Citation count: 5

Abstract

Information analysis systems that work with big data often need to classify objects and to calculate the degree of thematic proximity between two objects. Keywords attributed to the objects of the system are a natural source of data for these tasks. This paper describes a model for calculating the degree of thematic proximity between two keywords, as well as between two sets of keywords. The model is based on the contextual proximity of keywords, defined as the number of keyword sets in which the two keywords occur together. The final proximity coefficient also takes into account keyword properties such as degree of abstractness and thematic belonging. Several ways of applying the model to practical tasks are described, using the example of the "ISTINA" scientometric data analysis system at Lomonosov Moscow State University.
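The contextual proximity described in the abstract (the number of keyword sets that contain both keywords) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the sample data, and the way set-to-set proximity is aggregated (a plain average over cross pairs) are assumptions made for illustration. The adjustments for abstractness degree and thematic belonging are omitted because the abstract does not specify them.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_sets):
    """For each unordered keyword pair, count the sets that contain both."""
    counts = Counter()
    for keywords in keyword_sets:
        # sorted() gives every pair a canonical (a, b) order with a < b
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def contextual_proximity(counts, a, b):
    """Number of sets in which keywords a and b occur together."""
    return counts[tuple(sorted((a, b)))]


def set_proximity(counts, set_a, set_b):
    """Hypothetical aggregation: mean contextual proximity over all cross pairs.

    The paper's actual set-to-set formula is not given in the abstract.
    """
    pairs = [(a, b) for a in set_a for b in set_b]
    if not pairs:
        return 0.0
    return sum(contextual_proximity(counts, a, b) for a, b in pairs) / len(pairs)


# Illustrative keyword sets, e.g. one per publication (made-up data)
papers = [
    {"graph theory", "algorithms", "complexity"},
    {"graph theory", "algorithms"},
    {"machine learning", "algorithms"},
]
counts = cooccurrence_counts(papers)
print(contextual_proximity(counts, "graph theory", "algorithms"))
print(set_proximity(counts, {"graph theory"}, {"algorithms", "complexity"}))
```

Raw co-occurrence counts favor frequent, general keywords, which is presumably why the model also accounts for abstractness. A production version would need some normalization along those lines.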