Designing a Data Warehouse for Collected Data About User Activity in Social Networks Using Elasticsearch

Path of Science Pub Date : 2023-07-31 DOI:10.22178/pos.94-13
I. Mysiuk
Citations: 1

Abstract

In this paper, a data warehouse is designed to store data collected from social networks. The paper describes creating indexes with data and selecting a configuration with an appropriate number of shards and replicas, along with the primary states of the cluster and the possibilities for scaling it. The features of working with the non-relational Elasticsearch database are described for data on user activity in social network posts. Among social networks, Facebook and Instagram were chosen for analysis. The paper describes the advantages and disadvantages of using such a data store compared to Apache Kafka. Existing data insertion Application Programming Interfaces (APIs) and data visualisation tools integrated with Elasticsearch are analysed. The study describes the use of the Bulk API to insert many records into the database at once. The designed data warehouse uses Kibana, a data visualisation and analytics tool integrated with the selected database. The ability to insert and view logs using Elasticsearch, Logstash, and Kibana (the ELK stack) is also shown. Data ingest was tested by logging to the database using Beats. The obtained results can help implement a system for analysing user activity from social network data with Elasticsearch as the central component.
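The abstract mentions two concrete steps: choosing the number of shards and replicas for an index, and inserting many records at once through the Bulk API. The following is a minimal sketch of what the corresponding request bodies might look like, not the paper's own implementation. The index name `user-activity`, the helper functions, and the record fields are hypothetical and chosen only for illustration; the sketch builds the bodies with the standard library and does not contact a cluster.

```python
import json


def index_settings(shards: int, replicas: int) -> dict:
    """Build a request body for creating an index (sent as PUT /<index-name>)."""
    if shards < 1 or replicas < 0:
        raise ValueError("need at least one shard and a non-negative replica count")
    return {"settings": {"number_of_shards": shards, "number_of_replicas": replicas}}


def bulk_payload(index: str, docs: list) -> str:
    """Build an NDJSON body for POST /_bulk: an action line, then the document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical activity records; the field names are assumptions, not the paper's schema.
posts = [
    {"network": "facebook", "post_id": "p1", "likes": 12, "comments": 3},
    {"network": "instagram", "post_id": "p2", "likes": 40, "comments": 7},
]

settings = index_settings(shards=1, replicas=1)
payload = bulk_payload("user-activity", posts)

print(json.dumps(settings))
print(payload, end="")
```

In practice the bulk body would be sent with the `Content-Type: application/x-ndjson` header. The shard and replica counts here are placeholders; suitable values depend on data volume and the number of nodes in the cluster.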