Big data analytics: Latest Articles (Page 2)

Cyberpsychology: A Longitudinal Analysis of Cyber Adversarial Tactics and Techniques

Big data analytics Pub Date : 2023-08-11 DOI: 10.3390/analytics2030035

Marshall S. Rich

Abstract: The rapid proliferation of cyberthreats necessitates a robust understanding of their evolution and associated tactics, as found in this study. A longitudinal analysis of these threats was conducted, utilizing a six-year data set obtained from a deception network, which emphasized its significance in the study’s primary aim: the exhaustive exploration of the tactics and strategies utilized by cybercriminals and how these tactics and techniques evolved in sophistication and target specificity over time. Different cyberattack instances were dissected and interpreted, with the patterns behind target selection shown. The focus was on unveiling patterns behind target selection and highlighting recurring techniques and emerging trends. The study’s methodological design incorporated data preprocessing, exploratory data analysis, clustering and anomaly detection, temporal analysis, and cross-referencing. The validation process underscored the reliability and robustness of the findings, providing evidence of increasingly sophisticated, targeted cyberattacks. The work identified three distinct network traffic behavior clusters and temporal attack patterns. A validated scoring mechanism provided a benchmark for network anomalies, applicable for predictive analysis and facilitating comparative study of network behaviors. This benchmarking aids organizations in proactively identifying and responding to potential threats. The study significantly contributed to the cybersecurity discourse, offering insights that could guide the development of more effective defense strategies. The need for further investigation into the nature of detected anomalies was acknowledged, advocating for continuous research and proactive defense strategies in the face of the constantly evolving landscape of cyberthreats.

Citations: 0

Prediction of Stroke Disease with Demographic and Behavioural Data Using Random Forest Algorithm

Big data analytics Pub Date : 2023-08-02 DOI: 10.3390/analytics2030034

O. Shobayo, Oluwafemi Zachariah, M. Odusami, Bayode Ogunleye

Citations: 0

Identification of Patterns in the Stock Market through Unsupervised Algorithms

Big data analytics Pub Date : 2023-07-27 DOI: 10.3390/analytics2030033

Adrian Barradas, R. Cantón-Croda, D. Gibaja-Romero

Citations: 0

Streamflow Estimation through Coupling of Hierarchical Clustering Analysis and Regression Analysis—A Case Study in Euphrates-Tigris Basin

Big data analytics Pub Date : 2023-07-13 DOI: 10.3390/analytics2030032

Goksel Ezgi Guzey, Bihrat Onoz

Abstract: In this study, the resilience of designed water systems in the face of limited streamflow gauging stations and escalating global warming impacts were investigated. By performing a regression analysis, simulated meteorological data with observed streamflow from 1971 to 2020 across 33 stream gauging stations in the Euphrates-Tigris Basin were correlated. Utilizing the Ordinary Least Squares regression method, streamflow for 2020–2100 using simulated meteorological data under RCP 4.5 and RCP 8.5 scenarios in CORDEX-EURO and CORDEX-MENA domains were also predicted. Streamflow variability was calculated based on meteorological variables and station morphological characteristics, particularly evapotranspiration. Hierarchical clustering analysis identified two clusters among the stream gauging stations, and for each cluster, two streamflow equations were derived. The regression analysis achieved robust streamflow predictions using six representative climate variables, with adj. R² values of 0.7–0.85 across all models, primarily influenced by evapotranspiration. The use of a global model led to a 10% decrease in prediction capabilities for all CORDEX models based on R² performance. This study emphasizes the importance of region homogeneity in estimating streamflow, encompassing both geographical and hydro-meteorological characteristics.

Citations: 0

Hierarchical Model-Based Deep Reinforcement Learning for Single-Asset Trading

Big data analytics Pub Date : 2023-07-11 DOI: 10.3390/analytics2030031

Adrian Millea

Abstract: We present a hierarchical reinforcement learning (RL) architecture that employs various low-level agents to act in the trading environment, i.e., the market. The highest-level agent selects from among a group of specialized agents, and then the selected agent decides when to sell or buy a single asset for a period of time. This period can be variable according to a termination function. We hypothesized that, due to different market regimes, more than one single agent is needed when trying to learn from such heterogeneous data, and instead, multiple agents will perform better, with each one specializing in a subset of the data. We use k-means clustering to partition the data and train each agent with a different cluster. Partitioning the input data also helps model-based RL (MBRL), where models can be heterogeneous. We also add two simple decision-making models to the set of low-level agents, diversifying the pool of available agents, and thus increasing overall behavioral flexibility. We perform multiple experiments showing the strengths of a hierarchical approach and test various prediction models at both levels. We also use a risk-based reward at the high level, which transforms the overall problem into a risk-return optimization. This type of reward shows a significant reduction in risk while minimally reducing profits. Overall, the hierarchical approach shows significant promise, especially when the pool of low-level agents is highly diverse. The usefulness of such a system is clear, especially for human-devised strategies, which could be incorporated in a sound manner into larger, powerful automatic systems.
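The partitioning step mentioned in this abstract (k-means clustering, then one agent per cluster) can be illustrated with a minimal sketch. This is not the author's implementation: the feature choice (mean return and volatility per window), the toy values, and the function names `kmeans` and `partition` are assumptions made purely for illustration, and the agents themselves are left out.

```python
import random
from statistics import mean


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(tuple(mean(dim) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, labels


def partition(points, labels, k):
    """Group points by cluster label so each low-level agent gets its own subset."""
    subsets = [[] for _ in range(k)]
    for p, l in zip(points, labels):
        subsets[l].append(p)
    return subsets


# Hypothetical toy features per time window: (mean return, volatility).
windows = [(0.01, 0.10), (0.02, 0.12), (0.015, 0.11),
           (-0.03, 0.40), (-0.02, 0.45), (-0.025, 0.42)]
centroids, labels = kmeans(windows, k=2)
subsets = partition(windows, labels, k=2)
```

In this toy setup the calm and volatile windows end up in separate subsets, and each subset would then serve as the training data for one specialized agent. Which cluster receives which index depends on the random initialization.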

Citations: 0

occams: A Text Summarization Package

Big data analytics Pub Date : 2023-06-30 DOI: 10.3390/analytics2030030

Clinton T. White, Neil P. Molino, Julia S. Yang, John M. Conroy

Citations: 1

Bayesian Mixture Copula Estimation and Selection with Applications

Big data analytics Pub Date : 2023-06-15 DOI: 10.3390/analytics2020029

Yujian Liu, Dejun Xie, Siyi Yu

Citations: 3

Preliminary Perspectives on Information Passing in the Intelligence Community

Big data analytics Pub Date : 2023-06-15 DOI: 10.3390/analytics2020028

Jeremy E. Block, Ilana Bookner, S. Chu, R. J. Crouser, Donald R. Honeycutt, Rebecca M. Jonas, Abhishek Kulkarni, Yancy Vance M. Paredes, E. Ragan

Abstract: Analyst sensemaking research typically focuses on individual or small groups conducting intelligence tasks. This has helped understand information retrieval tasks and how people communicate information. As a part of the grand challenge of the Summer Conference on Applied Data Science (SCADS) to build a system that can generate tailored daily reports (TLDR) for intelligence analysts, we conducted a qualitative interview study with analysts to increase understanding of information passing in the intelligence community. While our results are preliminary, we expect that this work will contribute to a better understanding of the information ecosystem of the intelligence community, how institutional dynamics affect information passing, and what implications this has for a TLDR system. This work describes our involvement in and work completed during SCADS. Although preliminary, we identify that information passing is both a formal and informal process and often follows professional networks due especially to the small population and specialization of work. We call attention to the need for future analysis of information ecosystems to better support tailored information retrieval features.

Citations: 0

Spatiotemporal Data Mining Problems and Methods

Big data analytics Pub Date : 2023-06-14 DOI: 10.3390/analytics2020027

Eleftheria Koutsaki, George Vardakis, N. Papadakis

Abstract: Many scientific fields show great interest in the extraction and processing of spatiotemporal data, such as medicine with an emphasis on epidemiology and neurology, geology, social sciences, meteorology, and a great interest is also observed in the study of transport. Spatiotemporal data differ significantly from spatial data, since spatiotemporal data refer to measurements, which take into account both the place and the time in which they are received, with their respective characteristics, while spatial data refer to and describe information related only to place. The innovation brought about by spatiotemporal data mining has caused a revolution in many scientific fields, and this is because through it we can now provide solutions and answers to complex problems, as well as provide useful and valuable predictions, through predictive learning. However, combining time and place in data mining presents significant challenges and difficulties that must be overcome. Spatiotemporal data mining and analysis is a relatively new approach to data mining which has been studied more systematically in the last decade. The purpose of this article is to provide a good introduction to spatiotemporal data, and through this detailed description, we attempt to introduce descriptive logic and gain a complete knowledge of these data. We aim to introduce a new way of describing them, aiming for future studies, by combining the expressions that arise by type of data, using descriptive logic, with new expressions, that can be derived, to describe future states of objects and environments with great precision, providing accurate predictions. In order to highlight the value of spatiotemporal data, we proceed to give a brief description of ST data in the introduction. We describe the relevant work carried out to date, the types of spatiotemporal (ST) data, their properties and the transformations that can be made between them, attempting, to a small extent, to introduce constraints and rules using descriptive logic, introducing descriptive logic into spatiotemporal data by type, when initially presenting the ST data. The data snapshots by species and similarities between the cases are then described. We describe methods, introducing clustering, dynamic ST clusters, predictive learning, pattern mining frequency, and pattern emergence, and problems such as anomaly detection, identifying time points of changes in the behavior of the observed object, and development of relationships between them. We describe the application of ST data in various fields today, as well as the future work. We finally conclude with our conclusions, with the representation and study of spatiotemporal data can, in combination with other properties which accompany all natural phenomena, through their appropriate processing, lead to safe conclusions regarding the study of problems, and also with great precision in the extraction of predictions by accurately determining future states of an environmen

Citations: 0

A Novel Zero-Truncated Katz Distribution by the Lagrange Expansion of the Second Kind with Associated Inferences

Big data analytics Pub Date : 2023-06-01 DOI: 10.3390/analytics2020026

D. S. Shibu, C. Chesneau, M. Monisha, R. Maya, M. Irshad

Abstract: In this article, the Lagrange expansion of the second kind is used to generate a novel zero-truncated Katz distribution; we refer to it as the Lagrangian zero-truncated Katz distribution (LZTKD). Notably, the zero-truncated Katz distribution is a special case of this distribution. Along with the closed form expression of all its statistical characteristics, the LZTKD is proven to provide an adequate model for both underdispersed and overdispersed zero-truncated count datasets. Specifically, we show that the associated hazard rate function has increasing, decreasing, bathtub, or upside-down bathtub shapes. Moreover, we demonstrate that the LZTKD belongs to the Lagrangian distribution of the first kind. Then, applications of the LZTKD in statistical scenarios are explored. The unknown parameters are estimated using the well-reputed method of the maximum likelihood. In addition, the generalized likelihood ratio test procedure is applied to test the significance of the additional parameter. In order to evaluate the performance of the maximum likelihood estimates, simulation studies are also conducted. The use of real-life datasets further highlights the relevance and applicability of the proposed model.
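For readers unfamiliar with the term "zero-truncated", the generic construction can be written out as follows. This is the standard definition for any count distribution with probability mass function p(x) on x = 0, 1, 2, ... and p(0) < 1; it is not the LZTKD mass function itself, which the abstract does not state, and the symbols p and P_ZT are notation assumed here.

```latex
\[
  P_{\mathrm{ZT}}(X = x) \;=\; \frac{p(x)}{1 - p(0)}, \qquad x = 1, 2, 3, \dots
\]
```

Dividing by 1 - p(0) renormalizes the remaining probabilities so that they sum to one after the zero count is removed.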

Citations: 0