International Journal of Data Warehousing and Mining: Latest Articles

Fusing Syntax and Semantics-Based Graph Convolutional Network for Aspect-Based Sentiment Analysis
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-03-17 DOI: 10.4018/ijdwm.319803
Jinhui Feng, Shaohua Cai, Kuntao Li, Yifan Chen, Qianhua Cai, Hongya Zhao
{"title":"Fusing Syntax and Semantics-Based Graph Convolutional Network for Aspect-Based Sentiment Analysis","authors":"Jinhui Feng, Shaohua Cai, Kuntao Li, Yifan Chen, Qianhua Cai, Hongya Zhao","doi":"10.4018/ijdwm.319803","DOIUrl":"https://doi.org/10.4018/ijdwm.319803","url":null,"abstract":"Aspect-based sentiment analysis (ABSA) aims to classify the sentiment polarity of a given aspect in a sentence or document, which is a fine-grained task of natural language processing. Recent ABSA methods mainly focus on exploiting the syntactic information, the semantic information and both. Research on cognition theory reveals that the syntax and the semantics have effects on each other. In this work, a graph convolutional network-based model that fuses the syntactic information and semantic information in line with the cognitive practice is proposed. To start with, the GCN is taken to extract syntactic information on the syntax dependency tree. Then, the semantic graph is constructed via a multi-head self-attention mechanism and encoded by GCN. Furthermore, a parameter-sharing GCN is developed to capture the common information between the semantics and the syntax. Experiments conducted on three benchmark datasets (Laptop14, Restaurant14 and Twitter) validate that the proposed model achieves compelling performance comparing with the state-of-the-art models.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"22 1","pages":"1-15"},"PeriodicalIF":1.2,"publicationDate":"2023-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85826520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
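The abstract above describes running a GCN over the syntax dependency tree. As a rough illustration only (not the authors' model), a single graph-convolution step with the commonly used symmetric normalization can be sketched in NumPy. The toy sentence, the one-hot features, and the all-ones weight matrix are placeholders assumed for this example.

```python
import numpy as np

def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weight, 0.0)

# Hypothetical dependency tree for "food was great":
# "was" is linked to "food" and to "great" (edges treated as undirected).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
features = np.eye(3)        # placeholder one-hot token features
weight = np.ones((3, 2))    # fixed weights, for illustration only
out = gcn_layer(adj, features, weight)
```

In a trained model the weight matrix would be learned and the features would come from an encoder; the sketch only shows how each token aggregates information from its syntactic neighbours.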
CTNRL: A Novel Network Representation Learning With Three Feature Integrations
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-03-03 DOI: 10.4018/ijdwm.318696
Yanlong Tang, Zhonglin Ye, Haixing Zhao, Yi Ji
{"title":"CTNRL: A Novel Network Representation Learning With Three Feature Integrations","authors":"Yanlong Tang, Zhonglin Ye, Haixing Zhao, Yi Ji","doi":"10.4018/ijdwm.318696","DOIUrl":"https://doi.org/10.4018/ijdwm.318696","url":null,"abstract":"Network representation learning is one of the important works of analyzing network information. Its purpose is to learn a vector for each node in the network and map it into the vector space, and the resulting number of node dimensions is much smaller than the number of nodes in the network. Most of the current work only considers local features and ignores other features in the network, such as attribute features. Aiming at such problems, this paper proposes novel mechanisms of combining network topology, which models node text information and node clustering information on the basis of network structure and then constrains the learning process of network representation to obtain the optimal network node vector. The method is experimentally verified on three datasets: Citeseer (M10), DBLP (V4), and SDBLP. Experimental results show that the proposed method is better than the algorithm based on network topology and text feature. Good experimental results are obtained, which verifies the feasibility of the algorithm and achieves the expected experimental results.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"19 1","pages":"1-14"},"PeriodicalIF":1.2,"publicationDate":"2023-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70455633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Research on Rumor Detection Based on a Graph Attention Network With Temporal Features
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-03-02 DOI: 10.4018/ijdwm.319342
Xiaohui Yang, Hailong Ma, Miao Wang
{"title":"Research on Rumor Detection Based on a Graph Attention Network With Temporal Features","authors":"Xiaohui Yang, Hailong Ma, Miao Wang","doi":"10.4018/ijdwm.319342","DOIUrl":"https://doi.org/10.4018/ijdwm.319342","url":null,"abstract":"The higher-order and temporal characteristics of tweet sequences are often ignored in the field of rumor detection. In this paper, a new rumor detection method (T-BiGAT) is proposed to capture the temporal features between tweets by combining a graph attention network (GAT) and gated recurrent neural network (GRU). First, timestamps are calculated for each tweet within the same event. On the premise of the same timestamp, two different propagation subgraphs are constructed according to the response relationship between tweets. Then, GRU is used to capture intralayer dependencies between sibling nodes in the subtree; global features of each subtree are extracted using an improved GAT. Furthermore, GRU is reused to capture the temporal dependencies of individual subgraphs at different timestamps. Finally, weights are assigned to the global feature vectors of different timestamp subtrees for aggregation, and a mapping function is used to classify the aggregated vectors.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"47 1","pages":"1-17"},"PeriodicalIF":1.2,"publicationDate":"2023-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86331233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Clustering of COVID-19 Multi-Time Series-Based K-Means and PCA With Forecasting
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-02-03 DOI: 10.4018/ijdwm.317374
Sundus Naji Alaziz, Bakr Albayati, A. A. El-Bagoury, Wasswa Shafik
{"title":"Clustering of COVID-19 Multi-Time Series-Based K-Means and PCA With Forecasting","authors":"Sundus Naji Alaziz, Bakr Albayati, A. A. El-Bagoury, Wasswa Shafik","doi":"10.4018/ijdwm.317374","DOIUrl":"https://doi.org/10.4018/ijdwm.317374","url":null,"abstract":"The COVID-19 pandemic is one of the current universal threats to humanity. The entire world is cooperating persistently to find some ways to decrease its effect. The time series is one of the basic criteria that play a fundamental part in developing an accurate prediction model for future estimations regarding the expansion of this virus with its infective nature. The authors discuss in this paper the goals of the study, problems, definitions, and previous studies. Also they deal with the theoretical aspect of multi-time series clusters using both the K-means and the time series cluster. In the end, they apply the topics, and ARIMA is used to introduce a prototype to give specific predictions about the impact of the COVID-19 pandemic from 90 to 140 days. The modeling and prediction process is done using the available data set from the Saudi Ministry of Health for Riyadh, Jeddah, Makkah, and Dammam during the previous four months, and the model is evaluated using the Python program. Based on this proposed method, the authors address the conclusions.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"3 1","pages":"1-25"},"PeriodicalIF":1.2,"publicationDate":"2023-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90155954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Combining BPSO and ELM Models for Inferring Novel lncRNA-Disease Associations
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-01-20 DOI: 10.4018/ijdwm.317092
W. Yang, Xianghan Zheng, Qiongxia Huang, Yu Liu, Yimi Chen, ZhiGang Song
{"title":"Combining BPSO and ELM Models for Inferring Novel lncRNA-Disease Associations","authors":"W. Yang, Xianghan Zheng, Qiongxia Huang, Yu Liu, Yimi Chen, ZhiGang Song","doi":"10.4018/ijdwm.317092","DOIUrl":"https://doi.org/10.4018/ijdwm.317092","url":null,"abstract":"It has been widely known that long non-coding RNA (lncRNA) plays an important role in gene expression and regulation. However, due to a few characteristics of lncRNA (e.g., huge amounts of data, high dimension, lack of noted samples, etc.), identifying key lncRNA closely related to specific disease is nearly impossible. In this paper, the authors propose a computational method to predict key lncRNA closely related to its corresponding disease. The proposed solution implements a BPSO based intelligent algorithm to select possible optimal lncRNA subset, and then uses ML-ELM based deep learning model to evaluate each lncRNA subset. After that, wrapper feature extraction method is used to select lncRNAs, which are closely related to the pathophysiology of disease from massive data. Experimentation on three typical open datasets proves the feasibility and efficiency of our proposed solution. This proposed solution achieves above 93% accuracy, the best ever.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"14 1","pages":"1-18"},"PeriodicalIF":1.2,"publicationDate":"2023-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75810785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Spatiotemporal Data Prediction Model Based on a Multi-Layer Attention Mechanism
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-01-16 DOI: 10.4018/ijdwm.315822
Man Jiang, Qilong Han, Haitao Zhang, Hexiang Liu
{"title":"Spatiotemporal Data Prediction Model Based on a Multi-Layer Attention Mechanism","authors":"Man Jiang, Qilong Han, Haitao Zhang, Hexiang Liu","doi":"10.4018/ijdwm.315822","DOIUrl":"https://doi.org/10.4018/ijdwm.315822","url":null,"abstract":"Spatiotemporal data prediction is of great significance in the fields of smart cities and smart manufacturing. Current spatiotemporal data prediction models heavily rely on traditional spatial views or single temporal granularity, which suffer from missing knowledge, including dynamic spatial correlations, periodicity, and mutability. This paper addresses these challenges by proposing a multi-layer attention-based predictive model. The key idea of this paper is to use a multi-layer attention mechanism to model the dynamic spatial correlation of different features. Then, multi-granularity historical features are fused to predict future spatiotemporal data. Experiments on real-world data show that the proposed model outperforms six state-of-the-art benchmark methods.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"8 1","pages":"1-15"},"PeriodicalIF":1.2,"publicationDate":"2023-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82940997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A New Outlier Detection Algorithm Based on Fast Density Peak Clustering Outlier Factor
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-01-13 DOI: 10.4018/ijdwm.316534
Zhongping Zhang, Sen Li, Weixiong Liu, Y. Wang, Daisy Xin Li
{"title":"A New Outlier Detection Algorithm Based on Fast Density Peak Clustering Outlier Factor","authors":"Zhongping Zhang, Sen Li, Weixiong Liu, Y. Wang, Daisy Xin Li","doi":"10.4018/ijdwm.316534","DOIUrl":"https://doi.org/10.4018/ijdwm.316534","url":null,"abstract":"Outlier detection is an important field in data mining, which can be used in fraud detection, fault detection, and other fields. This article focuses on the problem where the density peak clustering algorithm needs a manual parameter setting and time complexity is high; the first is to use the k nearest neighbors clustering algorithm to replace the density peak of the density estimate, which adopts the KD-Tree index data structure calculation of data objects k close neighbors. Then it adopts the method of the product of density and distance automatic selection of clustering centers. In addition, the central relative distance and fast density peak clustering outliers were defined to characterize the degree of outliers of data objects. Then, based on fast density peak clustering outliers, an outlier detection algorithm was devised. Experiments on artificial and real data sets are performed to validate the algorithm, and the validity and time efficiency of the proposed algorithm are validated when compared to several conventional and innovative algorithms.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"21 1","pages":"1-19"},"PeriodicalIF":1.2,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78453444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
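The abstract above mentions estimating density from k nearest neighbours and selecting cluster centers by the product of density and distance. The sketch below illustrates only that center-selection idea under stated assumptions: density is taken as the inverse of the mean k-nearest-neighbour distance, neighbours are found by brute force rather than with a KD-tree, and the paper's outlier factor is not reproduced. The function name and toy data are hypothetical.

```python
import numpy as np

def density_peak_scores(points, k):
    """Return (rho, delta, gamma) for each point, with gamma = rho * delta."""
    # Brute-force pairwise Euclidean distances (a KD-tree would scale better).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Assumed density estimate: inverse mean distance to the k nearest
    # neighbours; column 0 of the sorted rows is the point itself.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    rho = 1.0 / knn.mean(axis=1)
    # delta: distance to the nearest point of higher density; a point with
    # no denser neighbour gets its largest distance instead.
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta, rho * delta

# Two small synthetic clusters, centered at (0, 0) and (5, 5).
offsets = np.array([[0, 0], [0.1, 0], [0, 0.1], [-0.1, 0], [0, -0.1]])
points = np.vstack([offsets, offsets + 5.0])
rho, delta, gamma = density_peak_scores(points, k=2)
# The two largest gamma values mark the candidate cluster centers.
centers = sorted(int(i) for i in np.argsort(gamma)[-2:])
```

Points that are both dense and far from any denser point get a large product, which is why ranking by `gamma` can replace manual selection on a decision graph.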
Estimating the Number of Clusters in High-Dimensional Large Datasets
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-01-13 DOI: 10.4018/ijdwm.316142
Xutong Zhu, Lingli Li
{"title":"Estimating the Number of Clusters in High-Dimensional Large Datasets","authors":"Xutong Zhu, Lingli Li","doi":"10.4018/ijdwm.316142","DOIUrl":"https://doi.org/10.4018/ijdwm.316142","url":null,"abstract":"Clustering is a basic primer of exploratory tasks. In order to obtain valuable results, the parameters in the clustering algorithm, the number of clusters must be set appropriately. Existing methods for determining the number of clusters perform well on low-dimensional small datasets, but how to effectively determine the optimal number of clusters on large high-dimensional datasets is still a challenging problem. In this paper, the authors design a method for effectively estimating the optimal number of clusters on large-scale high-dimensional datasets that can overcome the shortcomings of existing estimation methods and accurately and quickly estimate the optimal number of clusters on large-scale high-dimensional datasets. Extensive experiments show that it (1) outperforms existing estimation methods in accuracy and efficiency, (2) generalizes across different datasets, and (3) is suitable for high-dimensional large datasets.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"42 1","pages":"1-14"},"PeriodicalIF":1.2,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88955315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Ensemble Approach for Prediction of Cardiovascular Disease Using Meta Classifier Boosting Algorithms
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-01-13 DOI: 10.4018/ijdwm.316145
Sibo Prasad Patro, Neelamadhab Padhy, Rahul Deo Sah
{"title":"An Ensemble Approach for Prediction of Cardiovascular Disease Using Meta Classifier Boosting Algorithms","authors":"Sibo Prasad Patro, Neelamadhab Padhy, Rahul Deo Sah","doi":"10.4018/ijdwm.316145","DOIUrl":"https://doi.org/10.4018/ijdwm.316145","url":null,"abstract":"There are very few studies are carried for investigating the potential of hybrid ensemble machine learning techniques for building a model for the detection and prediction of heart disease in the human body. In this research, the authors deal with a classification problem that is a hybridization of fusion-based ensemble model with machine learning approaches, which produces a more trustworthy ensemble than the original ensemble model and outperforms previous heart disease prediction models. The proposed model is evaluated on the Cleveland heart disease dataset using six boosting techniques named XGBoost, AdaBoost, Gradient Boosting, LightGBM, CatBoost, and Histogram-Based Gradient Boosting. Hybridization produces superior results under consideration of classification algorithms. The remarkable accuracies of 96.51% for training and 93.37% for testing have been achieved by the Meta-XGBoost classifier.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":" ","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45477411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Efficient Association Rule Mining-Based Spatial Keyword Index
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2023-01-13 DOI: 10.4018/ijdwm.316161
Lianyin Jia, Haotian Tang, Mengjuan Li, Bingxin Zhao, S. Wei, Haihe Zhou
{"title":"An Efficient Association Rule Mining-Based Spatial Keyword Index","authors":"Lianyin Jia, Haotian Tang, Mengjuan Li, Bingxin Zhao, S. Wei, Haihe Zhou","doi":"10.4018/ijdwm.316161","DOIUrl":"https://doi.org/10.4018/ijdwm.316161","url":null,"abstract":"Spatial keyword query has attracted the attention of many researchers. Most of the existing spatial keyword indexes do not consider the differences in keyword distribution, so their efficiencies are not high when data are skewed. To this end, this paper proposes a novel association rule mining based spatial keyword index, ARM-SQ, whose inverted lists are materialized by the frequent item sets mined by association rules; thus, intersections of long lists can be avoided. To prevent excessive space costs caused by materialization, a depth-based materialization strategy is introduced, which maintains a good balance between query and space costs. To select the right frequent item sets for answering a query, the authors further implement a benefit-based greedy frequent item set selection algorithm, BGF-Selection. The experimental results show that this algorithm significantly outperforms the existing algorithms, and its efficiency can be an order of magnitude higher than SFC-Quad.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":"32 1","pages":"1-19"},"PeriodicalIF":1.2,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88182495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1