International Journal of Data Warehousing and Mining: Latest Articles

Enhancing the Diamond Document Warehouse Model
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-10-01 DOI: 10.4018/ijdwm.2020100101
M. Azabou, Ameen Banjar, J. Feki
{"title":"Enhancing the Diamond Document Warehouse Model","authors":"M. Azabou, Ameen Banjar, J. Feki","doi":"10.4018/ijdwm.2020100101","DOIUrl":"https://doi.org/10.4018/ijdwm.2020100101","url":null,"abstract":"The data warehouse community has paid particular attention to the document warehouse (DocW) paradigm during the last two decades. However, some important issues related to semantics are still pending and therefore need in-depth investigation. Indeed, the semantic exploitation of the DocW is not yet mature despite being a main concern for decision-makers. This paper aims to enhance the multidimensional model called the Diamond Document Warehouse Model with semantic aspects; in particular, it suggests semantic OLAP (on-line analytical processing) operators for querying the DocW.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85634852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Improvement of K-Medoids Clustering Algorithm Based on Fixed Point Iteration
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-10-01 DOI: 10.4018/ijdwm.2020100105
Xiaodi Huang, Minglun Ren, Zhongfeng Hu
{"title":"An Improvement of K-Medoids Clustering Algorithm Based on Fixed Point Iteration","authors":"Xiaodi Huang, Minglun Ren, Zhongfeng Hu","doi":"10.4018/ijdwm.2020100105","DOIUrl":"https://doi.org/10.4018/ijdwm.2020100105","url":null,"abstract":"The K-medoids algorithm first selects data points at random as initial centers to form initial clusters. Then, following the PAM (partitioning around medoids) algorithm, the centers are sequentially replaced by each of the remaining data points to find the result with the best inherent convergence. Since PAM is an iterative ergodic strategy, its expensive computational overhead hinders its feasibility when the data size or the number of clusters is huge. The authors use fixed-point iteration to search for the optimal clustering centers and build an FPK-medoids (fixed point-based K-medoids) algorithm. By constructing fixed point equations for each cluster, the problem of searching for optimal centers is converted into solving a set of equations in parallel. The experiment is carried out on six standard datasets, and the results show that the clustering efficiency of the proposed algorithm is significantly improved compared with the conventional algorithm. In addition, the clustering quality is markedly enhanced when handling problems with large-scale datasets or a large number of clusters.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85850552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
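For orientation, the conventional PAM swap search that this abstract names as the costly baseline can be sketched as follows. This is a minimal illustration and not the authors' FPK-medoids method; the function names `total_cost` and `pam`, the Euclidean distance, and the first-improvement swap rule are assumptions made for the example.

```python
import math
import random


def total_cost(points, medoid_idx):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoid_idx) for p in points)


def pam(points, k, seed=0):
    """Greedy swap search: try replacing each medoid with each non-medoid."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # random initial centers
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:  # accept any strict improvement
                    medoids, best, improved = trial, cost, True
    return sorted(medoids), best


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(pam(data, k=2))
```

Every pass evaluates each medoid against each candidate and recomputes the full cost, which is the overhead the abstract says becomes prohibitive for large datasets or many clusters.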
Data Discovery Over Time Series From Star Schemas Based on Association, Correlation, and Causality
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-10-01 DOI: 10.4018/ijdwm.2020100106
Wallace A. Pinheiro, G. Xexéo, J. Souza, A. B. Pinheiro
{"title":"Data Discovery Over Time Series From Star Schemas Based on Association, Correlation, and Causality","authors":"Wallace A. Pinheiro, G. Xexéo, J. Souza, A. B. Pinheiro","doi":"10.4018/ijdwm.2020100106","DOIUrl":"https://doi.org/10.4018/ijdwm.2020100106","url":null,"abstract":"This work proposes a methodology applied to repositories modeled using star schemas, such as data marts, to discover relevant time series relations. This paper applies a set of measures related to association, correlation, and causality to create connections among data. In this context, the research proposes a new causality function based on peaks and values that relate coherently time series. To evaluate the approach, the authors use a set of experiments exploring time series about a particular neglected disease that affects several Brazilian cities called American Tegumentary Leishmaniasis and time series about the climate of some cities in Brazil. The authors populate data marts with these data, and the proposed methodology has generated a set of relations linking the notifications of this disease to the variation of temperature and pluviometry.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83654658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Discovering Specific Sales Patterns Among Different Market Segments
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-07-01 DOI: 10.4018/ijdwm.2020070103
Cheng-Hsiung Weng, Cheng-Kui Huang
{"title":"Discovering Specific Sales Patterns Among Different Market Segments","authors":"Cheng-Hsiung Weng, Cheng-Kui Huang","doi":"10.4018/ijdwm.2020070103","DOIUrl":"https://doi.org/10.4018/ijdwm.2020070103","url":null,"abstract":"Formulating different marketing strategies to apply to various market segments is a noteworthy undertaking for marketing managers. Accordingly, marketing managers should identify sales patterns among different market segments. The study initially applies the concept of recency–frequency–monetary (RFM) scores to segment transaction datasets into several sub-datasets (market segments) and discovers RFM itemsets from these market segments. In addition, three sales features (unique, common, and particular sales patterns) are defined to identify various sales patterns in this study. In particular, a new criterion (contrast support) is also proposed to discover notable sales patterns among different market segments. This study develops an algorithm, called sales pattern mining (SPMING), for discovering RFM itemsets from several RFM-based market segments and then identifying unique, common, and particular sales patterns. The experimental results from two real datasets show that the SPMING algorithm can discover specific sales patterns in various market segments.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81898263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
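The RFM-based segmentation step mentioned in this abstract can be illustrated with a small sketch. It is not the SPMING algorithm; the tuple layout of the transactions, the function names `rfm_table` and `segment`, and the threshold parameters are hypothetical choices made for the example.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    Each transaction is assumed to be (customer, purchase_date, amount).
    """
    last, freq, money = {}, defaultdict(int), defaultdict(float)
    for cust, day, amount in transactions:
        last[cust] = max(last.get(cust, day), day)  # most recent purchase
        freq[cust] += 1                             # number of purchases
        money[cust] += amount                       # total spend
    return {c: ((today - last[c]).days, freq[c], money[c]) for c in last}


def segment(rfm, r_max, f_min, m_min):
    """Group customers by one pass/fail flag per RFM dimension (assumed rule)."""
    segments = defaultdict(list)
    for cust, (r, f, m) in sorted(rfm.items()):
        segments[(r <= r_max, f >= f_min, m >= m_min)].append(cust)
    return dict(segments)


if __name__ == "__main__":
    tx = [
        ("a", date(2020, 6, 1), 50.0),
        ("a", date(2020, 6, 20), 30.0),
        ("b", date(2020, 1, 5), 10.0),
    ]
    table = rfm_table(tx, today=date(2020, 7, 1))
    print(segment(table, r_max=30, f_min=2, m_min=50.0))
```

Each resulting group plays the role of one market segment, and pattern mining would then run per segment.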
Extending LINE for Network Embedding With Completely Imbalanced Labels
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-07-01 DOI: 10.4018/ijdwm.2020070102
Zheng Wang, Qiao Wang, Tanjie Zhu, Xiaojun Ye
{"title":"Extending LINE for Network Embedding With Completely Imbalanced Labels","authors":"Zheng Wang, Qiao Wang, Tanjie Zhu, Xiaojun Ye","doi":"10.4018/ijdwm.2020070102","DOIUrl":"https://doi.org/10.4018/ijdwm.2020070102","url":null,"abstract":"Network embedding is a fundamental problem in network research. Semi-supervised network embedding, which benefits from labeled data, has recently attracted considerable interest. However, existing semi-supervised methods would get biased results in the completely-imbalanced label setting where labeled data cannot cover all classes. This article proposes a novel network embedding method which could benefit from completely-imbalanced labels by approximately guaranteeing both intra-class similarity and inter-class dissimilarity. In addition, the authors prove and adopt the matrix factorization form of LINE (a famous network embedding method) as the network structure preserving model. Extensive experiments demonstrate the superiority and robustness of this method.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83912304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Novel Method for Classifying Function of Spatial Regions Based on Two Sets of Characteristics Indicated by Trajectories
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-07-01 DOI: 10.4018/ijdwm.2020070101
Haitao Zhang, Che Yu, Yan Jin
{"title":"A Novel Method for Classifying Function of Spatial Regions Based on Two Sets of Characteristics Indicated by Trajectories","authors":"Haitao Zhang, Che Yu, Yan Jin","doi":"10.4018/ijdwm.2020070101","DOIUrl":"https://doi.org/10.4018/ijdwm.2020070101","url":null,"abstract":"Trajectory is a significant factor for classifying functions of spatial regions. Many spatial classification methods use trajectories to detect buildings and districts in urban settings. However, methods that only take into consideration the local spatiotemporal characteristics indicated by trajectories may generate inaccurate results. In this article, a novel method for classifying function of spatial regions based on two sets of characteristics indicated by trajectories is proposed, in which the local spatiotemporal characteristics as well as the global connection characteristics are obtained through two sets of calculations. The method was evaluated in two experiments: one that measured changes in the classification metric through a splits ratio factor, and one that compared the classification performance between the proposed method and methods based on a single set of characteristics. The results showed that the proposed method is more accurate than the two traditional methods, with a precision value of 0.93, a recall value of 0.77, and an F-Measure value of 0.84. Keywords: Function of Spatial Regions, Global Connection Characteristics, Local Spatiotemporal Characteristics, Spatial Classification, Trajectory","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77056513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
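The three figures reported in this abstract are mutually consistent if the F-Measure is the usual harmonic mean of precision and recall (an assumption; the abstract does not state the formula). A short check, with `f_measure` as a helper name chosen for the example:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision 0.93 and recall 0.77 are the values quoted in the abstract.
    print(round(f_measure(0.93, 0.77), 2))  # 0.84
```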
A Boosting-Aided Adaptive Cluster-Based Undersampling Approach for Treatment of Class Imbalance Problem
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-07-01 DOI: 10.4018/ijdwm.2020070104
D. Devi, S. Namasudra, Seifedine Kadry
{"title":"A Boosting-Aided Adaptive Cluster-Based Undersampling Approach for Treatment of Class Imbalance Problem","authors":"D. Devi, S. Namasudra, Seifedine Kadry","doi":"10.4018/ijdwm.2020070104","DOIUrl":"https://doi.org/10.4018/ijdwm.2020070104","url":null,"abstract":"Class imbalance is a well-investigated topic which addresses the performance degradation of standard learning models due to the uneven distribution of classes in a dataspace. Cluster-based undersampling is a popular solution in the domain which eliminates majority class instances from a definite number of clusters to balance the training data. However, distance-based elimination of instances is often affected by the underlying data distribution. Recently, ensemble learning techniques have emerged as an effective solution due to their weighted learning principle for rare instances. In this article, a boosting-aided adaptive cluster-based undersampling technique is proposed to facilitate the elimination of learning-insignificant majority class instances from the clusters, detected through the AdaBoost ensemble learning model. The proposed work is validated against seven existing cluster-based undersampling techniques on six binary datasets and three classification models. The experimental results establish the effectiveness of the proposed technique over the existing methods.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81780316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 25
Recommender Systems Using Collaborative Tagging
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-07-01 DOI: 10.4018/ijdwm.2020070110
Latha Banda, Karan Singh, Le Hoang Son, Mohamed Abdel-Basset, Pham Huy Thong, H. Huynh, D. Taniar
{"title":"Recommender Systems Using Collaborative Tagging","authors":"Latha Banda, Karan Singh, Le Hoang Son, Mohamed Abdel-Basset, Pham Huy Thong, H. Huynh, D. Taniar","doi":"10.4018/ijdwm.2020070110","DOIUrl":"https://doi.org/10.4018/ijdwm.2020070110","url":null,"abstract":"Collaborative tagging is a useful and effective way of classifying items with respect to search and sharing information, so that users can be tagged via online social networking. This article proposes a novel recommender system for collaborative tagging in which the genre interestingness measure and gradual decay are utilized with diffusion similarity. The comparison has been done on the benchmark recommender system datasets, namely the MovieLens and Amazon datasets, against existing approaches such as collaborative filtering based on tagging using E-FCM and E-GK clustering algorithms, hybrid recommender systems based on tagging using GA, and collaborative tagging using incremental clustering with trust. The experimental results show that the proposed approach achieves a maximum prediction accuracy ratio of 9.25% averaged over various data splits of 100 users, which is higher than the 5.76% prediction accuracy obtained by the existing approaches.","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87852321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Serialized Co-Training-Based Recognition of Medicine Names for Patent Mining and Retrieval
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-07-01 DOI: 10.4018/ijdwm.2020070105
Na Deng, Caiquan Xiong
{"title":"Serialized Co-Training-Based Recognition of Medicine Names for Patent Mining and Retrieval","authors":"Na Deng, Caiquan Xiong","doi":"10.4018/ijdwm.2020070105","DOIUrl":"https://doi.org/10.4018/ijdwm.2020070105","url":null,"abstract":"In the retrieval and mining of traditional Chinese medicine (TCM) patents, a key step is Chinese word segmentation and named entity recognition. However, the alias phenomenon of traditional Chinese medicines causes great challenges to Chinese word segmentation and named entity recognition in TCM patents, which directly affects the effect of patent mining. Because of the lack of a comprehensive Chinese herbal medicine name thesaurus, traditional thesaurus-based Chinese word segmentation and named entity recognition are not suitable for medicine identification in TCM patents. In view of the present situation, using the language characteristics and structural characteristics of TCM patent texts, a modified and serialized co-training method to recognize medicine names from TCM patent abstract texts is proposed. Experiments show that this method can maintain high accuracy under relatively low time complexity. In addition, this method can also be expanded to the recognition of other named entities in TCM patents, such as disease names, preparation methods, and so on. Keywords: Annotation, Co-Training, Machine Learning, Medicine Name, Patent Mining, Patent Retrieval, Traditional Chinese Medicine","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73526987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Integrating Feature and Instance Selection Techniques in Opinion Mining
IF 1.2, Zone 4, Computer Science
International Journal of Data Warehousing and Mining Pub Date : 2020-07-01 DOI: 10.4018/ijdwm.2020070109
Zi-Hung You, Ya-Han Hu, Chih-Fong Tsai, Yen-Ming Kuo
{"title":"Integrating Feature and Instance Selection Techniques in Opinion Mining","authors":"Zi-Hung You, Ya-Han Hu, Chih-Fong Tsai, Yen-Ming Kuo","doi":"10.4018/ijdwm.2020070109","DOIUrl":"https://doi.org/10.4018/ijdwm.2020070109","url":null,"abstract":"Opinion mining focuses on extracting polarity information from texts. For textual term representation, different feature selection methods, e.g. term frequency (TF) or term frequency–inverse document frequency (TF–IDF), can yield diverse numbers of text features. In text classification, however, a selected training set may contain noisy documents (or outliers), which can degrade the classification performance. To solve this problem, instance selection can be adopted to filter out unrepresentative training documents. Therefore, this article investigates the opinion mining performance associated with feature and instance selection steps simultaneously. Two combination processes, based on performing feature selection and instance selection in different orders, were compared. Specifically, two feature selection methods, namely TF and TF–IDF, and two instance selection methods, namely DROP3 and IB3, were employed for comparison. The experimental results by using three Twitter datasets to develop sentiment classifiers showed that TF–IDF followed by DROP3 performs the best. Keywords: Feature Selection, Instance Selection, Opinion Mining, Text Classification","PeriodicalId":54963,"journal":{"name":"International Journal of Data Warehousing and Mining","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91046196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
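The two term-weighting schemes compared in this abstract, TF and TF–IDF, can be sketched in a few lines. The exact weighting variant used in the paper is not given in the abstract; the length-normalized TF and the plain `log(N / df)` IDF below are assumptions, as are the function names `tf` and `tf_idf`.

```python
import math
from collections import Counter


def tf(doc_tokens):
    """Term frequency: count of each term divided by document length."""
    counts = Counter(doc_tokens)
    n = len(doc_tokens)
    return {term: c / n for term, c in counts.items()}


def tf_idf(corpus):
    """Weight each term's TF by log(N / df), df = documents containing it."""
    n_docs = len(corpus)
    df = Counter(term for doc in corpus for term in set(doc))
    idf = {term: math.log(n_docs / d) for term, d in df.items()}
    return [{term: w * idf[term] for term, w in tf(doc).items()} for doc in corpus]


if __name__ == "__main__":
    tweets = [["good", "movie"], ["bad", "movie"]]
    for weights in tf_idf(tweets):
        print(weights)
```

A term that occurs in every document, such as "movie" here, gets weight zero under this IDF variant, which is why the two schemes can yield different feature sets.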