2019 IEEE International Conference on Big Knowledge (ICBK): Latest Publications

A Two-Stage Clustering Algorithm Based on Improved K-Means and Density Peak Clustering
2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date: 2019-11-01 DOI: 10.1109/ICBK.2019.00047
Na Xiao, Xu Zhou, Xin Huang, Zhibang Yang
{"title":"A Two-Stage Clustering Algorithm Based on Improved K-Means and Density Peak Clustering","authors":"Na Xiao, Xu Zhou, Xin Huang, Zhibang Yang","doi":"10.1109/ICBK.2019.00047","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00047","url":null,"abstract":"The density peak clustering algorithm (DPC) has been widely concerned by researchers since it was proposed. Its advantage lies in its ability to achieve efficient clustering based on two simple assumptions. In DPC, a key step is to manually select the cluster centers according to the decision graph. The quality of the decision graph determines the quality of the selected cluster centers and the quality of the clustering result. The quality of the decision graph is determined by the parameter dc. Although the authors have proposed an empirical parameter selection method, this method does not work well in many real-world datasets. Therefore, in these data sets, the user needs to repeatedly adjust the parameter multiple times to get a good decision graph. Thus, manually selecting cluster centers is not an easy task. In this paper, combined with the clustering idea of K-means and DPC, we propose a two-stage clustering algorithm KDPC that can automatically acquire the cluster centers. In the first stage, KDPC uses an improved K-means algorithm to obtain high quality cluster centers. In the second stage, KDPC clusters the remaining data points according to the clustering idea of DPC. Experiments show that KDPC can achieve good clustering effect in both artificial data sets and real-world data sets. In addition, compared with DPC, KDPC can show better clustering effect in data sets with significant difference in density of clusters.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114065690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Consensus with Voting Theory in Blockchain Environments
2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date: 2019-11-01 DOI: 10.1109/ICBK.2019.00028
Lei Li, Yongkang Jiang, Guanfeng Liu
{"title":"Consensus with Voting Theory in Blockchain Environments","authors":"Lei Li, Yongkang Jiang, Guanfeng Liu","doi":"10.1109/ICBK.2019.00028","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00028","url":null,"abstract":"A blockchain can be taken as a decentralized and distributed public database. In order to achieve data consistency of the system nodes, the execution of a consensus algorithm is necessary and required in the case of decentralized environments. Simply speaking, the consensus is that every node agrees on some record in the blockchain. There are many kinds of consensus algorithms in blockchain environments, and each consensus algorithm has its own proper application scenario. Here we firstly analysis and compare various popular consensus algorithms in blockchain environments, and then as voting theory has systematically studied the decision-making in a group, the traditional methods of voting theory is summarized and listed, including (Position) scoring rules, Copeland, Maximin, Ranked pairs, Voting trees, Bucklin, Plurality with runoff, Single transferable vote, Baldwin rule, and Nanson rule. Finally, we introduce the voting methods from voting theory to consensus algorithms in the blockchain to improve its performance.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"57 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131671919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
CEPV: A Tree Structure Information Extraction and Visualization Tool for Big Knowledge Graph
2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date: 2019-11-01 DOI: 10.1109/ICBK.2019.00037
Shaojing Sheng, Peng Zhou, Xindong Wu
{"title":"CEPV: A Tree Structure Information Extraction and Visualization Tool for Big Knowledge Graph","authors":"Shaojing Sheng, Peng Zhou, Xindong Wu","doi":"10.1109/ICBK.2019.00037","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00037","url":null,"abstract":"A large amount of data with rich semantic and structural information has been accumulated in many real-world applications. In order to effectively describe the concepts and connections in these data sets, knowledge graph was proposed as a tool to handle it. The genealogy is a typical tree structure data and can be stored in the knowledge graph. However, due to the complexity and increasing volume of the data, how to efficiently extract and visualize the customized information from the big knowledge graph is hence a challenge and worthy of in-depth study. Motivated by this, we propose a novel user-specified information extraction and visualization tool, named CEPV (the Customized information Extracting, Processing and Visualization tool), for converting the big graph structure data into a specified tree structure display. The main steps of CEPV are as follows: firstly, according to the requirements of users, extracting the specified data from the massive, complex, heterogeneous data as fewer times as possible, which can reduce the frequency of database access and improve the overall efficiency of the algorithm. Secondly, the fault tolerance mechanism and attribute judgment rules are executed to ensure the correctness during the data processing. Finally, the processed data with a complex relationship is presented to the user in multiple visualization models. The high availability and effectiveness of our proposed tool is verified on a big knowledge graph dataset.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129687876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
An Efficient Application Traffic Signature Generation System
2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date: 2019-11-01 DOI: 10.1109/ICBK.2019.00053
Yuanming Zhang, Ting Han, Zelin Hao, Yu Cao, Jing Tao
{"title":"An Efficient Application Traffic Signature Generation System","authors":"Yuanming Zhang, Ting Han, Zelin Hao, Yu Cao, Jing Tao","doi":"10.1109/ICBK.2019.00053","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00053","url":null,"abstract":"Application traffic signatures are byte subsequences or behaviors (such as packet sizes and interval times) within traffic that can distinguish which application is contributing to the network traffic, application traffic signatures form the building blocks of many constructions of deep packet analysis rules in numerous areas, such as network management, measurement, and even security systems. Under the pressure of the continual appearance of new applications and their frequent updates, how to efficiently and accurately extract signatures from network traffic becomes a more challenging issue. Although several generating methods have been proposed, because of the problems of efficiency, robustness, and refinement, the application of these methods in real network environments still has limitations. Existing CS (Common Subsequence) based approaches are ineffective in generating signatures from network traffic, especially when the network traffic is massive. In this paper, we propose ESGS, an efficient system to extracts signatures from application traffic traces. ESGS base on the Latent Dirichlet Allocation (LDA) and a modified sequence pattern algorithm. First, we use a semantic analysis algorithm based on the LDA to select the candidate packet from the traffic traces according to the semantic information of the packet and refine the traffic traces. Then, we use a modified sequence pattern algorithm to generate signatures in the filtered traffic trace. We compare ESGS with several existing generating methods via evaluation on real-world application traffic traces. The result shows that ESGS can generate application traffic signatures significantly faster, and the signatures perform high accuracy. In addition, this method can effectively reduce the input traffic of signature generation systems such as Sigbox, and significantly improve the efficiency of signature generation while having a little impact on accuracy.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125387401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Three-Dimensional Convolutional-Recurrent Network for Convective Storm Nowcasting
2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date: 2019-10-01 DOI: 10.1109/ICBK.2019.00052
W. Zhang, Wei Li, Lei Han
{"title":"A Three-Dimensional Convolutional-Recurrent Network for Convective Storm Nowcasting","authors":"W. Zhang, Wei Li, Lei Han","doi":"10.1109/ICBK.2019.00052","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00052","url":null,"abstract":"Very short-term convective storm forecasting, termed nowcasting, has long been an important issue and has attracted substantial interest. Existing nowcasting methods rely principally on radar images and are limited in terms of nowcasting storm initiation and growth. Real-time re-analysis of meteorological data supplied by numerical models provides valuable information about three-dimensional (3D), atmospheric, boundary layer thermal dynamics, such as temperature and wind. To mine such data, we here develop a convolution-recurrent, hybrid deep-learning method with the following characteristics: (1) the use of cell-based oversampling to increase the number of training samples; this mitigates the class imbalance issue; (2) the use of both raw 3D radar data and 3D meteorological data re-analyzed via multi-source 3D convolution without any need for handcraft feature engineering; and (3) the stacking of convolutional neural networks on a long short-term memory encoder/decoder that learns the spatiotemporal patterns of convective processes. Experimental results demonstrated that our method performs better than other extrapolation methods. Qualitative analysis yielded encouraging nowcasting results.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124466322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Tensor-Train Parameterization for Ultra Dimensionality Reduction
2019 IEEE International Conference on Big Knowledge (ICBK) Pub Date: 2019-08-14 DOI: 10.1109/ICBK.2019.00011
Mingyuan Bai, S. Choy, Xin Song, Junbin Gao
{"title":"Tensor-Train Parameterization for Ultra Dimensionality Reduction","authors":"Mingyuan Bai, S. Choy, Xin Song, Junbin Gao","doi":"10.1109/ICBK.2019.00011","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00011","url":null,"abstract":"Dimensionality reduction is a conventional yet crucial field in machine learning. In dimensionality reduction, locality preserving projections (LPP) are a vital method designed to avoid the sensitivity to outliers based on data graph information. However, in terms of the extreme outliers, the performance of LPP is still largely undermined by them. For the case when the input data are matrices or tensors, LPP can only process them by flattening them into an extensively long vector and thus result in the loss of structural information. Furthermore, the assumption for LPP is that the dimension of data should be smaller than the number of instances. Therefore, for high-dimensional data analysis, LPP is not appropriate. In this case, the tensor-train decomposition comes to the stage and demonstrates the efficiency and effectiveness to capture these spatial relations. In consequence, a tensor-train parameterization for ultra dimensionality reduction (TTPUDR) is proposed in this paper, where the conventional LPP mapping is tensorized through tensor-trains and the objective function in the traditional LPP is substituted with the Frobenius norm instead of the squared Frobenius norm to enhance the robustness of the model. We also utilize the manifold optimization to assist the learning process of the model. We evaluate the performance of TTPUDR on classification problems versus the state-of-the-art methods and the past axiomatic methods and TTPUDR significantly outperforms them.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115880019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0