{"title":"Graph-based Keyphrase Extraction Using Word and Document Embeddings","authors":"Xian Zu, Fei Xie, Xiaojian Liu","doi":"10.1109/ICBK50248.2020.00020","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00020","url":null,"abstract":"With the increasing amount of text data in applications, keyphrase extraction, which aims to extract concise and important information from a document, is receiving more attention. In this paper, we propose a novel graph-based keyphrase extraction method using word and document embedding vectors. Two graph construction schemes, named GKE-w and GKE-p, are designed, in which candidate words and candidate phrases, respectively, are represented as nodes. By calculating the similarity between a word/phrase and the document, each node is assigned an initial weight that reflects its preference to be a keyphrase. Then, we calculate the score of each candidate word/phrase using a semantic biased random walk strategy. Finally, the top-N scored candidate phrases are selected as the final keyphrases. Experiments on two widely used datasets show that the proposed keyphrase extraction algorithm outperforms the state-of-the-art keyphrase extraction methods in terms of precision, recall, and F1 measures.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130172355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ICKG 2020 List Reviewer Page","authors":"Huanhuan Chen","doi":"10.1109/icbk50248.2020.00008","DOIUrl":"https://doi.org/10.1109/icbk50248.2020.00008","url":null,"abstract":"","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125294507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Adaptive Ant Colony Optimization in Knowledge Graphs","authors":"Wei Li, Le Xia, Ying Huang","doi":"10.1109/ICBK50248.2020.00014","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00014","url":null,"abstract":"Knowledge graphs have been widely used in various fields such as question answering systems and recommendation systems. However, there is little research on combinatorial optimization problems based on knowledge graphs, which greatly delays the development of knowledge graphs. Also, when solving combinatorial optimization problems only by using knowledge graphs, it is impossible to obtain better results. To solve these problems, an ant colony optimization algorithm based on an adaptive strategy (AACO) is proposed, and the algorithm is applied to solve the path optimization model established by the knowledge graph. In the vector space based on knowledge graph embedding, the ant colony optimization algorithm has a good positive feedback mechanism and robustness to find effective paths between entity nodes. Experimental results show that the proposed AACO algorithm can accelerate convergence and obtain better accuracy. At the same time, a global optimal solution can be achieved, which is suitable for solving combinatorial optimization problems.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130010351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conversational Music Recommendation based on Bandits","authors":"Chunyi Zhou, Yuanyuan Jin, Xiaoling Wang, Yingjie Zhang","doi":"10.1109/ICBK50248.2020.00016","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00016","url":null,"abstract":"Music is one of the most popular products in recommender systems, and there have been many methods of exploring music recommendations. Traditional music recommendations commonly collect users’ feedback in limited ways for preference analysis. Text dialogue is a direct and natural interactive mode that provides diversified information. In this paper, we discuss music recommendation in an innovative scenario – a conversational music recommendation model, which integrates the advantages of both the recommender system and the dialog system. This paper adopts a “user ask, system respond” interactive way to obtain users’ real-time requirements, and users are allowed to express their requirements on music in free text. To cope with fast-changing music preferences, this paper adopts a bandit-based algorithm to absorb users’ attitudes to the current recommendation, and the results show these methods achieve better performance than baselines. In addition, it constructs a music-domain knowledge graph, with millions of music items and tens of millions of relations, to support users’ richer musical expressions.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127760727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Streaming Feature Selection Based on Feature Interaction","authors":"Yan Lv, Yaojin Lin, Xiangyan Chen, Dongxing Wang, Chenxi Wang","doi":"10.1109/ICBK50248.2020.00017","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00017","url":null,"abstract":"In many big data applications, online streaming feature selection plays a critical role in processing feature streams and dealing with high-dimensional problems. However, traditional online streaming feature selection methods focus on relevant, irrelevant, and/or redundant features and ignore the interaction between features, i.e., an individual feature may be irrelevant to or only weakly correlated with the label, but when it is combined with another irrelevant or weakly correlated feature, the two together are strongly correlated with the label. In this paper, we propose a novel feature selection algorithm that considers feature interaction based on the neighborhood rough set. This algorithm selects features based on the following principles: the discrimination capability of the selected feature subset should be greater than or equal to that of the original feature space, and the number of features in the subset should be as small as possible by using feature interaction. Under this framework, we propose an online significance analysis criterion to select significant features relative to the currently selected features, and design an online redundancy analysis criterion to retain highly interactive features and filter out redundant features. Experimental results on a series of benchmark datasets show that the proposed algorithm significantly outperforms other state-of-the-art online streaming feature selection methods.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126580923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Imbalanced Learning for Hospital Readmission Prediction using National Readmission Database","authors":"Shuwen Wang, Magdalyn E. Elkin, Xingquan Zhu","doi":"10.1109/ICBK50248.2020.00026","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00026","url":null,"abstract":"In this paper, we propose to use imbalanced learning for hospital readmission prediction. The goal is to predict whether a patient, based on his/her current hospital visit records, is likely to be readmitted within 30 days after being discharged from the current hospital visit. The main challenge of hospital readmission prediction is twofold: (1) the readmission visits (i.e., the positive class) are a small portion of the total hospital visits, representing a severe class imbalance problem for learning; (2) due to privacy and health regulation, the information available for patient characterization is limited, and is often restricted to payment-level information. However, there are over 80,000 procedure codes, representing a high dimensionality and high sparsity problem for learning. Motivated by the above challenges, in this paper, we design an imbalanced learning strategy to create features from patient hospital visits, by combining patient demographic information, ICD-10 clinical modification (CM) and procedure codes (PCS), and Clinical Classification Software Refined (CCSR) conversion. Instead of directly using ICD-10-CM/PCS codes to characterize patients, we convert each patient’s visit to the CCSR code space, which has a smaller feature space. By using a random sampling approach to balance the sample distributions in the training set, our method achieves good performance in predicting patient readmission.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125778188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subject Event Extraction from Chinese Court Verdict Case via Frame-filling","authors":"Gongqing Wu, Shengjie Hu, Yinghuan Wang, Zan Zhang, Xianyu Bao","doi":"10.1109/ICBK50248.2020.00012","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00012","url":null,"abstract":"At present, the query and acquisition of the fragmented knowledge in Chinese court verdicts mainly adopt the class case retrieval method based on the search engine and the rough extraction method for a part of the data in court verdicts. These traditional methods cannot structurally extract fragmented knowledge in Chinese court verdicts or meet people’s needs for the follow-up analysis of court verdicts. Thus, in this paper, we present a structured subject event extraction method (SEE) for Chinese court verdict cases that combines the techniques of event extraction (EE) and attribute-value pair extraction (AVPE). Specifically, we provide a subject event representation frame for organizing fragmented knowledge in Chinese court verdict cases. Then, we extract subject events from the unstructured cases based on the trained sequence labeling models and constructed heuristic rules, and fill them into the subject event representation frame in the form of attribute-value pairs (AVPs). The experimental results show that SEE can efficiently and automatically extract subject events from Chinese court verdict cases and visually display them via frame-filling, which improves people’s efficiency in searching for legal materials and facilitates further research and analysis.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126012645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cyber Security Meets Big Knowledge: Towards a Secure HACE Theorem","authors":"B. Thuraisingham","doi":"10.1109/ICBK50248.2020.00010","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00010","url":null,"abstract":"The HACE Theorem has emerged as a way to characterize big data. Over the years it has become as fundamental to big data characterization as Newton’s laws are to physics. Associated with the HACE theorem is the Big Data Processing Framework for storing, managing, analyzing and sharing massive amounts of heterogeneous, autonomous and distributed data with complex and evolving relationships. This paper examines the security and privacy aspects of the HACE theorem. It argues that what is needed is a Policy-Aware Big Data Processing Framework for the collection, storage, management, mining, and sharing of massive amounts of data. It also examines knowledge graphs to represent the big data and determines ways to reason about the graphs and yet maintain security and privacy.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126986246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Robust and Domain-Adaptive Approach for Low-Resource Named Entity Recognition","authors":"Houjin Yu, Xian-Ling Mao, Zewen Chi, Wei Wei, Heyan Huang","doi":"10.1109/ICBK50248.2020.00050","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00050","url":null,"abstract":"Recently, building reliable named entity recognition (NER) systems using limited annotated data has attracted much attention. Nearly all existing works rely heavily on domain-specific resources, such as external lexicons and knowledge bases. However, such domain-specific resources are often not available, and they are difficult and expensive to construct, which has become a key obstacle to wider adoption. To tackle the problem, in this work, we propose a novel robust and domain-adaptive approach, RDANER, for low-resource NER, which only uses cheap and easily obtainable resources. Extensive experiments on three benchmark datasets demonstrate that our approach achieves the best performance when only using cheap and easily obtainable resources, and delivers competitive results against state-of-the-art methods that use hard-to-obtain domain-specific resources. All our code and corpora can be found on https://github.com/houking-can/RDANER.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131794788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clustering via Meta-path Embedding for Heterogeneous Information Networks","authors":"Yongjun Zhang, Xiaoping Yang, Liang Wang","doi":"10.1109/ICBK50248.2020.00036","DOIUrl":"https://doi.org/10.1109/ICBK50248.2020.00036","url":null,"abstract":"A low-dimensional embedding of multiple nodes is very convenient for clustering, which is one of the most fundamental tasks for heterogeneous information networks (HINs). On the other hand, random walk-based network embedding has been proved to be equivalent to matrix factorization, whose computational cost is very high. Moreover, mapping different types of nodes into one metric space may result in incompatibility. To cope with the two challenges above, a meta-path embedding based clustering method (called MPEClus) is proposed in this paper. First, the original network is transformed into several subnetworks with independent semantics specified by meta-paths to solve the incompatibility problem. Second, an approximate commute embedding method, which bypasses eigen-decomposition to reduce computational cost, is leveraged for the representation learning of the nodes in each subnetwork. Finally, a unified probabilistic generation model is designed to aggregate the vectorized representations learned in different metric spaces for clustering. Experimental results show that MPEClus is effective in HIN clustering and outperforms the state-of-the-art baselines on two real-world datasets.","PeriodicalId":432857,"journal":{"name":"2020 IEEE International Conference on Knowledge Graph (ICKG)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115844318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}