M. Vazirgiannis, Fragkiskos D. Malliaros, Giannis Nikolentzos
{"title":"GraphRep: Boosting Text Mining, NLP and Information Retrieval with Graphs","authors":"M. Vazirgiannis, Fragkiskos D. Malliaros, Giannis Nikolentzos","doi":"10.1145/3269206.3274273","DOIUrl":"https://doi.org/10.1145/3269206.3274273","url":null,"abstract":"Graphs have been widely used as modeling tools in Natural Language Processing (NLP), Text Mining (TM) and Information Retrieval (IR). Traditionally, the unigram bag-of-words representation is applied; that way, a document is represented as a multiset of its terms, disregarding dependencies between the terms. Although several variants and extensions of this modeling approach have been proposed, the main weakness comes from the underlying term independence assumption; the order of the terms within a document is completely disregarded and any relationship between terms is not taken into account in the final task. To deal with this problem, the research community has explored various representations, and to this direction, graphs constitute a well-developed model for text representation. The goal of this tutorial is to offer a comprehensive presentation of recent methods that rely on graph-based text representations to deal with various tasks in Text Mining, NLP and IR.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"222 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124391317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network-based Receivable Financing","authors":"Ilaria Bordino, Francesco Gullo","doi":"10.1145/3269206.3272017","DOIUrl":"https://doi.org/10.1145/3269206.3272017","url":null,"abstract":"Receivable financing -- the process whereby cash is advanced to firms against receivables their customers have yet to pay -- is a well-established funding source for businesses. In this paper we present a novel, collaborative approach to receivable financing: unlike existing centralized approaches where the financing company acts as a server handling each request individually, our approach employs a network perspective where money flow is triggered among customers themselves. The main contribution of this work is to provide a principled formulation of the network-based receivable-settlement strategy, and show how all algorithmic challenges posed by the design of this strategy are achieved in practice. Extensive experiments on real receivable data attest the effectiveness of the proposed methods.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117165088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yukun Ding, Jinglan Liu, Jinjun Xiong, Meng Jiang, Yiyu Shi
{"title":"Optimizing Boiler Control in Real-Time with Machine Learning for Sustainability","authors":"Yukun Ding, Jinglan Liu, Jinjun Xiong, Meng Jiang, Yiyu Shi","doi":"10.1145/3269206.3272024","DOIUrl":"https://doi.org/10.1145/3269206.3272024","url":null,"abstract":"In coal-fired power plants, it is critical to improve the operational efficiency of boilers for sustainability. In this work, we formulate real-time boiler control as an optimization problem that looks for the best distribution of temperature in different zones and oxygen content from the flue to improve the boiler's stability and energy efficiency. We employ an efficient algorithm by integrating appropriate machine learning and optimization techniques. We obtain a large dataset collected from a real boiler for more than two months from our industry partner, and conduct extensive experiments to demonstrate the effectiveness and efficiency of the proposed algorithm.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"445 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127219341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CEC","authors":"Daniel Deutch, Nave Frost","doi":"10.1615/atoz.c.cec","DOIUrl":"https://doi.org/10.1615/atoz.c.cec","url":null,"abstract":"","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115472146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jin Yao Chin, Kaiqi Zhao, Shafiq R. Joty, Gao Cong
{"title":"ANR","authors":"Jin Yao Chin, Kaiqi Zhao, Shafiq R. Joty, Gao Cong","doi":"10.1145/3269206.3271810","DOIUrl":"https://doi.org/10.1145/3269206.3271810","url":null,"abstract":"Textual reviews, which are readily available on many e-commerce and review websites such as Amazon and Yelp, serve as an invaluable source of information for recommender systems. However, not all parts of the reviews are equally important, and the same choice of words may reflect a different meaning based on its context. In this paper, we propose a novel end-to-end Aspect-based Neural Recommender (ANR) to perform aspect-based representation learning for both users and items via an attention-based component. Furthermore, we model the multi-faceted process behind how users rate items by estimating the aspect-level user and item importance by adapting the neural co-attention mechanism. Our proposed model concurrently addresses several shortcomings of existing recommender systems, and a thorough experimental study on 25 benchmark datasets from Amazon and Yelp shows that ANR significantly outperforms recently proposed state-of-the-art baselines such as DeepCoNN, D-Attn and ALFM.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122711416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CIKM 2018 Co-Located Workshops Summary","authors":"A. Cuzzocrea, F. Bonchi, D. Gunopulos","doi":"10.1145/3269206.3274267","DOIUrl":"https://doi.org/10.1145/3269206.3274267","url":null,"abstract":"This paper provides an overview of the workshops co-located with the 27th ACM International Conference on Information and Knowledge Management (CIKM 2018), held during October 22-26, 2018 in Turin, Italy.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122999329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cheng Cao, Zhengzhang Chen, James Caverlee, L. Tang, Chen Luo, Zhichun Li
{"title":"Behavior-based Community Detection: Application to Host Assessment In Enterprise Information Networks","authors":"Cheng Cao, Zhengzhang Chen, James Caverlee, L. Tang, Chen Luo, Zhichun Li","doi":"10.1145/3269206.3272022","DOIUrl":"https://doi.org/10.1145/3269206.3272022","url":null,"abstract":"Community detection in complex networks is a fundamental problem that attracts much attention across various disciplines. Previous studies have been mostly focusing on external connections between nodes (i.e., topology structure) in the network whereas largely ignoring internal intricacies (i.e., local behavior) of each node. A pair of nodes without any interaction can still share similar internal behaviors. For example, in an enterprise information network, compromised computers controlled by the same intruder often demonstrate similar abnormal behaviors even if they do not connect with each other. In this paper, we study the problem of community detection in enterprise information networks, where large-scale internal events and external events coexist on each host. The discovered host communities, capturing behavioral affinity, can benefit many comparative analysis tasks such as host anomaly assessment. In particular, we propose a novel community detection framework to identify behavior-based host communities in enterprise information networks, purely based on large-scale heterogeneous event data. We continue proposing an efficient method for assessing host's anomaly level by leveraging the detected host communities. Experimental results on enterprise networks demonstrate the effectiveness of our model.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"10 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114047198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hamed Zamani, Mostafa Dehghani, W. Bruce Croft, E. Learned-Miller, J. Kamps
{"title":"From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing","authors":"Hamed Zamani, Mostafa Dehghani, W. Bruce Croft, E. Learned-Miller, J. Kamps","doi":"10.1145/3269206.3271800","DOIUrl":"https://doi.org/10.1145/3269206.3271800","url":null,"abstract":"The availability of massive data and computing power allowing for effective data driven neural approaches is having a major impact on machine learning and information retrieval research, but these models have a basic problem with efficiency. Current neural ranking models are implemented as multistage rankers: for efficiency reasons, the neural model only re-ranks the top ranked documents retrieved by a first-stage efficient ranker in response to a given query. Neural ranking models learn dense representations causing essentially every query term to match every document term, making it highly inefficient or intractable to rank the whole collection. The reliance on a first stage ranker creates a dual problem: First, the interaction and combination effects are not well understood. Second, the first stage ranker serves as a \"gate-keeper\" or filter, effectively blocking the potential of neural models to uncover new relevant documents. In this work, we propose a standalone neural ranking model (SNRM) by introducing a sparsity property to learn a latent sparse representation for each query and document. This representation captures the semantic relationship between the query and documents, but is also sparse enough to enable constructing an inverted index for the whole collection. We parameterize the sparsity of the model to yield a retrieval model as efficient as conventional term based models. Our model gains in efficiency without loss of effectiveness: it not only outperforms the existing term matching baselines, but also performs similarly to the recent re-ranking based neural models with dense representations. Our model can also take advantage of pseudo-relevance feedback for further improvements. More generally, our results demonstrate the importance of sparsity in neural IR models and show that dense representations can be pruned effectively, giving new insights about essential semantic features and their distributions.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114050628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Point Symmetry-based Deep Clustering","authors":"Jose G. Moreno","doi":"10.1145/3269206.3269328","DOIUrl":"https://doi.org/10.1145/3269206.3269328","url":null,"abstract":"Clustering is a central task in unsupervised learning. Recent advances that perform clustering into learned deep features (such as DEC[14], IDEC [6] or VaDe [10]) have shown improvements over classical algorithms, but most of them are based on the Euclidean distance. Moreover, symmetry-based distances have shown to be a powerful tool to distinguish symmetric shapes -- such as circles, ellipses, squares, etc. This paper presents an adaptation of symmetry-based distances into deep clustering algorithms, named SymDEC. Our results show that the proposed strategy outperforms significantly the existing Euclidean-based deep clustering as well as recent symmetry-based algorithms in several of the synthetic symmetric and UCI studied datasets.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114392186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. T. Kwee, Meng-Fen Chiang, Philips Kokoh Prasetyo, Ee-Peng Lim
{"title":"Traffic-Cascade: Mining and Visualizing Lifecycles of Traffic Congestion Events Using Public Bus Trajectories","authors":"A. T. Kwee, Meng-Fen Chiang, Philips Kokoh Prasetyo, Ee-Peng Lim","doi":"10.1145/3269206.3269216","DOIUrl":"https://doi.org/10.1145/3269206.3269216","url":null,"abstract":"As road transportation supports both economic and social activities in developed cities, it is important to maintain smooth traffic on all highways and local roads. Whenever possible, traffic congestions should be detected early and resolved quickly. While existing traffic monitoring dashboard systems have been put in place in many cities, these systems require high-cost vehicle speed monitoring instruments and detect traffic congestion as independent events. There is a lack of low-cost dashboards to inspect and analyze the lifecycle of traffic congestion which is critical in assessing the overall impact of congestion, determining the possible source(s) of congestion and its evolution. In the absence of publicly available sophisticated road sensor data which measures on-road vehicle speed, we make use of publicly available vehicle trajectory data to detect the lifecycle of traffic congestion, also known as congestion cascade. We have developed Traffic-Cascade, a dashboard system to identify traffic congestion events, compile them into congestion cascades, and visualize them on a web dashboard. Traffic-Cascade unveils spatio-temporal insights of the congestion cascades.","PeriodicalId":331886,"journal":{"name":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128481293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}