{"title":"Quantum Language Model-based Query Expansion","authors":"Qiuchi Li, M. Melucci, P. Tiwari","doi":"10.1145/3234944.3234970","DOIUrl":"https://doi.org/10.1145/3234944.3234970","url":null,"abstract":"The analogy between words, documents and queries and the Quantum Mechanics (QM) concepts gives rise to various quantum-inspired Information Retrieval (IR) models. As one of the most successful applications among them, Quantum Language Model (QLM) achieves superior performances compared to various classical models on ad-hoc retrieval tasks. However, the EM-based estimation strategy for QLM is limited in that it cannot efficiently converge to global optimum. As a result, subsequent QLM-based models are more or less restricted to a limited vocabulary. In order to ease this limitation, this study investigates a query expansion framework on the QLM basis. Essentially, the additional terms are selected from the constructed QLM of top-K returned documents in the initial ranking, and a re-ranking is conducted on the expanded query to generate the final ranks. Experiments on TREC 2013 and 2014 session track datasets demonstrate the effectiveness of our model.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130695595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Binary Classification Model Inspired from Quantum Detection Theory","authors":"E. D. Buccio, Qiuchi Li, M. Melucci, P. Tiwari","doi":"10.1145/3234944.3234979","DOIUrl":"https://doi.org/10.1145/3234944.3234979","url":null,"abstract":"Despite its long history, classification is still a subject of extensive research because new application domains require more effective algorithms than the state-of-the-art classification algorithms, which rely on the logical theory of sets, the theory of probability and the algebra of vector spaces. The combination of distinct theoretical frameworks may be the key to making an important step forward toward a stable and significant improvement in classification effectiveness and, to the same extent improved Quantum Mechanics (QM) signal detection. QM may give rise to a new theoretical framework for classification, since it essentially moves the optimal bound of effectiveness beyond the levels made possible by the state-of-the-art classification algorithms. In this paper, we propose a binary classification model inspired by quantum detection theory in an effort to investigate how much benefit it brings as compared to classical models. Our experiments suggest that the improvement in classification effectiveness can be obtained, although the potential of quantum detection can only be partially exploited.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130177313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Entire Information Attentive GRU for Text Representation","authors":"Guoxiu He, Wei Lu","doi":"10.1145/3234944.3234947","DOIUrl":"https://doi.org/10.1145/3234944.3234947","url":null,"abstract":"Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), have been widely utilized in sequence representation. However, RNNs neglect variational information and long-term dependency. In this paper, we propose a new neural network structure for extracting a comprehension sequence embedding by handling the entire representation of the sequence. Unlike previous works that put attention mechanism after all steps of GRU, we add the entire representation to the input of the GRU which means the GRU model takes the entire information of the sequence into consideration in every step. We provide three various strategies to adding the entire information which are the Convolutional Neural Network (CNN) based attentive GRU (CBAG), the GRU inner attentive GRU (GIAG) and the pre-trained GRU inner attentive GRU (Pre-GIAG). To evaluate our proposed methods, we conduct extensive experiments on a benchmark sentiment classification dataset. Our experimental results show that our models outperform state-of-the-art baselines significantly.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123216868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying Community QA Questions That Contain an Image","authors":"Kenta Tamaki, Riku Togashi, Sosuke Kato, Sumio Fujita, Hideyuki Maeda, T. Sakai","doi":"10.1145/3234944.3234948","DOIUrl":"https://doi.org/10.1145/3234944.3234948","url":null,"abstract":"We consider the problem of automatically assigning a category to a given question posted to a Community Question Answering (CQA) site, where the question contains not only text but also an image. For example, CQA users may post a photograph of a dress and ask the community \"Is this appropriate for a wedding?\" where the appropriate category for this question might be \"Manners, Ceremonial occasions.\" We tackle this problem using Convolutional Neural Networks with a DualNet architecture for combining the image and text representations. Our experiments with real data from Yahoo Chiebukuro and crowdsourced gold-standard categories show that the DualNet approach outperforms a text-only baseline ($p=.0000$), a sum-and-product baseline ($p=.0000$), Multimodal Compact Bilinear pooling ($p=.0000$), and a combination of sum-and-product and MCB ($p=.0000$), where the p-values are based on a randomised Tukey Honestly Significant Difference test with $B = 5000$ trials.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130561678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Vertical PRF Architecture for Microblog Search","authors":"Flávio Martins, João Magalhães, Jamie Callan","doi":"10.1145/3234944.3234960","DOIUrl":"https://doi.org/10.1145/3234944.3234960","url":null,"abstract":"In microblog retrieval, query expansion can be essential to obtain good search results due to the short size of queries and posts. Since information in microblogs is highly dynamic, an up-to-date index coupled with pseudo-relevance feedback (PRF) with an external corpus has a higher chance of retrieving more relevant documents and improving ranking. In this paper, we focus on the research question: how can we reduce the query expansion computational cost while maintaining the same retrieval precision as standard PRF? Therefore, we propose to accelerate the query expansion step of pseudo-relevance feedback. The hypothesis is that using an expansion corpus organized into verticals for expanding the query, will lead to a more efficient query expansion process and improved retrieval effectiveness. Thus, the proposed query expansion method uses a distributed search architecture and resource selection algorithms to provide an efficient query expansion process. Experiments on the TREC Microblog datasets show that the proposed approach can match or outperform standard PRF in MAP and NDCG@30, with a computational cost that is three orders of magnitude lower.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131741725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical Study of Multi-level Convolution Models for IR Based on Representations and Interactions","authors":"Yifan Nie, Yanling Li, Jian-Yun Nie","doi":"10.1145/3234944.3234954","DOIUrl":"https://doi.org/10.1145/3234944.3234954","url":null,"abstract":"Deep learning models have been employed to perform IR tasks and have shown competitive results. Depending on the structure of the models, previous deep IR models could be roughly divided into: representation-based models and interaction-based models. A number of experiments have been conducted to test these models, but often under different conditions, making it difficult to draw a clear conclusion on their comparison. In order to compare the two learning schemas for ad hoc search under the same condition, we build similar convolution networks to learn either representations or interaction patterns between document and query and test them on the same test collection. In addition, we also propose multi-level matching models to cope with various types of query, rather than the existing single-level matching. Our experiments show that interaction-based approach generally performs better than representation-based approach, and multi-level matching performs better than single-level matching. We will provide some possible explanations to these observations.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127258396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Extended Query Performance Prediction Framework Utilizing Passage-Level Information","authors":"Haggai Roitman","doi":"10.1145/3234944.3234946","DOIUrl":"https://doi.org/10.1145/3234944.3234946","url":null,"abstract":"We show that document-level post-retrieval query performance prediction (QPP) methods are mostly suited for short query prediction tasks; such methods perform significantly worse in verbose (long and informative) query prediction settings. To address the prediction quality gap among query lengths, we propose a novel passage-level post-retrieval QPP framework. Our empirical analysis demonstrates that, those QPP methods that utilize passage-level information are much better suited for verbose QPP settings. Moreover, our proposed predictors, which utilize both document-level and passage-level information provide a more robust prediction which is less sensitive to query length.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"191 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122599095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Product Question Intent Detection using Indicative Clause Attention and Adversarial Learning","authors":"Qian Yu, Wai Lam","doi":"10.1145/3234944.3234961","DOIUrl":"https://doi.org/10.1145/3234944.3234961","url":null,"abstract":"Due to the provision of QA service in many E-commerce sites, product question understanding becomes important. Product questions have different characteristics from traditional questions in that they are long and verbose as well as associated with different intents unique for the E-commerce setting. We conduct a thorough investigation on product questions covering different product categories from some commercial E-commerce sites. A set of question intent classes suitable for the E-commerce setting are identified. We also investigate the challenges of automatic intent detection and develop an intent detection framework based on a tailor-made deep neural model. The first characteristic of our framework is that it is capable of coping with long and verbose questions via identifying the indicative clauses. The second characteristic is that an adversarial learning algorithm is designed making use of an auxiliary classifier for avoiding the interference of product aspects with question intent detection quality. Extensive experiment results demonstrate the effectiveness of the proposed framework.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125506124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding the Representational Power of Neural Retrieval Models Using NLP Tasks","authors":"Daniel Cohen, Brendan T. O'Connor, W. Bruce Croft","doi":"10.1145/3234944.3234959","DOIUrl":"https://doi.org/10.1145/3234944.3234959","url":null,"abstract":"The ease of constructing effective neural networks has resulted in a large number of varying architectures iteratively improving performance on a task. Due to the nature of these models being black boxes, standard weight inspection is difficult. We propose a probe based methodology to evaluate what information is important or extraneous at each level of a network. We input natural language processing datasets into a trained answer passage neural network. Each layer of the neural network is used as input into a unique classifier, or probe, to correctly label that input with respect to a natural language processing task, probing the internal representations for information. Using this approach, we analyze the information relevant to retrieving answer passages from the perspective of information needed for part of speech tagging, named entity retrieval, sentiment classification, and textual entailment. We show a significant information need difference between two seemingly similar question answering collections, and demonstrate that passage retrieval and textual entailment share a common information space, while POS and NER information is used only at a compositional level in the lower layers of an information retrieval model. Lastly, we demonstrate that incorporating this information into a multitask environment is correlated to the information retained by these models during the probe inspection phase.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129384086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What Will Search Engines be Changed by NLP Advancements","authors":"M. Zhou","doi":"10.1145/3234944.3241521","DOIUrl":"https://doi.org/10.1145/3234944.3241521","url":null,"abstract":"I think that the vision of a search engine is \"Natural Search\" with which users input his or her search intent in a natural way such as using natural language or an image and immediately obtains the desired accurate information which is concisely and comprehensibly expressed. During this process, NLP is undoubtedly one of the most crucial technologies. In the past, the search engine uses limited and shallow NLP technologies because NLP technology is not as mature as people have expected. In recent years, we have witnessed that NLP has made huge advances in various tasks such as semantic parser, question-answering, machine translation, machine reading comprehension and text generation. I think that now it is the time to consider applying these new technologies to a search engine to further improve the intelligence and naturalness of the search process. It is necessary to understand the new progress of NLP and their potential impact to a search engine. In this talk, I first provide an overview of advancements of methodology and technology in NLP field in recent years. Then I will share my thoughts about the promising change of search engines brought by these new NLP technologies. I will further elaborate my thoughts on changing search engine with a set of intelligent question-answering techniques comprising semantic parser, question-answering and machine reading comprehension. Although these new promising NLP have rapidly brought meaningful change to a search engine, there are still many problems unsolved. As a conclusion, a list of the challenging topics will be proposed with my initial thoughts of solutions.","PeriodicalId":193631,"journal":{"name":"Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131842254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}