{"title":"Estimating Domain-Specific User Expertise for Answer Retrieval in Community Question-Answering Platforms","authors":"Wern Han Lim, Mark James Carman, S. J. Wong","doi":"10.1145/3015022.3015032","DOIUrl":"https://doi.org/10.1145/3015022.3015032","url":null,"abstract":"Community Question-Answering (CQA) platforms leverage the inherent wisdom of the crowd, enabling users to retrieve quality information from domain experts through natural language. An important and challenging task is to identify reliable and trusted experts on large, popular CQA platforms. State-of-the-art graph-based approaches to expertise estimation consider only user-user interactions without taking the relative contribution of individual answers into account, while pairwise-comparison approaches consider only pairs involving the best answerer of each question. This research argues that the user's relative contribution towards solving a question must be accounted for when estimating user expertise, and proposes a content-agnostic measure of user contributions. This measure is incorporated into a competition-based approach for ranking users' question-answering ability. The paper analyses how improvements in user expertise estimation impact applications in expert search and answer quality prediction. Experiments using the Yahoo! Chiebukuro data show encouraging performance improvements and robustness over state-of-the-art approaches.","PeriodicalId":334601,"journal":{"name":"Proceedings of the 21st Australasian Document Computing Symposium","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125353077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Influence of Topic Difficulty, Relevance Level, and Document Ordering on Relevance Judging","authors":"T. T. Damessie, Falk Scholer, J. Culpepper","doi":"10.1145/3015022.3015033","DOIUrl":"https://doi.org/10.1145/3015022.3015033","url":null,"abstract":"Judging the relevance of documents for an information need is an activity that underpins the most widely-used approach in the evaluation of information retrieval systems. In this study we investigate the relationship between how long it takes an assessor to judge document relevance, and three key factors that may influence the judging scenario: the difficulty of the search topic for which relevance is being assessed; the degree to which the documents are relevant to the search topic; and the order in which the documents are presented for judging. Two potential confounding influences on judgment speed are differences in individual reading ability, and the length of the documents being assessed. We therefore propose two measures to investigate the above factors: normalized processing speed (NPS), which adjusts the number of words processed per minute to account for differences in reading speed between judges, and normalized dwell time (NDT), which adjusts the duration that a judge spent reading a document relative to document length. Note that these two measures have different relationships with overall judgment speed: a direct relationship for NPS, and an inverse relationship for NDT. The results of a small-scale user study show a statistically significant relationship between judgment speed and topic difficulty: for easier topics, assessors process more quickly (higher NPS), and spend less time overall (lower NDT). There is also a statistically significant relationship between the level of relevance of the document being assessed and overall judgment speed, with assessors taking less time for non-relevant documents. Finally, our results suggest that the presentation order of documents can also affect overall judgment speed, with assessors spending less time (lower NDT) when documents are presented in relevance order than in docID order. However, these ordering effects are not significant when also accounting for document length variance (NPS).","PeriodicalId":334601,"journal":{"name":"Proceedings of the 21st Australasian Document Computing Symposium","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115331857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Judgment Pool Effects Caused by Query Variations","authors":"Alistair Moffat","doi":"10.1145/3015022.3015025","DOIUrl":"https://doi.org/10.1145/3015022.3015025","url":null,"abstract":"Batch-mode retrieval evaluation relies on suitable relevance judgments being available. Here we explore the implications for pool size of adopting a \"query variations\" approach to collection construction. Using the resources provided as part of the UQV100 collection [Bailey et al., SIGIR 2016] and a total of five different systems, we show that pool size is as much affected by the number of query variations involved as it is by the number of contributing systems, and that systems and users are independent effects. That is, if both system and query variation are to be accommodated in retrieval experimentation, the cost of performing the required judgments compounds.","PeriodicalId":334601,"journal":{"name":"Proceedings of the 21st Australasian Document Computing Symposium","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121942113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Position-Based Method for the Extraction of Financial Information in PDF Documents","authors":"Benoit Potvin, Roger Villemaire, N. Le","doi":"10.1145/3015022.3015024","DOIUrl":"https://doi.org/10.1145/3015022.3015024","url":null,"abstract":"Financial documents are omnipresent and require extensive human effort to extract, validate and export their content. Given the high importance of such data for effective business decisions, the need for accuracy goes beyond any attempt to accelerate the process or save resources. While many methods have been suggested in the literature, the problem of automatically extracting reliable financial data remains difficult to solve in practice, and even more challenging to implement in a real-life context. This difficulty stems from the specific nature of financial text, where relevant information is principally contained in tables of varying formats. Table Extraction (TE) is an essential but difficult step for restructuring data into a usable format by identifying and decomposing table components. In this paper, we present a novel method for extracting financial information by means of two simple heuristics. Our approach is based on the idea that the position of information in unstructured but visually rich documents, as is the case for the Portable Document Format (PDF), is an indicator of semantic relatedness. This solution has been developed in partnership with the Caisse de Depot et Placement du Québec. We present our method and its evaluation on a corpus of 600 financial documents, on which an F-measure of 91% is reached.","PeriodicalId":334601,"journal":{"name":"Proceedings of the 21st Australasian Document Computing Symposium","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128732616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of Retrieval Algorithms for Expertise Search","authors":"Gaya K. Jayasinghe, Sarvnaz Karimi, M. Ayre","doi":"10.1145/3015022.3015035","DOIUrl":"https://doi.org/10.1145/3015022.3015035","url":null,"abstract":"Evaluation of expertise search systems is a non-trivial task. While in a typical search engine the responses to user queries are documents, the search results for an expertise retrieval system are people, with relevance scores indicating how knowledgeable they are on a given topic. Within an organisation, such a ranking of employees can be both difficult and controversial. We introduce an in-house capability search system built for an organisation with a diverse range of disciplines. We report on two attempts at evaluating six different ranking algorithms implemented for this system. Evaluating the system using the relevance judgements produced in each of the two attempts shows how different methods of collecting judgements of people's expertise can lead to different conclusions about algorithm effectiveness.","PeriodicalId":334601,"journal":{"name":"Proceedings of the 21st Australasian Document Computing Symposium","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121101130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}