Australasian Document Computing Symposium: Latest Publications

Classifying microblogs for disasters
Australasian Document Computing Symposium Pub Date: 2013-12-05 DOI: 10.1145/2537734.2537737
Sarvnaz Karimi, Jie Yin, Cécile Paris
Abstract: Monitoring social media in critical disaster situations can potentially assist emergency and media personnel to deal with events as they unfold, and focus their resources where they are most needed. We address the issue of filtering massive amounts of Twitter data to identify high-value messages related to disasters, and to further classify disaster-related messages into those pertaining to particular disaster types, such as earthquake, flooding, fire, or storm. Unlike post-hoc analysis that most previous studies have done, we focus on building a classification model on past incidents to detect tweets about current incidents. Our experimental results demonstrate the feasibility of using classification methods to identify disaster-related tweets. We analyse the effect of different features in classifying tweets and show that using generic features rather than incident-specific ones leads to better generalisation on the effectiveness of classifying unseen incidents.
Citations: 44
ADCS reaches adulthood: an analysis of the conference and its community over the last eighteen years
Australasian Document Computing Symposium Pub Date: 2013-12-05 DOI: 10.1145/2537734.2537741
B. Koopman, G. Zuccon, Lance De Vine, Aneesha Bakharia, P. Bruza, Laurianne Sitbon, Andrew Gibson
Abstract: How influential is the Australian Document Computing Symposium (ADCS)? What do ADCS articles speak about and who cites them? Who is the ADCS community and how has it evolved? This paper considers eighteen years of ADCS, investigating both the conference and its community. A content analysis of the proceedings uncovers the diversity of topics covered in ADCS and how these have changed over the years. Citation analysis reveals the impact of the papers. The number of authors and where they originate from reveal who has contributed to the conference. Finally, we generate co-author networks which reveal the collaborations within the community. These networks show how clusters of researchers form, the effect geographic location has on collaboration, and how these have evolved over time.
Citations: 0
Quality biased thread retrieval using the voting model
Australasian Document Computing Symposium Pub Date: 2013-12-05 DOI: 10.1145/2537734.2537752
Ameer Tawfik Albaham, N. Salim
Abstract: Thread retrieval is an essential tool in knowledge-based forums. However, forum content quality varies from excellent to mediocre and spam; thus, search methods should find not only relevant threads but also those with high quality content. Some studies have shown that leveraging quality indicators improves thread search. However, these studies ignored the hierarchical and the conversational structures of threads in estimating topical relevance and content quality. In that regard, this paper introduces leveraging message quality indicators in ranking threads. To achieve this, we first use the Voting Model to convert message level quality features into thread level features. We then train a learning to rank method to combine these thread level features. Preliminary results with some features reveal that representing threads as collections of messages is superior to treating them as concatenations of their messages. The results also show the utility of leveraging message content quality as compared to non quality-based methods.
Citations: 7
Merging algorithms for enterprise search
Australasian Document Computing Symposium Pub Date: 2013-12-05 DOI: 10.1145/2537734.2537750
Pengfei Li, Paul Thomas, D. Hawking
Abstract: Effective enterprise search must draw on a number of sources, for example web pages, telephone directories, and databases. Doing this means we need a way to make a single sorted list from results of very different types. Many merging algorithms have been proposed but none have been applied to this realistic application. We report the results of an experiment which simulates heterogeneous enterprise retrieval, in a university setting, and uses multi-grade expert judgements to compare merging algorithms. Merging algorithms considered include several variants of round-robin, several methods proposed by Rasolofo et al. in the Current News Metasearcher, and four novel variations including a learned multi-weight method. We find that the round-robin methods and one of the Rasolofo methods perform significantly worse than others. The GDS_TS method of Rasolofo achieves the highest average NDCG@10 score but the differences between it and the other GDS_methods, local reranking, and the multi-weight method were not significant.
Citations: 9
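For readers unfamiliar with the round-robin baseline named in the abstract above, the following is a minimal sketch of the general idea only. It is not one of the specific variants evaluated in the paper, and the function name `round_robin_merge`, the duplicate handling, and the cut-off `k` are assumptions made for illustration.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking a source that has run out of results


def round_robin_merge(result_lists, k=10):
    """Interleave ranked lists by taking one result from each source in turn.

    Hypothetical helper: sources are visited in the order given, duplicates
    are kept only at their first occurrence, and merging stops at k results.
    """
    merged, seen = [], set()
    for tier in zip_longest(*result_lists, fillvalue=_MISSING):
        for item in tier:
            if item is _MISSING or item in seen:
                continue
            seen.add(item)
            merged.append(item)
            if len(merged) == k:
                return merged
    return merged


if __name__ == "__main__":
    web = ["w1", "w2", "w3"]
    phone = ["p1"]
    database = ["d1", "d2"]
    print(round_robin_merge([web, phone, database], k=5))
```

Round-robin ignores the scores each source assigns, which may be one reason score-aware methods can do better on heterogeneous collections.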
Efficient top-k retrieval with signatures
Australasian Document Computing Symposium Pub Date: 2013-12-05 DOI: 10.1145/2537734.2537742
Timothy Chappell, S. Geva, Anthony N. Nguyen, G. Zuccon
Abstract: This paper describes a new method of indexing and searching large binary signature collections to efficiently find similar signatures, addressing the scalability problem in signature search. Signatures offer efficient computation with acceptable measure of similarity in numerous applications. However, performing a complete search with a given search argument (a signature) requires a Hamming distance calculation against every signature in the collection. This quickly becomes excessive when dealing with large collections, presenting issues of scalability that limit their applicability. Our method efficiently finds similar signatures in very large collections, trading memory use and precision for greatly improved search speed. Experimental results demonstrate that our approach is capable of finding a set of nearest signatures to a given search argument with a high degree of speed and fidelity.
Citations: 10
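The abstract above describes the exhaustive baseline that the paper sets out to improve on: one Hamming distance calculation per signature in the collection. A minimal sketch of that baseline follows; it is not the paper's indexing method, and representing signatures as Python integers, along with the names `hamming` and `top_k_nearest`, is an assumption made for illustration.

```python
import heapq


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def top_k_nearest(query: int, signatures: list, k: int) -> list:
    """Return indices of the k signatures closest to the query.

    Exhaustive scan: every signature is compared against the query, so the
    cost grows linearly with the size of the collection. Ties are broken by
    the lower index.
    """
    return heapq.nsmallest(
        k,
        range(len(signatures)),
        key=lambda i: (hamming(query, signatures[i]), i),
    )


if __name__ == "__main__":
    collection = [0b11110000, 0b11110001, 0b00001111, 0b11110011]
    print(top_k_nearest(0b11110000, collection, k=2))
```

Because every query touches the whole collection, this scan is the scalability problem the abstract refers to.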
Integrated instance- and class-based generative modeling for text classification
Australasian Document Computing Symposium Pub Date: 2013-12-05 DOI: 10.1145/2537734.2537751
Antti Puurula, Sung-Hyon Myaeng
Abstract: Statistical methods for text classification are predominantly based on the paradigm of class-based learning that associates class variables with features, discarding the instances of data after model training. This results in efficient models, but neglects the fine-grained information present in individual documents. Instance-based learning uses this information, but suffers from data sparsity with text data. In this paper, we propose a generative model called Tied Document Mixture (TDM) for extending Multinomial Naive Bayes (MNB) with mixtures of hierarchically smoothed models for documents. Alternatively, TDM can be viewed as a Kernel Density Classifier using class-smoothed Multinomial kernels. TDM is evaluated for classification accuracy on 14 different datasets for multi-label, multi-class and binary-class text classification tasks and compared to instance- and class-based learning baselines. The comparisons to MNB demonstrate a substantial improvement in accuracy as a function of available training documents per class, ranging up to average error reductions of over 26% in sentiment classification and 65% in spam classification. On average TDM is as accurate as the best discriminative classifiers, but retains the linear time complexities of instance-based learning methods, with exact algorithms for both model estimation and inference.
Citations: 10
Choices in batch information retrieval evaluation
Australasian Document Computing Symposium Pub Date: 2013-12-05 DOI: 10.1145/2537734.2537745
Falk Scholer, Alistair Moffat, Paul Thomas
Abstract: Web search tools are used on a daily basis by billions of people. The commercial providers of these services spend large amounts of money measuring their own effectiveness and benchmarking against their competitors; nothing less than their corporate survival is at stake. Techniques for offline or "batch" evaluation of search quality have received considerable attention, spanning ways of constructing relevance judgments; ways of using them to generate numeric scores; and ways of inferring system "superiority" from sets of such scores. Our purpose in this paper is to consider these mechanisms as a chain of inter-dependent activities, in order to explore some of the ramifications of alternative components. By disaggregating the different activities, and asking what the ultimate objective of the measurement process is, we provide new insights into evaluation approaches, and are able to suggest new combinations that might prove fruitful avenues for exploration. Our observations are examined with reference to data collected from a user study covering 34 users undertaking a total of six search tasks each, using two systems of markedly different quality. We hope to encourage broader awareness of the many factors that go into an evaluation of search effectiveness, and of the implications of these choices, and encourage researchers to carefully report all aspects of the evaluation process when describing their system performance experiments.
Citations: 3
Economic models of search
Australasian Document Computing Symposium Pub Date: 2013-12-05 DOI: 10.1145/2537734.2537735
L. Azzopardi
Abstract: Searching is inherently an interactive process usually requiring a number of queries to be submitted and a number of documents to be assessed in order to find the desired amount of relevant information. While numerous models of search have been proposed, they have been largely conceptual in nature providing a descriptive account of the search process. For example, Bates' Berry Picking metaphor aptly describes how information seekers forage for relevant information [4]. However it lacks any predictive or explanatory power. In this talk, I will outline how microeconomic theory can be applied to interactive information retrieval, where the search process can be viewed as a combination of inputs (i.e. queries and assessments) which are used to "produce" output (i.e. relevance). Under this view, it is possible to build models that not only describe the relationship between interaction, cost and gain, but also explain and predict behaviour. During the talk, I will run through a number of examples of how economics can explain different behaviours. For example, why PhD students should search more than their supervisors (using an economic model developed by Cooper [6]), why queries are short [1], why Boolean searchers need to explore more results, and why it is okay to look at the first few results when searching the web [2]. I shall then describe how the cost of different interactions affect search behaviour [3], before extending the current theory to include other variables (such as the time spent on the search result page, the interaction with snippets, etc) to create more sophisticated and realistic models. Essentially, I will argue that by using such models we can:
1. theorise and predict how users will behave when interacting with systems,
2. ascertain how the costs of different interaction will influence search behaviour,
3. understand why particular interaction styles, strategies, techniques are or are not adopted by users, and,
4. determine what interactions and functionalities are worthwhile based on their expected gain and associated costs.
Citations: 7
Graph-based concept weighting for medical information retrieval
Australasian Document Computing Symposium Pub Date: 2012-12-05 DOI: 10.1145/2407085.2407096
B. Koopman, G. Zuccon, P. Bruza, Laurianne Sitbon, Michael Lawley
Abstract: This paper presents a graph-based method to weight medical concepts in documents for the purposes of information retrieval. Medical concepts are extracted from free-text documents using a state-of-the-art technique that maps n-grams to concepts from the SNOMED CT medical ontology. In our graph-based concept representation, concepts are vertices in a graph built from a document, edges represent associations between concepts. This representation naturally captures dependencies between concepts, an important requirement for interpreting medical text, and a feature lacking in bag-of-words representations. We apply existing graph-based term weighting methods to weight medical concepts. Using concepts rather than terms addresses vocabulary mismatch as well as encapsulates terms belonging to a single medical entity into a single concept. In addition, we further extend previous graph-based approaches by injecting domain knowledge that estimates the importance of a concept within the global medical domain. Retrieval experiments on the TREC Medical Records collection show our method outperforms both term and concept baselines. More generally, this work provides a means of integrating background knowledge contained in medical ontologies into data-driven information retrieval approaches.
Citations: 26
Reordering an index to speed query processing without loss of effectiveness
Australasian Document Computing Symposium Pub Date: 2012-12-05 DOI: 10.1145/2407085.2407088
D. Hawking, Timothy Jones
Abstract: Following Long and Suel, we empirically investigate the importance of document order in search engines which rank documents using a combination of dynamic (query-dependent) and static (query-independent) scores, and use document-at-a-time (DAAT) processing. When inverted file postings are in collection order, assigning document numbers in order of descending static score supports lossless early termination while maintaining good compression. Since static scores may not be available until all documents have been gathered and indexed, we build a tool for reordering an existing index and show that it operates in less than 20% of the original indexing time. We note that this additional cost is easily recouped by savings at query processing time. We compare best early-termination points for several different index orders on three enterprise search collections (a whole-of-government index with two very different query sets, and a collection from a UK university). We also present results for the same orders for ClueWeb09-CatB. Our evaluation focuses on finding results likely to be clicked on by users of Web or website search engines (Nav and Key results in the TREC 2011 Web Track judging scheme). The orderings tested are Original, Reverse, Random, and QIE (descending order of static score). For three enterprise search test sets we find that QIE order can achieve close-to-maximal search effectiveness with much lower computational cost than for other orderings. Additionally, reordering has negligible impact on compressed index size for indexes that contain position information. Our results for an artificial query set against the TREC ClueWeb09 Category B collection are much more equivocal and we canvass possible explanations for future investigation.
Citations: 9