Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation: Latest Articles

A Comparative Study of Named Entity Recognition for Telugu

Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation Pub Date : 2017-12-08 DOI: 10.1145/3158354.3158358

SaiKiranmai Gorla, N. B. Murthy, Aruna Malapati

Citations: 2

Feature Space of Deep Learning and its Importance: Comparison of Clustering Techniques on the Extended Space of ML-ELM

Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation Pub Date : 2017-12-08 DOI: 10.1145/3158354.3158359

R. Roul, Amit Agarwal

Citations: 3

A Comparison of Automatic Search Query Enhancement Algorithms That Utilise Wikipedia as a Source of A Priori Knowledge

Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation Pub Date : 2017-12-08 DOI: 10.1145/3158354.3158356

Kyle Goslin, M. Hofmann

Abstract: This paper describes the benchmarking and analysis of five Automatic Search Query Enhancement (ASQE) algorithms that utilise Wikipedia as the sole source of a priori knowledge. The contributions of this paper include: 1) a comprehensive review of current ASQE algorithms that utilise Wikipedia as the sole source of a priori knowledge; 2) benchmarking of five existing ASQE algorithms using the TREC-9 Web Topics on the ClueWeb12 data set; and 3) analysis of the benchmarking results to identify the strengths and weaknesses of each algorithm. During the benchmarking process, 2,500 relevance assessments were performed. Results of these tests are analysed using the Average Precision @10 per query and Mean Average Precision @10 per algorithm. From this analysis we show that the scope of a priori knowledge utilised during enhancement and the term weighting methods available from Wikipedia can further aid the ASQE process. Although the approaches taken by the algorithms are still relevant, an over-dependence on the weighting schemes and data sources used can easily impact the results of an ASQE algorithm.

Citations: 1
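
The abstract of the ASQE comparison above reports Average Precision @10 per query and Mean Average Precision @10 per algorithm. As a rough illustration only, the sketch below computes both from binary relevance labels. The function names are hypothetical, and the normalisation (dividing by the number of relevant results found in the top k) is an assumption; the paper may normalise differently, for example by the total number of relevant documents.

```python
from typing import Dict, List


def average_precision_at_k(relevance: List[int], k: int = 10) -> float:
    """AP@k for one ranked result list of binary labels (1 = relevant).

    Assumption: the sum of precision values at relevant ranks is divided
    by the number of relevant results seen in the top k.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(runs: Dict[str, List[int]], k: int = 10) -> float:
    """MAP@k: the mean of AP@k over all queries of one algorithm."""
    if not runs:
        return 0.0
    return sum(average_precision_at_k(r, k) for r in runs.values()) / len(runs)


# Hypothetical relevance judgements for two queries.
example_runs = {"q1": [1, 0, 1, 0], "q2": [0, 1]}
print(mean_average_precision_at_k(example_runs))
```

Keeping the per-query score separate from the per-algorithm mean mirrors the two levels of analysis named in the abstract: AP@10 shows which queries an algorithm handles well, while MAP@10 gives a single figure for comparing algorithms.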

Segmentation of Merged Lines and Script Identification in Handwritten Bilingual Documents

Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation Pub Date : 2017-12-08 DOI: 10.1145/3158354.3158360

Ranjana S. Zinjore, R. Ramteke, Varsha M. Pathak

Citations: 2

Language Identification in Mixed Script

Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation Pub Date : 2017-12-08 DOI: 10.1145/3158354.3158357

Nagesh Bhattu Sristy, N. S. Krishna, B. S. Krishna, V. Ravi

Abstract: The text exchanged in social media conversations is often noisy, with a mixture of stylistic and misspelt variations of original words. Standard NLP techniques applied to such data, such as POS tagging and named entity recognition, suffer because of the noisy nature of the input. Usage of mixed script text is also prevalent among social media users. The current work addresses the identification of language at word level in mixed script scenarios, where all the text is written in Roman script but the words being used by the users are transliterations of original words in a native language into English. The core part of the problem is identifying the language, among a set of languages, by looking at small fragments of text. We propose a two-stage approach for word-level language identification. In the first stage, a mixing language combination is identified by using character n-grams of the sentence. The second stage uses the mixing combination class from the first stage to make the word-level language identification. We further apply Conditional Random Fields (CRF) in the second stage to improve the performance of the word-level language identification. Such simplification is essential; otherwise the number of states of the model will be huge and the resultant model predictions are very noisy. Our methods improve the F-score of word-level language identification by over 10% compared to the baseline.

Citations: 8
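
The two-stage approach described in the abstract above relies on character n-grams of the sentence (stage one) and on the predicted mixing combination class as an input to word-level labelling (stage two). The sketch below shows only what such features might look like. It does not train a classifier or a CRF, and the function names, the boundary markers `^` and `$`, the n-gram sizes, and the class label `"en-hi"` are all illustrative assumptions rather than details taken from the paper.

```python
from collections import Counter
from typing import Dict, List


def char_ngrams(text: str, n_min: int = 1, n_max: int = 3) -> Counter:
    """Count character n-grams per word, with ^ and $ as boundary markers."""
    counts: Counter = Counter()
    for word in text.lower().split():
        padded = f"^{word}$"
        for n in range(n_min, n_max + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    return counts


def word_features(words: List[str], index: int, mix_class: str) -> Dict[str, object]:
    """Features for one word, conditioned on the stage-one class.

    Assumption: the stage-one prediction is passed in as a plain string
    and exposed to the sequence model as one more feature.
    """
    word = words[index].lower()
    feats: Dict[str, object] = {"mix_class": mix_class, "word": word}
    for gram in char_ngrams(word, 2, 3):
        feats[f"ng={gram}"] = True
    # Neighbouring words give the sequence model some context.
    feats["prev"] = words[index - 1].lower() if index > 0 else "<s>"
    feats["next"] = words[index + 1].lower() if index < len(words) - 1 else "</s>"
    return feats


# Hypothetical Romanised sentence and stage-one class label.
sentence = "kya hai".split()
print(word_features(sentence, 0, "en-hi"))
```

Feature dictionaries of this shape are a common input format for CRF toolkits, so each sentence would be passed as a list of such dictionaries, one per word. Restricting the label set to the languages in the predicted mixing class is one way to keep the number of model states small, which is the simplification the abstract argues for.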

Improving Similar Question Retrieval using a Novel Tripartite Neural Network based Approach

Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation Pub Date : 2017-12-08 DOI: 10.1145/3158354.3158355

Anirban Sen, Manjira Sinha, Sandya Mannarswamy

Abstract: The collective intelligence of crowds is distilled in various Community Question Answering (CQA) services such as Quora, Yahoo Answers, and Stack Overflow forums, wherein users share their knowledge, providing both informational and experiential support to other users. As users often search for similar information, the probability is high that, for a new incoming question, a related question-answer pair already exists in the CQA dataset. Therefore, an efficient technique for similar question identification is the need of the hour. While data is not a bottleneck in this scenario, addressing the vocabulary diversity generated by a varied pool of users certainly is. This paper proposes a novel tripartite neural network based approach to the similar question retrieval problem. The network takes as input a triplet consisting of a question-answer pair and a new question, and learns internal representations from the similarities among them. Our approach achieves classification performance of up to 77% on a real-world CQA dataset. We have also compared our method with two other baselines and found that it performs significantly better in handling the problem of vocabulary diversity and 'zero-lexical overlap' among questions.

Citations: 1

Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation

Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation Pub Date : 1900-01-01 DOI: 10.1145/3158354

Citations: 0