RIAO Conference: Latest Publications

Using Prior Information Derived from Citations in Literature Search
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931454
E. Meij, M. de Rijke
Abstract: Researchers spend a large amount of their time searching through an ever increasing number of scientific articles. Although users of scientific literature search engines prefer the ranking of results according to the number of citations a publication has received, it is unknown whether this notion of authoritativeness could also benefit more traditional and objective measures. Is it also an indicator of relevance, given an information need? In this paper, we examine the relationship between citation features of a scientific article and its prior probability of actually being relevant to an information need. We propose various ways of modeling this relationship and show how this kind of contextual information can be incorporated within a language modeling framework. We experiment with three document priors, which we evaluate on three distinct sets of queries and two document collections from the TREC Genomics track. Empirical results show that two of the proposed priors can significantly improve retrieval effectiveness, measured in terms of mean average precision.
Citations: 22
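The idea of a document prior inside a language modeling framework, as mentioned in the Meij and de Rijke abstract above, can be illustrated with a minimal sketch. It ranks documents by log P(D) + log P(Q|D), where P(Q|D) is a Dirichlet-smoothed query likelihood. The specific prior used here (proportional to citation count plus one) and all function names are assumptions made for illustration; the abstract does not state which three priors the authors evaluated.

```python
import math
from collections import Counter


def log_query_likelihood(query, doc, coll_tf, coll_len, mu=2000.0):
    """Return log P(Q|D) using Dirichlet-smoothed unigram probabilities."""
    tf = Counter(doc)
    total = 0.0
    for term in query:
        p_coll = coll_tf.get(term, 0) / coll_len
        p = (tf[term] + mu * p_coll) / (len(doc) + mu)
        if p == 0.0:
            # Term unseen in the whole collection: the document cannot generate it.
            return float("-inf")
        total += math.log(p)
    return total


def log_citation_prior(citations, all_citations):
    """Hypothetical prior: P(D) proportional to (citation count + 1)."""
    return math.log((citations + 1) / (sum(all_citations) + len(all_citations)))


def rank(query, docs, citations):
    """Rank document indices by log P(D) + log P(Q|D), best first."""
    coll_tf = Counter(term for doc in docs for term in doc)
    coll_len = sum(coll_tf.values())
    scores = [
        log_citation_prior(c, citations)
        + log_query_likelihood(query, doc, coll_tf, coll_len)
        for doc, c in zip(docs, citations)
    ]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

With two documents of identical text, the query likelihoods tie and the prior alone decides the order, which makes the effect of the citation prior easy to see in isolation.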
Using the Knowledge of Object Colors to Segment Images and Improve Web Image Search
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931443
C. Millet, I. Bloch
Abstract: With web image search engines, we face a situation where the results are very noisy, and when we ask for a specific object, we are not ensured that this object is contained in all the images returned by the search engines: about 50% of the images returned are off-topic. In this paper, we explain how knowing the color of an object can help locating the object in images, and we also propose methods to automatically find the color of an object, so that the whole process can be fully automatic. Results reveal that this method allows us to reduce the noise in returned images while providing automatic segmentation so that it can be used for clustering or object learning.
Citations: 7
Query Refinement based on Topical Term Clustering
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931405
Hiromi Wakaki, Tomonari Masada, A. Takasu, J. Adachi
Abstract: We propose a method for supporting query refinement using topical term clusters. First, we propose a new term weighting method that can extract terms strongly related to a specific topic, because a document set retrieved with an ambiguous query may include divergent topics. Our formulation of term weighting is based on the statistics of term co-occurrence. Then, we generate term clusters using extracted terms, and rerank the documents in the search results by using each term cluster as a query. This clustering procedure is intended to isolate each topic as a set of related terms. In our experiments, we evaluated our term weighting method by checking: 1) whether each of the top-ranked document sets corresponds to one topic; and 2) whether some of the top-ranked document sets cover all the topics included in the synthesized document set. The results of our experiment show our method outperforms the existing term weighting methods MI, KLD, CHI-square and RSV.
Citations: 0
Summarizing non-textual events with 'Briefing' focus
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931411
Mohit Kumar, Dipanjan Das, Alexander I. Rudnicky
Abstract: We describe a learning-based system for generating reports based on a mix of text and event data. The system incorporates several stages of processing, including aggregation, template-filling and importance ranking. Aggregators and templates were based on a corpus of reports evaluated by human judges. Importance and granularity were learned from this corpus as well. We find that high-scoring reports (with a recall of 0.89) can be reliably produced using this procedure given a set of oracle features. The report drafting system is part of a learning cognitive assistant RADAR, and is used to describe its performance.
Citations: 10
Discovering Missing Values in Semi-Structured Databases
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931456
Xing Yi, James Allan, V. Lavrenko
Abstract: We explore the problem of discovering multiple missing values in a semi-structured database. For this task, we formally develop Structured Relevance Model (SRM) built on one hypothetical generative model for semi-structured records. SRM is based on the idea that plausible values for a given field could be inferred from the context provided by the other fields in the record. Small-scale experiments on IMDb (Internet Movie Database) show that SRM matched three state-of-the-art relational learning approaches on the movie label prediction tasks. Large-scale experiments on a snapshot of the National Science Digital Library (NSDL) repository show that SRM is highly effective at discovering possible values for free-text fields even with quite modest amounts of training data, compared with state-of-the-art machine learning approaches.
Citations: 5
Structured Audio Player: Supporting Radio Archive Workflows with Automatically Generated Structure Metadata
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931416
M. Larson, J. Köhler
Abstract: Although techniques to automatically generate metadata have been steadily refined over the past decade, archive professionals at radio broadcasters continue to use conventional audio players in order to screen and annotate radio material. In order to facilitate technology transfer, the archives departments of two large German radio broadcasters, Deutsche Welle and WDR, commissioned Fraunhofer IAIS to develop a prototype audio archive and to investigate the practical aspects of integrating automatically generated metadata into their existing workflows. The project identified the structuring of radio programs as the area in which automatically generated metadata has the clearest potential to support the work of archive staff. This paper discusses the development and performance of the structured audio player, the component of the audio archive system that demonstrates this potential. The automatically generated structured metadata includes speaker boundaries, speaker IDs, speaker gender and identification of audio segments not containing speech. In contrast to similar systems, our prototype was designed, developed and optimized in a project group composed of both archive professionals and multimedia researchers. As a result, important insights were gained into how automatically generated metadata should (and should not) be deployed to support the work of archivists preparing radio content for archival.
Citations: 5
Cross-Media Entity Recognition in Nearly Parallel Visual and Textual Documents
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931404
K. Deschacht, Marie-Francine Moens, Wouter Robeyns
Abstract: We present a novel approach to automatically annotate images solely using associated text. We detect and classify all entities (persons and objects) in the text after which we determine the salience (the importance of an entity in a text) and visualness (the extent to which an entity can be perceived visually) of these entities. We combine these measures to compute the probability that an entity is present in the image. The suitability of our approach was successfully tested on 900 image-text pairs of Yahoo! News.
Citations: 5
Modeling Information Scent: A Comparison of LSA, PMI and GLSA Similarity Measures on Common Tests and Corpora
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931422
R. Budiu, Christiaan Royer, P. Pirolli
Abstract: In this paper we describe a comparison among three systems that estimate semantic similarity between words: Latent Semantic Analysis (Landauer & Dumais, 1997), Pointwise Mutual Information (Turney, 2001), and Generalized Latent Semantic Analysis (Matveeva, Levow, Farahat, & Royer, 2005). We compare all these techniques on a unique corpus (TASA) and, for PMI and GLSA, we also report performance on a larger web-based corpus. The evaluation is carried out through two kinds of tests: (1) synonymy tests, and (2) comparison with human word similarity judgments. The results indicate that for large corpora PMI works best on word similarity tests, and GLSA on synonymy tests. For the smaller TASA corpus, GLSA produced the best performance on most tests. A large corpus improved the performance of PMI, but, in most cases, did not improve that of GLSA.
Citations: 53
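As a rough illustration of one of the three measures compared in the Budiu, Royer and Pirolli abstract above, the sketch below computes Pointwise Mutual Information, PMI(a, b) = log2( P(a, b) / (P(a) P(b)) ), from document-level co-occurrence counts. Counting co-occurrence per document and the function name `pmi_table` are assumptions for this example; the abstract does not say which co-occurrence window or estimation details the authors used.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return PMI (base 2) for every word pair that co-occurs in some document.

    Probabilities are estimated as document frequencies divided by the
    number of documents. Pairs are keyed in alphabetical order.
    """
    n_docs = len(documents)
    word_df = Counter()
    pair_df = Counter()
    for doc in documents:
        vocab = sorted(set(doc))  # count each word once per document
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    return {
        (a, b): math.log2(
            (n_ab / n_docs) / ((word_df[a] / n_docs) * (word_df[b] / n_docs))
        )
        for (a, b), n_ab in pair_df.items()
    }
```

A positive score means the two words appear together more often than independence would predict; a negative score means less often. Pairs that never co-occur are simply absent from the returned table.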
Using a Content-and-Structure Oriented Method for Relevance Feedback in XML Retrieval
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931460
L. Hlaoua, M. Boughanem, K. Pinel-Sauvagnat
Abstract: As opposed to traditional Information Retrieval (IR) which views whole documents as atomic units of retrieval, XML IR processes XML elements as possible units of retrieval. Many open issues appear when considering Relevance Feedback (RF) in XML documents. They are mainly related to the form of XML documents that mix content and structure and to the new granularity of information processed by the Information Retrieval Systems (IRS). Most of the RF approaches proposed in XML retrieval are simple adaptations of traditional RF to the new granularity of information. They enrich queries by adding terms extracted from relevant elements instead of terms extracted from whole documents. In this paper, we propose to extend the initial query by adding both content and structural constraints. Experiments are carried out with the INEX evaluation campaign and results show the interest of our method.
Citations: 1
Combining linguistic indexes to improve the performances of information retrieval systems: a machine learning based solution
RIAO Conference Pub Date : 2007-05-30 DOI: 10.5555/1931390.1931427
Fabienne Moreau, V. Claveau, P. Sébillot
Abstract: Taking into account in one same information retrieval system several linguistic indexes encoding morphological, syntactic, and semantic information seems a good idea to better grasp the semantic contents of large unstructured text collections and thus to increase performances of such a system. Therefore the problem raised is of knowing how to automatically and efficiently combine those different information in order to optimize their exploitations. To this end, we propose an original machine learning based method that is able to determine relevant documents in a collection for a given query, from their positions within the result lists obtained from each individual linguistic index, while automatically adapting its behavior to the characteristics of the query. The different experiments that are presented here prove the interest of our fusion method that merges the result lists, which offers more balanced precision-recall compromises and consequently obtains more stable results than those got by the better individual index.
Citations: 2
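The Moreau, Claveau and Sébillot abstract above describes merging result lists using each document's position in the list returned by every individual index. The sketch below shows a much simpler, unsupervised stand-in for that idea, reciprocal rank fusion, which is not the authors' learned, query-adaptive method. The function name, the constant `k=60`, and the tie-breaking rule are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For instance, each input list could come from a different linguistic index (morphological, syntactic, semantic) queried with the same request. A learned approach such as the one the abstract describes would replace the fixed 1 / (k + rank) weighting with a model trained on the positions.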