International Workshop On Research Issues in Digital Libraries: Latest Papers

On the science of search: statistical approaches, evaluation, optimisation
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364745
S. Robertson
Abstract: This paper, based on a talk, presents an overview of evaluation experiments in information retrieval, and also of statistical approaches to search. A strong connection exists between them: the notion that the objective of search can be expressed in terms of the measures used for evaluation informs the statistical theory in several ways. The latest manifestation of this connection is the work on optimization of ranking algorithms, using machine learning techniques.
Citations: 1
How to compose a complex document recognition system
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364759
H. Fujisawa
Abstract: The technical challenges in document analysis and recognition have been to solve the problems of uncertainty and variability. From our experiences in developing OCRs, business form readers, and postal address recognition engines, we would like to present design principles to cope with these problems of uncertainty and variability. When the targets of document recognition are complex and diversified, the recognition engine needs to solve many different kinds of pattern recognition problems, which are a reflection of uncertainty and variability. Inevitably, the engine becomes complex, raising a question of how to combine its subcomponents, which are not perfect in their accuracies. The design principles will be explained with examples in postal address recognition.
Citations: 0
Finding an answer to a question
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364751
Brigitte Grau
Abstract: The huge quantity of available electronic information leads to a growing need for users to have tools able to be precise and selective. These kinds of tools have to provide answers to requests quite rapidly without requiring the user to explore each document, to reformulate her request or to seek the answer inside documents. From that viewpoint, finding an answer consists not only in finding relevant documents but also in extracting relevant parts. This leads us to express the question-answering problem in terms of an information retrieval problem that can be solved using natural language processing (NLP) approaches. In my talk, I will focus on defining what a "good" answer is, and how a system can find it.
A good answer has to give the required piece of information. However, that is not sufficient; it also has both to be presented within its context of interpretation and to be justified, in order to give a user the means to evaluate whether the answer fits her needs and is appropriate.
One can view searching for an answer to a question as a reformulation problem: according to what is asked, find one of the different linguistic expressions of the answer in all candidate sentences. Within this framework, interlingual question-answering can also be seen as another kind of linguistic variation. The answer phrasing can be considered as an affirmative reformulation of the question, partly or totally, which entails the definition of models that match sentences containing the answer. According to the different approaches, the kinds of model and the matching criteria greatly differ. It can consist in building a structured representation that makes explicit the semantic relations between the concepts of the question and that is compared to a similar representation of sentences. As this approach requires a syntactic parser and a semantic knowledge base, which are not available in all languages, systems often apply a less formal approach based on a similarity measure between a passage and the question, and answers are extracted from the highest-scoring passages. Similarity involves different criteria: question terms and their linguistic variations in passages, syntactic proximity, answer type. We will see that, in such an approach, justifications can be envisioned by using the texts themselves, considered as depositories of semantic knowledge. I will focus on the approach the LIR group of LIMSI has taken for its monolingual and bilingual systems.
Citations: 2
Information retrieval and digital libraries: lessons of research
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364743
Karen Spärck Jones
Abstract: This paper reviews lessons from the history of information retrieval research, with particular emphasis on recent developments. These have demonstrated the value of statistical techniques for retrieval, and have also shown that they have an important, though not exclusive, part to play in other information processing tasks, like question answering and summarising. The heterogeneous materials that digital libraries are expected to cover, their scale, and their changing composition, imply that statistical methods, which are general-purpose and very flexible, have significant potential value for the digital libraries of the future.
Citations: 5
Open source search and research
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364748
M. Beigbeder, Wray L. Buntine, Wai Gen Yee
Abstract: In this paper, we present a review of criteria for the evaluation of open source information retrieval tools and provide an overview of some of those that are more popular. The question of interaction between research and availability of open source search tools is addressed.
Citations: 1
Digital audiovisual repositories: an introduction
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364753
Richard Wright
Abstract: This paper briefly describes the essential aspects of the digital world that audiovisual archives are entering - or being swallowed up in. The crucial issue is whether archives will sink or swim in this all-digital environment. The core issue is defining - and meeting - the requirements for a secure, sustainable digital repository.
Citations: 2
From CLIR to CLIE: some lessons in NTCIR evaluation
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364762
Hsin-Hsi Chen
Abstract: Cross-language information retrieval (CLIR) facilitates the use of one language to access documents in other languages. Cross-language information extraction (CLIE) extracts relevant information in finer granularity from multilingual documents for some specific applications like summarization, question answering, opinion extraction, etc. This paper reviews CLIR, CLQA, and opinion analysis tasks in NTCIR evaluation. The design methodologies and some key technologies are reported.
Citations: 0
Shallow syntax analysis in Sanskrit guided by semantic nets constraints
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364750
G. Huet
Abstract: We present the state of the art of a computational platform for the analysis of classical Sanskrit. The platform comprises modules for phonology, morphology, segmentation and shallow syntax analysis, organized around a structured lexical database. It relies on the Zen toolkit for finite state automata and transducers, which provides data structures and algorithms for the modular construction and execution of finite state machines, in a functional framework.
Some of the layers proceed in bottom-up synthesis mode - for instance, noun and verb morphological modules generate all inflected forms from stems and roots listed in the lexicon. Morphemes are assembled through internal sandhi, and the inflected forms are stored with morphological tags in dictionaries usable for lemmatizing. These dictionaries are then compiled into transducers, implementing the analysis of external sandhi, the phonological process which merges words together by euphony. This provides a tagging segmenter, which analyses a sentence presented as a stream of phonemes and produces a stream of tagged lexical entries, hyperlinked to the lexicon.
The next layer is a syntax analyser, guided by semantic nets constraints expressing dependencies between the word forms. Finite verb forms demand semantic roles, according to valency patterns depending on the voice (active, passive) of the form and the governance (transitive, etc.) of the root. Conversely, noun/adjective forms provide actors which may fill those roles, provided agreement constraints are satisfied. Tool words are mapped to transducers operating on tagged streams, allowing the modeling of linguistic phenomena such as coordination by abstract interpretation of actor streams. The parser ranks the various interpretations (matching actors with roles) with penalties, and returns to the user the minimum penalty analyses, for final validation of ambiguities. The whole platform is organized as a Web service, allowing the piecewise tagging of a Sanskrit text.
Citations: 38
Toward a common semantics between media and languages
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364755
C. Fluhr, G. Grefenstette, Adrian Daniel Popescu
Abstract: For a computer to recognize objects, persons, situations or actions in multimedia, it needs to have learned models of each thing beforehand. For the moment, no large, general collection of training examples exists for the wide variety of things that we would want to automatically recognize in multimedia, video and still images. We believe that the WWW and current technology can allow us to automatically build such a resource. This paper describes a methodology for the construction of a grounded, general purpose, multimedia ontology that is instantiated through web processing. In this hierarchically organized ontology, concepts corresponding to concrete objects, persons, situations and actions are linked with still images, videos and sounds that represent exemplars of each concept. These examples are necessary resources for computing discriminating signatures for the recognition of the concepts in still images or videos. Since images retrieved using existing image search engines contain much noise and are not always representative, we also present here our methodology for finding good representatives for each concept.
Citations: 2
Multilingual information access: the contribution of evaluation
International Workshop On Research Issues in Digital Libraries Pub Date: 2006-12-12 DOI: 10.1145/1364742.1364761
C. Peters
Abstract: Since evaluation of cross-language information retrieval systems began at TREC in 1997 and NTCIR in 1998 and, in particular, with the launch of the Cross-Language Evaluation Forum (CLEF) in 2000, considerable progress has been made in this particular sector of IR. Advances can be considered in two stages. The first stage regarded in particular the development of text retrieval systems from simple so-called "bilingual" systems, in which a query in one language is used to search a document collection in another, to truly "multilingual" retrieval systems, where a query in one language can find relevant results from a collection of documents in multiple languages. In the second stage, the focus was no longer just on multilingual document retrieval but was diversified to include different kinds of text retrieval across languages (e.g. multilingual question answering) and retrieval on different kinds of media (e.g. collections containing images or speech). However, although the results from the research perspective have been interesting, there has been little real take-up by the applications communities. In the paper we describe the results achieved by CLEF over the years and propose a third stage for multilingual system evaluation which gives far more attention to questions regarding usability and user satisfaction but also provides ways for the results achieved to be transferred to the operational context.
Citations: 2