{"title":"Automatic title generation for EM","authors":"Paul E. Kennedy, Alexander Hauptmann","doi":"10.1145/336597.336670","DOIUrl":"https://doi.org/10.1145/336597.336670","url":null,"abstract":"Our prototype automatic title generation system inspired by statistical machine-translation approaches [1] treats the document title like a translation of the document. Titles can be generated without extracting words from the document. A large corpus of documents with human-assigned titles is required for training title \"translation\" models. On an f1 evaluation score our approach outperformed another approach based on Bayesian probability estimates [7].","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73503114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Snowball: extracting relations from large plain-text collections","authors":"Eugene Agichtein, L. Gravano","doi":"10.1145/336597.336644","DOIUrl":"https://doi.org/10.1145/336597.336644","url":null,"abstract":"Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or running data mining tasks. We explore a technique for extracting such tables from document collections that requires only a handful of training examples from users. These examples are used to generate extraction patterns, that in turn result in new tuples being extracted from the document collection. We build on this idea and present our Snowball system. Snowball introduces novel strategies for generating patterns and extracting tuples from plain-text documents. At each iteration of the extraction process, Snowball evaluates the quality of these patterns and tuples without human intervention, and keeps only the most reliable ones for the next iteration. In this paper we also develop a scalable evaluation methodology and metrics for our task, and present a thorough experimental evaluation of Snowball and comparable techniques over a collection of more than 300,000 newspaper documents.","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74993470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Growth and server availability of the NCSTRL digital library","authors":"Allison L. Powell, J. French","doi":"10.1145/336597.336696","DOIUrl":"https://doi.org/10.1145/336597.336696","url":null,"abstract":"This paper reports on measurements of the NCSTRL digital library taken over a two-year period. We report the growth of the system along two dimensions: number of participating institutions and number of documents indexed by the system. We also report an aspect of reliability for this distributed digital library system.","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73632060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A digital museum of Taiwanese butterflies","authors":"Jen-Shin Hong, Herng-Yow Chen, J. Hsiang","doi":"10.1145/336597.336694","DOIUrl":"https://doi.org/10.1145/336597.336694","url":null,"abstract":"Taiwan is renowned for its great variety of butterflies. There are about 400 species, a number of which are unique to Taiwan, across its 36,500 sq km of land. Last year we built a comprehensive digital collection of Taiwan's butterflies to provide a modern research environment on butterflies for academic institutions, as well as an interactive butterfly educational environment for the general public. Our digital museum emphasizes ease of use and provides a number of innovative features to help the user fully utilize the information provided by the system. The digital museum is accessible through the Web at http://digimuse.nmns.edu.tw.","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85917595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning the shape of information: a longitudinal study of Web-news reading","authors":"Misha W. Vaughan, A. Dillon","doi":"10.1145/336597.336673","DOIUrl":"https://doi.org/10.1145/336597.336673","url":null,"abstract":"A concept called shape is proposed to experimentally examine the development of users' mental representations of information spaces over time. Twenty five novice users are exposed to two differently designed news web sites over five sessions. The longitudinal impacts on users' comprehension, usability, and navigation are examined.","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86790861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Re-engineering structures from Web documents","authors":"Chuang-Hue Moh, Ee-Peng Lim, W. Ng","doi":"10.1145/336597.336638","DOIUrl":"https://doi.org/10.1145/336597.336638","url":null,"abstract":"To realize a wide range of applications (including digital libraries) on the Web, a more structured way of accessing the Web is required, and such a requirement can be facilitated by the use of the XML standard. In this paper, we propose a general framework for reverse engineering (or re-engineering) the underlying structures, i.e., the DTD, from a collection of similarly structured XML documents when they share some common but unknown DTDs. The essential data structures and algorithms for the DTD generation have been developed, and experiments on real Web collections have been conducted to demonstrate their feasibility. In addition, we also propose a method of imposing a constraint on the repetitiveness of the elements in a DTD rule to further simplify the generated DTDs without compromising their correctness.","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87484029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Developing services for open eprint archives: globalisation, integration and the impact of links","authors":"S. Hitchcock, L. Carr, Z. Jiao, Donna Bergmark, W. Hall, C. Lagoze, S. Harnad","doi":"10.1145/336597.336655","DOIUrl":"https://doi.org/10.1145/336597.336655","url":null,"abstract":"The rapid growth of scholarly information resources available in electronic form and their organisation by digital libraries is proving fertile ground for the development of sophisticated new services, of which citation linking will be one indispensable example. Many new projects, partnerships and commercial agreements have been announced to build citation linking applications. This paper describes the Open Citation (OpCit) project, which will focus on linking papers held in freely accessible eprint archives such as the Los Alamos physics archives and other distributed archives, and which will build on the work of the Open Archives initiative to make the data held in such archives available to compliant services. The paper emphasises the work of the project in the context of emerging digital library information environments, explores how a range of new linking tools might be combined and identifies ways in which different linking applications might converge. Some early results of linked pages from the OpCit project are reported.","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86196741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The open video project: research-oriented digital video repository","authors":"Gary Geisler, G. Marchionini","doi":"10.1145/336597.336693","DOIUrl":"https://doi.org/10.1145/336597.336693","url":null,"abstract":"A future with widespread access to large digital libraries of video is nearing reality. Anticipating this future, a great deal of research is focused on methods of browsing and retrieving digital video, developing algorithms for creating surrogates for video content, and creating interfaces that display result sets from multimedia queries. Research in these areas requires that each investigator acquire and digitize video for their studies since the multimedia information retrieval community does not yet have a standard collection of video to be used for research purposes. The primary goal of the Open Video Project is to create and maintain a shared digital video repository and test collection to meet these research needs.","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89207378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A mediation infrastructure for digital library services","authors":"S. Melnik, H. Garcia-Molina, A. Paepcke","doi":"10.1145/336597.336651","DOIUrl":"https://doi.org/10.1145/336597.336651","url":null,"abstract":"Digital library mediators allow interoperation between diverse information services. In this paper we describe a flexible and dynamic mediator infrastructure that allows mediators to be composed from a set of modules (\"blades\"). Each module implements a particular mediation function, such as protocol translation, query translation, or result merging. All the information used by the mediator, including the mediator logic itself, is represented by an RDF graph. We illustrate our approach using a mediation scenario involving a Dienst and a Z39.50 server, and we discuss the potential advantages and weaknesses of our framework.","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79855128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Browsing the structure of multimedia stories","authors":"R. Allen, Jane Acheson","doi":"10.1145/336597.336615","DOIUrl":"https://doi.org/10.1145/336597.336615","url":null,"abstract":"Stories may be analyzed as sequences of causally-related events and reactions to those events by the characters. We employ a notation of plot elements, similar to one developed by Lehnert, and we extend that by forming higher level \"story threads\". We apply the browser to Corduroy, a children's short feature which was analyzed in detail. We provide additional illustrations with analysis of Kiss of Death, a Film Noir classic. Effectively, the browser provides a framework for interactive summaries, video of the narrative","PeriodicalId":42447,"journal":{"name":"Digital Library Perspectives","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2000-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75920723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}