Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering: latest publications

Glyph spotting for mediaeval handwritings by template matching
Jan-Hendrik Worch, Mathias Lawo, B. Gottfried
{"title":"Glyph spotting for mediaeval handwritings by template matching","authors":"Jan-Hendrik Worch, Mathias Lawo, B. Gottfried","doi":"10.1145/2361354.2361401","DOIUrl":"https://doi.org/10.1145/2361354.2361401","url":null,"abstract":"This paper reports on the analysis of different approaches in order to search for glyphs within handwritten mediaeval documents. As layout analysis methods are difficult to apply to the documents at hand, template matching methods are employed. A number of different shape descriptions are used to filter out false positives, since the application of correlation coefficients alone results in too many matches. The overall goal consists in the interactive support of an editor who is transcribing a given handwriting. For this purpose, the automatic spotting of glyphs enables the editor to compare glyphs within different contexts.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"60 1","pages":"213-216"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82632172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Structured and fragmented content in collaborative XML publishing chains
Stéphane Crozat
{"title":"Structured and fragmented content in collaborative XML publishing chains","authors":"Stéphane Crozat","doi":"10.1145/2361354.2361388","DOIUrl":"https://doi.org/10.1145/2361354.2361388","url":null,"abstract":"In this paper, we present the main results of the C2M project through one of its operational deliverable: the Scenari4 collaborative editing and publishing system for XML content. The purpose of the C2M project was to design a system able to manage structured and fragmented contents - as XML publishing chains do - while providing collaborative possibilities - as Enterprise Content Management systems (ECM) do. The main issue is related to transclusion relationships which are massively used in XML publishing chains, in order to support repurposing without copying. This approach is not compatible with the classical way ECMs manage content, especially in terms of propagation of modifications, rights or transactions management. We propose two complementary solutions to manage two different levels of collaboration. The workspace is designed as a highly dynamic place able to deal with live fragments, linked together in a network, that can be easily updated at any time by any user. The library is a more static and more classical way to manage content, dedicated to folder-documents, which are XML frozen versions of sub-networks extracted from workspaces. While workspaces are dedicated to content elaboration and maintenance, libraries are places to store, to read, or to exchange stable documents. Scenari4 is released under FLOSS license and has been being used in several experimental and commercial contexts since the beginning of 2012.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"33 1","pages":"145-148"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83641509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Faceted documents: describing document characteristics using semantic lenses
S. Peroni, D. Shotton, F. Vitali
{"title":"Faceted documents: describing document characteristics using semantic lenses","authors":"S. Peroni, D. Shotton, F. Vitali","doi":"10.1145/2361354.2361396","DOIUrl":"https://doi.org/10.1145/2361354.2361396","url":null,"abstract":"The semantic enhancement of a traditional scientific paper is not a straightforward operation, since it involves many different aspects or facets. In this paper we propose eight different semantic lenses through which these facets may be viewed, and describe and exemplify the ontologies by which these lenses may be implemented.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"1 1","pages":"191-194"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91525914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
A framework for retrieval and annotation in digital humanities using XQuery full text and update in BaseX
C. Mahlow, C. Grün, Alexander Holupirek, M. Scholl
{"title":"A framework for retrieval and annotation in digital humanities using XQuery full text and update in BaseX","authors":"C. Mahlow, C. Grün, Alexander Holupirek, M. Scholl","doi":"10.1145/2361354.2361398","DOIUrl":"https://doi.org/10.1145/2361354.2361398","url":null,"abstract":"A key difference between traditional humanities research and the emerging field of digital humanities is that the latter aims to complement qualitative methods with quantitative data. In linguistics, this means the use of large corpora of text, which are usually annotated automatically using natural language processing tools. However, these tools do not exist for historical texts, so scholars have to work with unannotated data. We have developed a system for systematic, iterative exploration and annotation of historical text corpora, which relies on an XML database (BaseX) and in particular on the Full Text and Update facilities of XQuery.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"33 1","pages":"195-204"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91278779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
A methodology for evaluating algorithms for table understanding in PDF documents
Max C. Göbel, Tamir Hassan, Ermelinda Oro, G. Orsi
{"title":"A methodology for evaluating algorithms for table understanding in PDF documents","authors":"Max C. Göbel, Tamir Hassan, Ermelinda Oro, G. Orsi","doi":"10.1145/2361354.2361365","DOIUrl":"https://doi.org/10.1145/2361354.2361365","url":null,"abstract":"This paper presents a methodology for the evaluation of table understanding algorithms for PDF documents. The evaluation takes into account three major tasks: table detection, table structure recognition and functional analysis. We provide a general and flexible output model for each task along with corresponding evaluation metrics and methods. We also present a methodology for collecting and ground-truthing PDF documents based on consensus-reaching principles and provide a publicly available ground-truthed dataset.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"261 1","pages":"45-48"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76740919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 61
Structure-conforming XML document transformation based on graph homomorphism
Tyng-Ruey Chuang, Hui-Yin Wu
{"title":"Structure-conforming XML document transformation based on graph homomorphism","authors":"Tyng-Ruey Chuang, Hui-Yin Wu","doi":"10.1145/2361354.2361376","DOIUrl":"https://doi.org/10.1145/2361354.2361376","url":null,"abstract":"We propose a principled method to specify XML document transformation so that the outcome of a transformation can be ensured to conform to certain structural constraints as required by the target XML document type. We view XML document types as graphs, and model transformations as relations between the two graphs. Starting from this abstraction, we use and extend graph homomorphism as a formalism for the specifications of transformations between XML document types. A specification can then be checked to ensure whether results from the transformation will always be structure-conforming.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"19 1","pages":"99-102"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75567107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Receipts2Go: the big world of small documents
Bill Janssen, E. Saund, E. Bier, Patricia Wall, M. Sprague
{"title":"Receipts2Go: the big world of small documents","authors":"Bill Janssen, E. Saund, E. Bier, Patricia Wall, M. Sprague","doi":"10.1145/2361354.2361381","DOIUrl":"https://doi.org/10.1145/2361354.2361381","url":null,"abstract":"The Receipts2Go system is about the world of one-page documents: cash register receipts, book covers, cereal boxes, price tags, train tickets, fire extinguisher tags. In that world, we're exploring techniques for extracting accurate information from documents for which we have no layout descriptions -- indeed no initial idea of what the document's genre is -- using photos taken with cell phone cameras by users who aren't skilled document capture technicians. This paper outlines the system and reports on some initial results, including the algorithms we've found useful for cleaning up those document images, and the techniques used to extract and organize relevant information from thousands of similar-but-different page layouts.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"146 1","pages":"121-124"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72714221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
DocExplore: overcoming cultural and physical barriers to access ancient documents
Pierrick Tranouez, Stéphane Nicolas, Vladislavs Dovgalecs, A. Burnett, L. Heutte, Yiqing Liang, R. Guest, M. Fairhurst
{"title":"DocExplore: overcoming cultural and physical barriers to access ancient documents","authors":"Pierrick Tranouez, Stéphane Nicolas, Vladislavs Dovgalecs, A. Burnett, L. Heutte, Yiqing Liang, R. Guest, M. Fairhurst","doi":"10.1145/2361354.2361399","DOIUrl":"https://doi.org/10.1145/2361354.2361399","url":null,"abstract":"In this paper, we describe DocExplore, an integrated software suite centered on the handling of digitized documents with an emphasis on ancient manuscripts. This software suite allows the augmentation and exploration of ancient documents of cultural interest. Specialists can add textual and multimedia data and metadata to digitized documents through a graphical interface that does not require technical knowledge. They are helped in this endeavor by sophisticated document analysis tools that allows for instance to spot words or patterns in images of documents. The suite is intended to ease considerably the process of bringing locked away historical materials to the attention of the general public by covering all the steps from managing a digital collection to creating interactive presentations suited for cultural exhibitions. Its genesis and sustained development reside in a collaboration of archivists, historians and computer scientists, the latter being not only in charge of the development of the software, but also of creating and incorporating novel pattern recognition for document analysis techniques.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"4 1","pages":"205-208"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88207589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
TAL processor for hypermedia applications
Carlos de Salles Soares Neto, H. F. Pinto, L. Soares
{"title":"TAL processor for hypermedia applications","authors":"Carlos de Salles Soares Neto, H. F. Pinto, L. Soares","doi":"10.1145/2361354.2361369","DOIUrl":"https://doi.org/10.1145/2361354.2361369","url":null,"abstract":"TAL (Template Authoring Language) is a specification language for hypermedia document templates. Templates describe application families with structural and semantic similarities. In TAL, templates not only define design patterns that applications must follow, but also constraints on the use of these patterns. A template must be processed together with a padding document giving rise to a new document in some specification language, called target language. TAL supports the description of templates independently of the languages used to specify target and padding documents. Usually a specific processor is required for each target language and for each padding document used. This paper concerns TAL processors. However, we should note that the proposal can be easily extended to any other solution used to define templates. Any pattern language and any language used to define constraints could be used instead of TAL. The TAL processor architecture is general and it is discussed when presenting the processor framework. As an instantiation example, an implementation of a TAL Processor targeting NCL (the declarative language of Ginga DTV middleware) is examined, and also another one targeting HTML-based middleware. The use of wizards for defining padding documents is also discussed in the examples of the proposed architecture instantiation.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"114 1","pages":"69-78"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80085378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
XML query-update independence analysis revisited
Muhammad Junedi, P. Genevès, Nabil Layaïda
{"title":"XML query-update independence analysis revisited","authors":"Muhammad Junedi, P. Genevès, Nabil Layaïda","doi":"10.1145/2361354.2361375","DOIUrl":"https://doi.org/10.1145/2361354.2361375","url":null,"abstract":"XML transformations can be resource-costly in particular when applied to very large XML documents and document sets. Those transformations usually involve lots of XPath queries and may not need to be entirely re-executed following an update of the input document. In this context, a given query is said to be independent of a given update if, for any XML document, the results of the query are not affected by the update. We revisit Benedikt and Cheney's framework for query-update independence analysis and show that performance can be drastically enhanced, contradicting their initial claims. The essence of our approach and results resides in the use of an appropriate logic, to which queries and updates are both succinctly translated. Compared to previous approaches, ours is more expressive from a theoretical point of view, equally accurate, and more efficient in practice. We illustrate this through practical experiments and comparative figures.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"123 1","pages":"95-98"},"PeriodicalIF":0.0,"publicationDate":"2012-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75810826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11