{"title":"The Notarial Archives, Valletta: Starting from Zero","authors":"T. Lupi","doi":"10.1145/3103010.3103025","DOIUrl":"https://doi.org/10.1145/3103010.3103025","url":null,"abstract":"The main objective of this paper is to talk about my work as a book and paper conservator in the light of the current rehabilitation project at the Notarial Archives in St Christopher Street, Valletta. With its six centuries of manuscript material spread over two kilometres of shelving, the state of preservation of the archives has presented numerous challenges over the last years. The EU funds granted in recent months are a crucial investment that will ensure the safeguarding of the collection, but putting one's house in order is not just about money. A number of other considerations such as careful planning, multidisciplinary collaboration, clever marketing, accessibility, team-building and creating a clear vision for the future have been some of the central factors that continue to contribute to the success of this project. A discussion on the general preservation and conservation strategies that are being undertaken for the project will also be given.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"64 1","pages":"7"},"PeriodicalIF":0.0,"publicationDate":"2017-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74480023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ruling analysis and classification of torn documents","authors":"Markus Diem, Florian Kleber, Robert Sablatnig","doi":"10.1145/2644866.2644876","DOIUrl":"https://doi.org/10.1145/2644866.2644876","url":null,"abstract":"A ruling classification is presented in this paper. In contrast to state-of-the-art methods which focus on ruling line removal, ruling lines are analyzed for document clustering in the context of document snippet reassembling. First, a background patch is extracted from a snippet at a position which minimizes the inscribed content. A novel Fourier feature is then computed on the image patch. The classification into void, lined and checked is carried out using Support Vector Machines. Finally, an accurate line localization is performed by means of projection profiles and robust line fitting. The ruling classification achieves an F-score of 0.987 evaluated on a dataset comprising real world document snippets. In addition the line removal was evaluated on a synthetically generated dataset where an F-score of 0.931 is achieved. This dataset is made publicly available so as to allow for benchmarking.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"8 1","pages":"63-72"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89652172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying and ranking search engine results as potential sources of plagiarism","authors":"Kyle Williams, Hung-Hsuan Chen, C. Lee Giles","doi":"10.1145/2644866.2644879","DOIUrl":"https://doi.org/10.1145/2644866.2644879","url":null,"abstract":"Source retrieval for plagiarism detection involves using a search engine to retrieve candidate sources of plagiarism for a given suspicious document so that more accurate comparisons can be made. An important consideration is that only documents that are likely to be sources of plagiarism should be retrieved so as to minimize the number of unnecessary comparisons made. A supervised strategy for source retrieval is described whereby search results are classified and ranked as potential sources of plagiarism without retrieving the search result documents and using only the information available at search time. The performance of the supervised method is compared to a baseline method and shown to improve precision by up to 3.28%, recall by up to 2.6% and the F1 score by up to 3.37%. Furthermore, features are analyzed to determine which of them are most important for search result classification with features based on document and search result similarity appearing to be the most important.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"2 1","pages":"97-106"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73105632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SimSeerX: a similar document search engine","authors":"Kyle Williams, Jian Wu, C. Lee Giles","doi":"10.1145/2644866.2644895","DOIUrl":"https://doi.org/10.1145/2644866.2644895","url":null,"abstract":"The need to find similar documents occurs in many settings, such as in plagiarism detection or research paper recommendation. Manually constructing queries to find similar documents may be overly complex, thus motivating the use of whole documents as queries. This paper introduces SimSeerX, a search engine for similar document retrieval that receives whole documents as queries and returns a ranked list of similar documents. Key to the design of SimSeerX is that is able to work with multiple similarity functions and document collections. We present the architecture and interface of SimSeerX, show its applicability with 3 different similarity functions and demonstrate its scalability on a collection of 3.5 million academic documents.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"96 1","pages":"143-146"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82794105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"JAR tool: using document analysis for improving the throughput of high performance printing environments","authors":"M. Kolberg, L. G. Fernandes, Mateus Raeder, Carolina Fonseca","doi":"10.1145/2644866.2644887","DOIUrl":"https://doi.org/10.1145/2644866.2644887","url":null,"abstract":"Digital printers have consistently improved their speed in the past years. Meanwhile, the need for document personalization and customization has increased. As a consequence of these two facts, the traditional rasterization process has become a highly demanding computational step in the printing workflow. Moreover, Print Service Providers are now using multiple RIP engines to speed up the whole document rasterization process, and depending on the input document characteristics the rasterization process may not achieve the print-engine speed creating a unwanted bottleneck. In this scenario, we developed a tool called Job Adaptive Router (JAR) aiming at improving the throughput of the rasterization process through a clever load balance among RIP engines which is based on information obtained by the analysis of input documents content. Furthermore, along with this tool we propose some strategies that consider relevant characteristics of documents, such as transparency and reusability of images, to split the job in a more intelligent way. The obtained results confirm that the use of the proposed tool improved the rasterization process performance.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"25 1","pages":"175-178"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77664085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What academics want when reading digitally","authors":"Juliane Franze, K. Marriott, Michael Wybrow","doi":"10.1145/2644866.2644894","DOIUrl":"https://doi.org/10.1145/2644866.2644894","url":null,"abstract":"Researchers constantly read and annotate academic documents. While almost all documents are provided digitally, many are still printed and read on paper. We surveyed 162 academics in order to better understand their reading habits and preferences. We were particularly interested in understanding the barriers to digital reading and the features desired by academics for digital reading applications.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"9 1","pages":"199-202"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79912551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting web content for personalized presentation","authors":"Rodrigo Chamun, Daniele Pinheiro, Diego Jornada, J. B. Oliveira, I. Manssour","doi":"10.1145/2644866.2644871","DOIUrl":"https://doi.org/10.1145/2644866.2644871","url":null,"abstract":"Printing web pages is usually a thankless task as the result is often a document with many badly-used pages and poor layout. Besides the actual content, superfluous web elements like menus and links are often present and in a printed version they are commonly perceived as an annoyance. Therefore, a solution for obtaining cleaner versions for printing is to detect parts of the page that the reader wants to consume, eliminating unnecessary elements and filtering the \"true\" content of the web page. In addition, the same solution may be used online to present cleaner versions of web pages, discarding any elements that the user wishes to avoid.\u0000 In this paper we present a novel approach to implement such filtering. The method is interactive at first: The user samples items that are to be preserved on the page and thereafter everything that is not similar to the samples is removed from the page. This is achieved by comparing the path of all elements on the DOM representation of the page with the path of the elements sampled by the user and preserving only elements that have a path \"similar\" to the sample. The introduction of a similarity measure adds an important degree of adaptability to the needs of different users and applications.\u0000 This approach is quite general and may be applied to any XML tree that has labeled nodes. We use HTML as a case study and present a Google Chrome extension that implements the approach as well as a user study comparing our results with commercial results.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"72 1","pages":"157-164"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86225398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A platform for language independent summarization","authors":"L. Cabral, R. Lins, R. Mello, F. Freitas, B. T. Ávila, S. Simske, M. Riss","doi":"10.1145/2644866.2644890","DOIUrl":"https://doi.org/10.1145/2644866.2644890","url":null,"abstract":"The text data available on the Internet is not only huge in volume, but also in diversity of subject, quality and idiom. Such factors make it infeasible to efficiently scavenge useful information from it. Automatic text summarization is a possible solution for efficiently addressing such a problem, because it aims to sieve the relevant information in documents by creating shorter versions of the text. However, most of the techniques and tools available for automatic text summarization are designed only for the English language, which is a severe restriction. There are multilingual platforms that support, at most, 2 languages. This paper proposes a language independent summarization platform that provides corpus acquisition, language classification, translation and text summarization for 25 different languages.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"43 1","pages":"203-206"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73953974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated refactoring for size reduction of CSS style sheets","authors":"Martí Bosch, P. Genevès, Nabil Layaïda","doi":"10.1145/2644866.2644885","DOIUrl":"https://doi.org/10.1145/2644866.2644885","url":null,"abstract":"Cascading Style Sheets (CSS) is a standard language for stylizing and formatting web documents. Its role in web user experience becomes increasingly important. However, CSS files tend to be designed from a result-driven point of view, without much attention devoted to the CSS file structure as long as it produces the desired results. Furthermore, the rendering intended in the browser is often checked and debugged with a document instance. Style sheets normally apply to a set of documents, therefore modifications added while focusing on a particular instance might affect other documents of the set.\u0000 We present a first prototype of static CSS semantical analyzer and optimizer that is capable of automatically detecting and removing redundant property declarations and rules. We build on earlier work on tree logics to locate redundancies due to the semantics of selectors and properties. Existing purely syntactic CSS optimizers might be used in conjunction with our tool, for performing complementary (and orthogonal) size reduction, toward the common goal of providing smaller and cleaner CSS files.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"10 1","pages":"13-16"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87175522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generating summary documents for a variable-quality PDF document collection","authors":"Jacob Hughes, D. Brailsford, S. Bagley, C. Adams","doi":"10.1145/2644866.2644892","DOIUrl":"https://doi.org/10.1145/2644866.2644892","url":null,"abstract":"The Cochrane Schizophrenia Group's Register of studies details all aspects of the effects of treating people with schizophrenia. It has been gathered over the last 20 years and consists of around 20,000 documents, overwhelmingly in PDF. Document collections of this sort -- on a given theme but gathered from a wide range of sources -- will generally have huge variability in the quality of the PDF, particularly with respect to the key property of text searchability.\u0000 Summarising the results from the best of these papers, to allow evidence-based health care decision making, has so far been done by manually creating a summary document, starting from a visual inspection of the relevant PDF file. This labour-intensive process has resulted, to date, in only 4,000 of the papers being summarised -- with enormous duplication of effort and with many issues around the validity and reliability of the data extraction.\u0000 This paper describes a pilot project to provide a computer-assisted framework in which any of the PDF documents could be searched for the occurrence of some 8,000 keywords and key phrases. Once keyword tagging has been completed the framework assists in the generation of a standard summary document, thereby greatly speeding up the production of these summaries. Early examples of the framework are described and its capabilities illustrated.","PeriodicalId":91385,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering","volume":"34 1","pages":"49-52"},"PeriodicalIF":0.0,"publicationDate":"2014-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74291369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}