An XML environment for multistructured textual documents
Emmanuel Bruno, Elisabeth Murisasco
2007 2nd International Conference on Digital Information Management
Published: 2007-10-01
DOI: 10.1109/ICDIM.2007.4444228 (https://doi.org/10.1109/ICDIM.2007.4444228)
Citations: 4
Abstract
XML is the de facto standard for describing structured data. Many information-system applications rely on it: electronic publishing, technical documentation, digital libraries, the web, etc. An XML document is essentially hierarchical. In some applications, however, several concurrent hierarchical structures can be associated with the same textual data. This paper presents an XML environment dedicated to representing and querying such documents, which we call multistructured textual documents. Our work proposes a method, based on segmentation, for compactly representing multiple trees over a single text. The segmentation encoding makes it possible to query overlap and containment relations between markup elements belonging to different structures. This paper focuses in particular on the architecture of the XML environment that implements our proposals.
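
To make the idea of segmentation concrete, here is a minimal Python sketch. It is an illustration only, not the encoding defined in the paper. It assumes that each structure is given as a list of (name, start, end) character spans over one shared text, cuts the text at every element boundary, and then compares elements from different structures by the ranges of segments they cover. All names (`build_cuts`, `segment_range`, `relation`, the "physical" and "logical" structures) are hypothetical.

```python
from bisect import bisect_left

def build_cuts(text_length, structures):
    """Collect every element boundary from every structure as a sorted list
    of cut points. Consecutive cut points delimit the shared segments."""
    cuts = {0, text_length}
    for elements in structures.values():
        for _name, start, end in elements:
            cuts.update((start, end))
    return sorted(cuts)

def segment_range(cuts, start, end):
    """Map a character span [start, end) to a half-open range of segment indices."""
    return bisect_left(cuts, start), bisect_left(cuts, end)

def relation(a, b):
    """Classify segment range a relative to segment range b."""
    (a0, a1), (b0, b1) = a, b
    if a1 <= b0 or b1 <= a0:
        return "disjoint"
    if a == b:
        return "equal"
    if b0 <= a0 and a1 <= b1:
        return "contained"   # a lies inside b
    if a0 <= b0 and b1 <= a1:
        return "contains"    # b lies inside a
    return "overlap"         # the spans cross without nesting

# Hypothetical example: two concurrent structures over a 10-character text.
TEXT = "abcdefghij"
STRUCTURES = {
    "physical": [("line", 0, 6), ("line", 6, 10)],
    "logical": [("sentence", 0, 4), ("sentence", 4, 10)],
}

CUTS = build_cuts(len(TEXT), STRUCTURES)
line1 = segment_range(CUTS, 0, 6)
line2 = segment_range(CUTS, 6, 10)
sentence1 = segment_range(CUTS, 0, 4)
sentence2 = segment_range(CUTS, 4, 10)

print(relation(sentence1, line1))  # first sentence vs. first line
print(relation(line1, sentence2))  # first line vs. second sentence
```

In this sketch the text is stored once, and each structure only references segment indices, which is what allows relations between elements of different hierarchies to be computed by simple range comparisons.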