DTD-Miner: a tool for mining DTD from XML documents
Chuang-Hue Moh, Ee-Peng Lim, W. Ng
Proceedings Second International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems. WECWIS 2000
Published: 2000-06-08
DOI: 10.1109/WECWIS.2000.853869 (https://doi.org/10.1109/WECWIS.2000.853869)
Citations: 68
Abstract
XML documents are semi-structured, with their structure embedded in the tags. Although an XML document can be accompanied by a document type definition (DTD) that defines its structure, a DTD is not mandatory. Deriving a DTD for existing XML documents is difficult because DTDs use a different syntax from XML and because prior knowledge of the documents' structure is required. In this paper, we introduce DTD-Miner, an automatic structure mining tool for XML documents. Using a Web-based interface, the user submits a set of similarly structured XML documents and the system automatically suggests a DTD. The user can then refine the generated DTD, reducing its complexity by relaxing some of the rules used in the system.
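
To make the idea of suggesting a DTD from a set of similarly structured documents concrete, here is a minimal Python sketch. It is not DTD-Miner's algorithm, which the abstract does not describe; it is a naive illustration under stated assumptions. The function names `infer_dtd` and `content_model` are hypothetical, attributes are ignored, and the fallback to an unordered choice group when child order varies is only a loose analogy to the rule relaxation mentioned above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def content_model(sequences, text_seen):
    """Build a DTD content model that accepts every observed child sequence."""
    # Distinct child tags, in order of first appearance.
    names = []
    for seq in sequences:
        for tag in seq:
            if tag not in names:
                names.append(tag)

    if not names:
        return "(#PCDATA)" if text_seen else "EMPTY"
    if text_seen:
        # Mixed content: DTD syntax only allows this unordered form.
        return "(#PCDATA | " + " | ".join(names) + ")*"

    # A strict sequence is only safe if every instance lists its children
    # grouped by tag and in the same relative order.
    rank = {tag: i for i, tag in enumerate(names)}
    ordered = all(
        all(rank[a] <= rank[b] for a, b in zip(seq, seq[1:]))
        for seq in sequences
    )
    if not ordered:
        # Relaxed model: any of the children, any number of times, any order.
        return "(" + " | ".join(names) + ")*"

    parts = []
    for tag in names:
        counts = [seq.count(tag) for seq in sequences]
        optional, repeated = min(counts) == 0, max(counts) > 1
        suffix = {
            (False, False): "",
            (False, True): "+",
            (True, False): "?",
            (True, True): "*",
        }[(optional, repeated)]
        parts.append(tag + suffix)
    return "(" + ", ".join(parts) + ")"


def infer_dtd(xml_documents):
    """Suggest element declarations covering all documents (attributes ignored)."""
    child_seqs = defaultdict(list)  # element name -> observed child-tag lists
    has_text = defaultdict(bool)    # element name -> non-whitespace text seen
    order = []                      # element names in order of first appearance

    for doc in xml_documents:
        for elem in ET.fromstring(doc).iter():
            if elem.tag not in child_seqs:
                order.append(elem.tag)
            child_seqs[elem.tag].append([child.tag for child in elem])
            # Text directly inside elem: its own text plus each child's tail.
            texts = [elem.text] + [child.tail for child in elem]
            if any(t and t.strip() for t in texts):
                has_text[elem.tag] = True

    return "\n".join(
        f"<!ELEMENT {name} {content_model(child_seqs[name], has_text[name])}>"
        for name in order
    )


if __name__ == "__main__":
    docs = [
        "<book><title>A</title><author>X</author><author>Y</author></book>",
        "<book><title>B</title><author>Z</author><year>2000</year></book>",
    ]
    print(infer_dtd(docs))
```

For the two sample documents, the sketch declares `book` as `(title, author+, year?)` and the leaf elements as `(#PCDATA)`. A stricter or simpler DTD could be obtained by changing how occurrence counts and ordering are generalized, which is the kind of trade-off between precision and complexity that the abstract attributes to user refinement.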