Jure Mijić, Marko Tadić, Matija Jančec, Goran Jovanov
HTML to XML conversion for non-programmers
27th International Conference on Information Technology Interfaces, 2005. Published 2005-06-20.
DOI: 10.1109/ITI.2005.1491147 (https://doi.org/10.1109/ITI.2005.1491147)
Citations: 3
Abstract
Any type of processing of the increasing number of e-text documents appearing today, particularly on the Internet, requires their conversion to a standard format such as XML, since they usually appear in a variety of proprietary and public formats. We present our solution for generic HTML-to-XML conversion. The conversion is done using simple rules specified by the user. By defining these rules, the user can divide the document into logical divisions (e.g. heading, body, signature) and achieve the desired output document structure. Our solution requires no programming skills, because the conversion script is built interactively through a graphical user interface (GUI), and it is therefore suitable for all types of users.
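To make the idea of rule-driven conversion concrete, here is a minimal Python sketch. It is not the authors' tool: the abstract does not describe the rule language or the GUI, so the rule table `RULES` (a plain mapping from HTML tag names to logical divisions), the class `RuleConverter`, the function `convert`, and the `document` root element are all hypothetical names chosen for illustration. The sketch only shows how user-supplied rules could divide an HTML page into heading, body, and signature elements.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical rule table (an assumption, not the paper's rule format):
# each HTML tag on the left becomes the logical division on the right.
RULES = {"h1": "heading", "p": "body", "address": "signature"}


class RuleConverter(HTMLParser):
    """Copies the text inside rule-matched HTML tags into XML elements."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.root = ET.Element("document")  # assumed root element name
        self.current = None   # XML element currently receiving text
        self.open_tag = None  # HTML tag that opened the current division

    def handle_starttag(self, tag, attrs):
        # Start a new division only when no division is already open;
        # nested inline markup (e.g. <b>) is flattened into its text.
        if self.current is None and tag in self.rules:
            self.current = ET.SubElement(self.root, self.rules[tag])
            self.current.text = ""
            self.open_tag = tag

    def handle_data(self, data):
        if self.current is not None:
            self.current.text += data

    def handle_endtag(self, tag):
        if self.current is not None and tag == self.open_tag:
            # Collapse runs of whitespace before closing the division.
            self.current.text = " ".join(self.current.text.split())
            self.current = None
            self.open_tag = None


def convert(html, rules=RULES):
    """Return an XML string built from `html` according to `rules`."""
    parser = RuleConverter(rules)
    parser.feed(html)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")


sample = (
    "<html><body><h1>Notice</h1>"
    "<p>Meeting at <b>noon</b>.</p>"
    "<address>J. Doe</address></body></html>"
)
print(convert(sample))
```

In this sketch, changing the output structure means editing only the rule table, for example `convert(sample, {"p": "para"})`. That loosely mirrors the abstract's point that the user specifies rules instead of writing a program. A GUI like the one the paper describes would presumably generate such rules interactively.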