Thai herb information extraction from multiple websites
P. Chainapaporn, P. Netisopakul
Knowledge and Smart Technology (KST), 2012-07-07
DOI: 10.1109/KST.2012.6287734 (https://doi.org/10.1109/KST.2012.6287734)
Citations: 6
Abstract
Thai herbs have gained increasing public attention, and a number of Thai herb websites have appeared recently. These websites cover similar information but differ considerably in detail. For example, some webpages do not indicate which part of a Thai herb can treat a specified symptom. To collect more complete Thai herb information, we have developed an information extraction process that extracts Thai herb information from multiple websites. The process employs an HTML parser and file templates to recognize useful information in various webpage formats. Preliminary experiments gave satisfactory results, with precision and recall above 85 percent.
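
The abstract does not describe the parser or the template format, so the following is only a minimal sketch of how template-driven extraction over differently formatted pages might look. It assumes a template is a mapping from a target field to the label a given site prints just before the value. The names TextChunks, TEMPLATES, extract, and merge, the site keys, and the two sample pages are all hypothetical and are not taken from the paper.

```python
from html.parser import HTMLParser


class TextChunks(HTMLParser):
    """Collect the non-empty text nodes of a page in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Hypothetical templates, one per site: target field -> label text that the
# site prints immediately before the value. Not the paper's template format.
TEMPLATES = {
    "site_a": {"name": "Herb:", "part": "Part used:", "symptom": "Treats:"},
    "site_b": {"name": "Name", "symptom": "Indication"},
}


def extract(html, template):
    """Return the fields of `template` whose label is found in `html`."""
    parser = TextChunks()
    parser.feed(html)
    parser.close()
    chunks = parser.chunks
    record = {}
    for field, label in template.items():
        if label in chunks:
            i = chunks.index(label)
            # The value is assumed to be the text node right after the label.
            if i + 1 < len(chunks):
                record[field] = chunks[i + 1]
    return record


def merge(records):
    """Combine records for the same herb; the first value seen wins."""
    merged = {}
    for record in records:
        for field, value in record.items():
            merged.setdefault(field, value)
    return merged


# Made-up sample pages with placeholder values, in two different layouts.
PAGE_A = (
    "<p><b>Herb:</b> Herb X</p>"
    "<p><b>Part used:</b> leaf</p>"
    "<p><b>Treats:</b> cough</p>"
)
PAGE_B = (
    "<table><tr><th>Name</th><td>Herb X</td></tr>"
    "<tr><th>Indication</th><td>cough</td></tr></table>"
)

if __name__ == "__main__":
    rec_a = extract(PAGE_A, TEMPLATES["site_a"])
    rec_b = extract(PAGE_B, TEMPLATES["site_b"])
    print(merge([rec_b, rec_a]))
```

In this sketch the second page has no "part used" entry, which mirrors the gap mentioned in the abstract; merging the two records fills that field from the first page. A real system would need more robust label matching than exact text-node equality.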