Research on Automatic Classification for Deep Web Query Interfaces
Peiguang Lin, Y. Du, Xiaohua Tan, Chao Lv
2008 International Symposiums on Information Processing, published 2008-05-23
DOI: 10.1109/ISIP.2008.140 (https://doi.org/10.1109/ISIP.2008.140)
In recent years the Web has "deepened" rapidly, and users must browse large numbers of Web sites to access the Web databases of a specific domain. Building a unified query interface that integrates the query interfaces of a domain, so that various Web databases can be accessed at the same time, has therefore become an important issue. This paper first analyzes the schema characteristics of query interfaces and the attributes common to interfaces in the same domain, and gives a new representation of a query interface. It then defines "Form term" and "Function term" and, based on these two definitions, proposes a new similarity computing algorithm, literal and semantic based similarity computing (LSSC). Second, it gives a clustering algorithm for Deep Web query interfaces, LSSC-NQ, which combines LSSC with the NQ algorithm. Finally, experiments show that the algorithm computes similarity accurately and clusters query interfaces efficiently, reliably and quickly.
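
The abstract does not spell out how LSSC or NQ work, so the following Python sketch is not the authors' method. It only illustrates the general idea of scoring two query interfaces by combining a literal (string-level) score with a semantic (synonym-level) score over their attribute labels, and then grouping interfaces whose score passes a threshold. The synonym table, the function names, the greedy grouping step and the 0.6 threshold are all assumptions made for illustration.

```python
from __future__ import annotations
from difflib import SequenceMatcher

# Hypothetical synonym groups standing in for a semantic resource;
# they are NOT taken from the paper.
SYNONYMS = [
    {"author", "writer"},
    {"title", "name"},
    {"price", "cost"},
]

def literal_sim(a: str, b: str) -> float:
    """Character-level similarity of two labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def semantic_sim(a: str, b: str) -> float:
    """1.0 if the labels are equal or share a synonym group, else 0.0."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return 1.0 if any(a in g and b in g for g in SYNONYMS) else 0.0

def label_sim(a: str, b: str) -> float:
    """Take the stronger of the literal and semantic evidence (an assumption)."""
    return max(literal_sim(a, b), semantic_sim(a, b))

def interface_sim(x: list[str], y: list[str]) -> float:
    """Symmetric average of each label's best match in the other interface."""
    if not x or not y:
        return 0.0
    xy = sum(max(label_sim(a, b) for b in y) for a in x) / len(x)
    yx = sum(max(label_sim(a, b) for a in x) for b in y) / len(y)
    return (xy + yx) / 2

def cluster(interfaces: dict[str, list[str]], threshold: float = 0.6) -> list[list[str]]:
    """Greedy one-pass grouping: an interface joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster.
    This is a simple stand-in, not the NQ algorithm."""
    clusters: list[list[str]] = []
    for name, labels in interfaces.items():
        for members in clusters:
            if interface_sim(labels, interfaces[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters

if __name__ == "__main__":
    demo = {
        "books_a": ["title", "author", "price"],
        "books_b": ["name", "writer", "cost"],
        "flights_a": ["departure", "destination", "date"],
    }
    print(cluster(demo))
```

In this toy input the two book interfaces share no label literally, yet every label has a synonym on the other side, so they end up in one group while the flight interface stays apart. A purely literal measure would miss that match, which is the motivation for mixing both kinds of evidence.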