{"title":"Personalizing a Web site for cellular phones","authors":"A. Kobayashi, H. Fujioka","doi":"10.1109/WI.2003.1241233","DOIUrl":"https://doi.org/10.1109/WI.2003.1241233","url":null,"abstract":"We propose a new methodology for personalizing a Web site with access control, adapted for cellular phones. We realize it in three steps. First, we classify all Web pages as either link pages, which are intended to link to other pages, or data pages, which are intended to offer a service. We grant access privileges to particular data pages. Second, we eliminate redundant links by calculating the shortest path from a home page to the data pages. Third, we remove superfluous link pages from the Web site by merging link pages based on the similarity between them. The resulting personalized Web site makes it easier for a user to access the data pages offering the requested service.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114406267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An automated management tool for unstructured data","authors":"M. Ceglowski, A. Coburn, J. Cuadrado","doi":"10.1109/WI.2003.1241266","DOIUrl":"https://doi.org/10.1109/WI.2003.1241266","url":null,"abstract":"The rapidly growing quantity of online data has created a need for automated, content-based categorization and search tools. We describe an open-source, Web-based archive management, which uses latent semantic indexing, coupled with vector clustering techniques, to provide users with a fully searchable and automatically categorized interface to a data collection. The default English document parser included in the project uses part-of-speech tagging and recursive maximal noun phrase extraction to create a more effective term list for LSI than traditional stop list techniques. The archive interface supports multiple user views of the data collection. Advanced search features are implemented through relevance feedback, and do not require users to learn a query syntax.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114848597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WebScout: support for revisitation of Web pages within a navigation session","authors":"Natasa Milic-Frayling, Ralph Sommerer, K. Rodden","doi":"10.1109/WI.2003.1241297","DOIUrl":"https://doi.org/10.1109/WI.2003.1241297","url":null,"abstract":"WebScout is a system that creates a personal archive of Web pages seen by the user and a rich record of the user's navigation, including various types of user and system generated annotations. We explore how this rich archive can be used to provide support for user navigation, in particular, for revisitation of pages within a navigation session. We describe the WebScout SessionNavigator feature that enhances the current browser functionality by providing both sequential and graph representation of the user navigation. It introduces the concept of a WebTrail which designates a sequence of navigation steps, started by a particular event, such as search, or explicit specification of a URL by typing into the address bar, or executing a link from a bookmark list. We present details of a user study that explores how users perceive and remember their navigation on the Web.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124468921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using psychological word database in Web search","authors":"H. Takeuchi, M. Kitajima, Haruhiko Urokobara","doi":"10.1109/WI.2003.1241243","DOIUrl":"https://doi.org/10.1109/WI.2003.1241243","url":null,"abstract":"We propose a new approach for indexing Web contents. The essence of our approach is that we use a psycho-linguistic word database for calculating the index of Web pages. Since there is no existing database designed for this purpose, we started our study by creating a word database. We will show that the database can be effectively used for estimating the reading levels of specific Web pages. We will also show that this approach can be used to reflect user profiles in Web searches.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124630564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A neural network based approach to automated e-mail classification","authors":"J. Clark, I. Koprinska, Josiah Poon","doi":"10.1109/WI.2003.1241300","DOIUrl":"https://doi.org/10.1109/WI.2003.1241300","url":null,"abstract":"We present a neural network based system for automated e-mail filing into folders and anti-spam filtering. The experiments show that it is more accurate than several other techniques. We also investigate the effects of various feature selection, weighting and normalization methods, and also the portability of the anti-spam filter across different users.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129379484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web based collection selection using singular value decomposition","authors":"Johnny King, Yuefeng Li","doi":"10.1109/WI.2003.1241180","DOIUrl":"https://doi.org/10.1109/WI.2003.1241180","url":null,"abstract":"As the number of electronic data collections available on the Internet increases, so does the difficulty of finding the right collection for a given query. Often the first-time user is overwhelmed by the array of options available, and wastes time hunting through pages of collection names, followed by time reading results pages after doing an ad-hoc search. Automatic collection selection methods try to solve this problem by suggesting the best subset of collections to search based on a query. This is of importance to fields containing a large number of electronic collections, which undergo frequent change, and collections that cannot be fully indexed using traditional methods such as spiders. We present a solution to this problem of selecting the best collections and reducing the number of collections needing to be searched. Preliminary tests of the system, conducted on Web search engines, suggest that this solves much of the Web-based collection selection problem.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129293662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web-based information retrieval support systems: building research tools for scientists in the new information age","authors":"Jingtao Yao, Yiyu Yao","doi":"10.1109/WI.2003.1241270","DOIUrl":"https://doi.org/10.1109/WI.2003.1241270","url":null,"abstract":"The concept of Web-based information retrieval support systems (WIRSS) is introduced. The needs for WIRSS are shown by a detailed case study of existing research article indexing and citation analysis systems, such as current content, DBLP, science citation index and CiteSeer. The objective of WIRSS is to build new and effective research tools for scientists to access, explore and use information on the Web, which may lead to improved research productivity and quality.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130791168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Querying and clustering Web pages about persons and organizations","authors":"Shiren Ye, Tat-Seng Chua, Jeremy R. Kei","doi":"10.1109/WI.2003.1241214","DOIUrl":"https://doi.org/10.1109/WI.2003.1241214","url":null,"abstract":"One of the most frequent Web surfing tasks is to search for names of persons and organizations. Such names are often not distinctive, commonly occurring, and nonunique. Thus, a single name may be mapped to several entities. We describe a methodology to cluster the Web pages returned by the search engine so that pages belonging to different entities are clustered into different groups. The algorithm uses a combination of named entities, link-based and structure-based information as features to partition the document set into direct and indirect pages using a decision model. It then uses the distinct direct pages as seeds to cluster the document set into different clusters. The algorithm has been found to be effective for Web-based applications.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130805163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two-phase Web site classification based on hidden Markov tree models","authors":"Yonghong Tian, Tiejun Huang, Wen Gao","doi":"10.1109/WI.2003.1241198","DOIUrl":"https://doi.org/10.1109/WI.2003.1241198","url":null,"abstract":"With the exponential growth of both the amount and diversity of the information that the Web encompasses, automatic classification of topic-specific Web sites is highly desirable. We propose a novel approach for Web site classification based on the content, structure and context information of Web sites. In our approach, the site structure is represented as a two-layered tree in which each page is modeled as a DOM (document object model) tree and a site tree is used to hierarchically link all pages within the site. Two context models are presented to capture the topic dependences in the site. Then the hidden Markov tree (HMT) model is utilized as the statistical model of the site tree and the DOM tree, and an HMT-based classifier is presented for their classification. Moreover, for reducing the download size of Web sites but still keeping high classification accuracy, an entropy-based approach is introduced to dynamically prune the site trees. On these bases, we employ the two-phase classification system for classifying Web sites through a fine-to-coarse recursion. The experiments show our approach is able to offer high accuracy and efficient process performance.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130941974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web-agents inspired by ethology: a population of \"ant\"-like agents to help finding user-oriented information","authors":"A. Revel","doi":"10.1109/WI.2003.1241245","DOIUrl":"https://doi.org/10.1109/WI.2003.1241245","url":null,"abstract":"We present a Web-search ant-agent inspired by ethology and robotics. We detail its implementation on a set of FIPAOS platforms and show useful results in route finding and rerouting. Finally, we discuss its interest and drawbacks in comparison with classical search engines and give perspectives to overcome them.","PeriodicalId":403574,"journal":{"name":"Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126589707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}