{"title":"An EM based training algorithm for cross-language text categorization","authors":"Leonardo Rigutini, Marco Maggini, B. Liu","doi":"10.1109/WI.2005.29","DOIUrl":"https://doi.org/10.1109/WI.2005.29","url":null,"abstract":"Due to the globalization on the Web, many companies and institutions need to efficiently organize and search repositories containing multilingual documents. The management of these heterogeneous text collections increases the costs significantly because experts of different languages are required to organize these collections. Cross-language text categorization can provide techniques to extend existing automatic classification systems in one language to new languages without requiring additional intervention of human experts. In this paper, we propose a learning algorithm based on the EM scheme which can be used to train text classifiers in a multilingual environment. In particular, in the proposed approach, we assume that a predefined category set and a collection of labeled training data is available for a given language L1. A classifier for a different language L2 is trained by translating the available labeled training set for L1 to L2 and by using an additional set of unlabeled documents from L2. This technique allows us to extract correct statistical properties of the language L2 which are not completely available in automatically translated examples, because of the different characteristics of language L1 and of the approximation of the translation process. Our experimental results show that the performance of the proposed method is very promising when applied on a test document set extracted from newsgroups in English and Italian.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121081129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Managing ontology changes on the semantic Web","authors":"Delia Rogozan, G. Paquette","doi":"10.1109/WI.2005.92","DOIUrl":"https://doi.org/10.1109/WI.2005.92","url":null,"abstract":"Although the ontology evolution plays a key role in the semantic Web, methods and tools to support it are missing. Thus, this paper proposes a component-based framework for managing ontology changes. The main functionalities of the OntoAnalyzer framework are: (1) to track changes and to formalize them using a language that we propose for representing ontology changes and (2) to identify changes a posteriori to ontology evolution and to analyze their effect on the ontology-based annotation of resources.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121146849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a tuplespace-based middleware for the semantic Web","authors":"R. Tolksdorf, L. Nixon, E. Simperl","doi":"10.1109/WI.2005.150","DOIUrl":"https://doi.org/10.1109/WI.2005.150","url":null,"abstract":"The realization of the semantic Web needs a set of specialized middleware as its infrastructure. In this paper we describe the principles of tuplespace computing, explain why tuplespaces are a suitable middleware for the semantic Web, envision \"semantic Web spaces\", and outline how our tuplespace platform XMLSpaces can be extended to support semantic Web technologies, like RDF(S) and OWL.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126077874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pseudo-relevance feedback in Web information retrieval using segments' subjective importance values","authors":"S. Y. Yoo, A. Hoffmann","doi":"10.1109/WI.2005.122","DOIUrl":"https://doi.org/10.1109/WI.2005.122","url":null,"abstract":"To make Web search more effective, we address the problem of articulating a user's information needs more effectively. This is done in an iterative way, by allowing the user to provide relevance feedback regarding individual segments of retrieved Web-pages. Previously applied methods are limited to discovering 'general importance values of segments' (based on the authors' 'objective views' i.e., main topics) rather than 'subjective importance values of segments' (based on a user's 'subjective view' i.e., personal information needs). In this paper, a user's interests are incrementally identified by allowing the user to iteratively select relevant keywords or phrases from a set of system-recommended candidate-keywords and candidate-phrases (i.e., pseudo-relevance feedback). It makes it possible to discover 'subjective importance values of segments' that can be dynamically changed by the user by indicating their interests regarding retrieved Web-pages. The important segments, selected by the user, provide higher precision of pseudo-relevance feedback for further Web information retrieval purposes.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116056190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Architecture for automated annotation and ontology based querying of semantic Web resources","authors":"Brooke Abrahams, W. Dai","doi":"10.1109/WI.2005.34","DOIUrl":"https://doi.org/10.1109/WI.2005.34","url":null,"abstract":"The semantic Web provides the foundation for semantic architecture to support the transparent exchange of information and knowledge among collaborating e-business organizations. Recent advances in semantic Web based technologies offer means for organizations to exchange knowledge in a meaningful way (R. Singh et al., 2005). In spite of these developments, major challenges remain for developers of semantic Web applications, such as the availability of semantic Web content (V.R. Benjamins et al., 2004) and ontology based information retrieval (A. Chebotko et al., 2004). In this paper, an architecture aimed at addressing these issues is presented. An easy to use annotation tool is deployed, providing a convenient mechanism for Web site owners to mark up their Web pages with RDF metadata. Search and coordination activities are carried out by a system of multiagents designed for such environments. The architecture is demonstrated in the accommodation services domain of the Australian Tourism industry.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114425565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A method of Web search result clustering based on rough sets","authors":"Chi Lang Ngo, H. Nguyen","doi":"10.1109/WI.2005.7","DOIUrl":"https://doi.org/10.1109/WI.2005.7","url":null,"abstract":"Due to the enormous size of the Web and low precision of user queries, finding the right information from the Web can be difficult if not impossible. One approach that tries to solve this problem is using clustering techniques for grouping similar documents together in order to facilitate presentation of results in a more compact form and enable thematic browsing of the results set. The main problem of many Web search result (snippet) clustering algorithms is the poor vector representation of snippets. In this paper, we present a method of snippet representation enrichment using the tolerance rough set model. We applied the proposed method to construct a rough set based search result clustering algorithm and compared it with other recent methods.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117007995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Limitations of community Web portals: a classmates' case study","authors":"A. Fensel, D. Fensel","doi":"10.1109/WI.2005.91","DOIUrl":"https://doi.org/10.1109/WI.2005.91","url":null,"abstract":"We analyze typical Web portals supporting communication, data sharing and activities of former classmates. The inflexibility and restrictions imposed on users of such portals are demonstrated to support the thesis that introduction of community-driven ontology management is crucial for full-fledged satisfaction of the user needs on the semantic Web.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129442349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining emerging patterns and classification in data streams","authors":"Hamad Alhammady, K. Ramamohanarao","doi":"10.1109/WI.2005.96","DOIUrl":"https://doi.org/10.1109/WI.2005.96","url":null,"abstract":"A data stream model has been proposed recently for those data intensive applications such as financial applications, manufacturing, and others (Babcock et al., 2002). In this model, data arrives in multiple, continuous, rapid, time-varying data streams. These characteristics make it infeasible for traditional classification and mining techniques to deal with data streams. In this paper, we propose a novel method for mining emerging patterns (EPs) in data streams. Moreover, we show how these EPs can be used to classify data streams. EPs (Dong and Li, 1999) are those itemsets whose supports in one class are significantly higher than their supports in the other classes. The experimental evaluation shows that our proposed method can achieve up to 10% increase in accuracy compared to the other methods.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"213 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130331757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovering and visualizing temporal-based Web access behavior","authors":"Baoyao Zhou, S. Hui, A. Fong","doi":"10.1109/WI.2005.55","DOIUrl":"https://doi.org/10.1109/WI.2005.55","url":null,"abstract":"Discovering and understanding Web users' surfing behavior are essential for the development of successful Web monitoring and recommendation systems. In this paper, we propose a Web usage mining approach for the automatic discovery and visualization of temporal-based Web access behavior of individual users by mining client-side logs. The proposed approach is based on a Web usage lattice model which represents a hierarchy of Web access activities. To describe such Web access activities, we incorporate fuzzy logic to represent real life temporal concepts such as morning, afternoon and evening, and meaningful Web categories such as news, sports and chat. Based on the lattice, temporal and association behavior patterns can be extracted and visualized.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126982110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Community-driven ontology management: DERI case study","authors":"A. Fensel, Reto Krummenacher, J. Henke, D. Fensel","doi":"10.1109/WI.2005.49","DOIUrl":"https://doi.org/10.1109/WI.2005.49","url":null,"abstract":"We introduce the concept of community-driven ontology management and demonstrate the added value to conventional ontology management of being community-driven. Further, we present an implementation of an infrastructure supporting community-driven ontology management. The implemented infrastructure was deployed as a part of the intranet at DERI - Digital Enterprise Research Institute, and the community's response and behavior were observed. The results obtained prove feasibility and advantages of community-driven ontology management.","PeriodicalId":213856,"journal":{"name":"The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123652003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}