{"title":"An Examination of the Effectiveness of Social Tagging for Resource Discovery","authors":"D. Goh, C. S. Lee, A. Chua, K. Razikin","doi":"10.1109/INGS.2008.11","DOIUrl":"https://doi.org/10.1109/INGS.2008.11","url":null,"abstract":"Social tagging allows users to assign keywords (tags) to resources, facilitating their future access by the tag creator, and possibly by other users. In terms of its support for resource discovery, social tagging has both proponents and critics. This paper investigates whether tags are an effective means for helping users locate useful resources. Adopting techniques from text categorization, we downloaded Web pages and their associated tags from del.icio.us, and trained Support Vector Machine classifiers to determine if the documents could be assigned to their associated tags. Results from the classifiers in terms of precision, recall and F1 score were mixed, suggesting that not all tags could be used by public users for resource discovery. Detailed analyses of our results revealed characteristics of effective and ineffective tags for resource discovery. From these, implications for social tagging systems are discussed.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125214426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Extension of LCA Based XML Keyword Search","authors":"Umaporn Supasitthimethee, Toshiyuki Shimizu, M. Yoshikawa, Kriengkrai Porkaew","doi":"10.1109/INGS.2008.8","DOIUrl":"https://doi.org/10.1109/INGS.2008.8","url":null,"abstract":"One of the most convenient ways to query XML data is keyword search, because it requires neither knowledge of the XML structure nor learning a new user interface. However, a keyword search interface is very flexible. It is hard for a system to decide which node is likely to be chosen as a return node and how much information should be included in the result. To address this challenge, we propose an extension of LCA based XML keyword search. First, to determine a return node, we provide a query syntax with which users can tell the system which node they are really interested in. If users do not explicitly specify return information, our system will automatically analyze and choose appropriate return nodes by inferring from user keywords. Second, to return a meaningful result, we investigate the problem of the return information in the LCA and the proximity search approaches. To this end, we introduce the Lowest Element Node (LEN) and define our simple rules without any requirement on the schema information such as DTD or XML Schema. Our experimental results indicate that our system not only infers the right return nodes but also generates compact and meaningful results.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131128873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese Web Infrastructure Building: Challenges and Our Roadmap","authors":"Weining Qian, Aoying Zhou","doi":"10.1109/INGS.2008.21","DOIUrl":"https://doi.org/10.1109/INGS.2008.21","url":null,"abstract":"With the development of the World-Wide Web, storage and utilization of Web data have become a big challenge to the data management community. Though many commercial and academic tools have emerged, the structure, content, and user behavior of the Chinese Web are not fully studied. We are working on building a Chinese Web Infrastructure to support such research. In this paper, the challenges of building such a system are analyzed, and our technical roadmap is discussed.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125720555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding RkNN Straightforwardly with Large Secondary Storage","authors":"Hanxiong Chen, Rongmao Shi, K. Furuse, N. Ohbo","doi":"10.1109/INGS.2008.12","DOIUrl":"https://doi.org/10.1109/INGS.2008.12","url":null,"abstract":"In this paper, we propose an efficient algorithm for reverse k nearest neighbor (RkNN) search. Given a set V of objects and a query object q, an RkNN query returns a subset of V such that each element of the subset has q as its kNN member according to a certain similarity metric. Early methods pre-compute the NN of each data object and find the RNN. Recent methods introduce an index based on the mutual distance between two objects. Our method can find RkNN for any k straightforwardly with constant running cost. It can be applied to any RkNN search whenever the mutual distance between objects can be computed. It does not even require the triangle inequality. It is also based on pre-computed information, under the assumptions that secondary storage (hard disk drive) is cheap and that current computers are powerful enough that their spare power can be used to update data offline. We evaluate the efficiency and effectiveness of the proposed method.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127379328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualizing Changes in Coordinate Terms over Time: An Example of Mining Repositories of Temporal Data through their Search Interfaces","authors":"H. Ohshima, A. Jatowt, S. Oyama, K. Tanaka","doi":"10.1109/INGS.2008.13","DOIUrl":"https://doi.org/10.1109/INGS.2008.13","url":null,"abstract":"Certain data repositories provide search functionality for temporally ordered data. News archive search or blog search are examples of search interfaces that allow issuing structured queries composed of arbitrary terms and selected time constraints for performing temporal search. However, extracting aggregated knowledge such as detecting the evolution of objects or their relationships through these interfaces is difficult for users. In this paper, we discuss the problem of knowledge extraction and agglomeration from repositories of temporal data. In particular, we propose a method for detecting and visualizing changes in coordinate terms over time based on a news archive.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126912523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extending Keyword Search to Metadata on Relational Databases","authors":"Jiajun Gu, H. Kitagawa","doi":"10.1109/INGS.2008.14","DOIUrl":"https://doi.org/10.1109/INGS.2008.14","url":null,"abstract":"Keyword search is familiar to general users as the most popular information retrieval method, especially for searching on the Web because of its user-friendly way. In recent years various approaches to free-form keyword search over RDBMS have been proposed. They can produce results across multiple tuples in different relations according to a query consisting of a set of keywords. However, they just consider keyword search for values in tuple instances. In fact users have requirements to search keywords which may be part of the metadata of the database such as names of relations or attributes. In this paper, we extend keyword search on relational database. We define a tuple with annotation as an extension concept of a conventional tuple. In addition we add proposed weight to tuples. The weight function also cares about metadata information. We implement the query processing scheme in RDBMS in order to prove the proposed approach.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"5105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123724979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QueReSeek: Community-Based Web Navigation by Reverse Lookup of Search History","authors":"H. Tan, I. Ohmukai, Hideaki Takeda","doi":"10.1109/INGS.2008.18","DOIUrl":"https://doi.org/10.1109/INGS.2008.18","url":null,"abstract":"In this paper, we propose a system called QueReSeek that realizes Web navigation by using search queries in a community. Web navigation is realized as follows: when a user is browsing some Web content, if the Web content is included in the list of results of past searches by people in the community, the query strings used in those searches are shown to the user. To realize this navigation, the system collects queries to search engines and their results, and builds a search query-URL index. Based on this index, it shows relevant queries for the URL of the Web content being browsed. By looking up this database in reverse, it can show query strings related to Web content. Since the search queries in the community are keywords related to information and knowledge of interest within the community, this navigation reflects implicit knowledge in the community. It is useful especially for community members who are not proficient in search. Such users can learn search expertise by following search strings provided by the system. We implemented this proposed method in two ways. We could display relevant queries for approximately 20% of the browsed Web content in this experiment.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128972600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Learning in Web Search","authors":"Hang Li","doi":"10.1109/INGS.2008.10","DOIUrl":"https://doi.org/10.1109/INGS.2008.10","url":null,"abstract":"Search is becoming the major means for people to access information on the Internet. According to a survey, 55% of web users use search engines every day. Web search engines are built with technologies mainly from two areas, namely, large-scale distributed computing and statistical learning. Statistical learning is useful because there are many uncertainties in crawling, indexing, ranking, and serving of Web search and the solutions have to be data-driven. In this talk, I will explain how statistical learning technologies are being used in web search. I will also introduce some of the statistical learning technologies for web search, which we have developed recently at MSRA. They include BrowseRank, ranking refinement, query dependent ranking, and query refinement.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114312012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hypothesis and Verification Based Measurement of Information Literacy","authors":"A. Sumida, Y. Hara","doi":"10.1109/INGS.2008.17","DOIUrl":"https://doi.org/10.1109/INGS.2008.17","url":null,"abstract":"This paper reports the results of research into and analysis of differences in personal information literacy in using Web search. In the recent information explosion era, a huge amount of information is available on the Web, but there may be a large gap in information literacy. We investigated this gap by means of an \"information literacy test\" that we created. This paper reports the relationship between information literacy and each subject's attributes. The result is that women in their 30s show the highest performance. We also classified four types of information retrieval. In addition, this paper proposes the need to provide a new type of search engine for users with low information literacy.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"106 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114106979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Content-Based Video Search: Is there a Need, and Is it Possible?","authors":"Zi Huang, Yijun Li, Jie Shao, Heng Tao Shen, Liping Wang, Danqing Zhang, Xiangmin Zhou, Xiaofang Zhou","doi":"10.1109/INGS.2008.16","DOIUrl":"https://doi.org/10.1109/INGS.2008.16","url":null,"abstract":"There is a large and rapidly increasing amount of video data on the Internet and in personal or organizational collections. Fast and accurate video search emerges to be an important issue. The need and main technical challenges for video retrieval are similar to those for the content-based image retrieval (CBIR) problem. Lack of meaningful and comprehensive text annotation means that an approach based on content similarity can be promising; and the differences between an often high-level search intention and the low-level features used in content-based search techniques suggest that content-based video retrieval (CBVR) may also suffer from \"semantic gap\" issues. In this paper, we analyze the problem of CBVR from related work in the literature as well as some current work in our team, focusing on the relationship between CBIR and CBVR, open yet well-defined research issues and practical applications of CBVR.","PeriodicalId":356148,"journal":{"name":"2008 International Workshop on Information-Explosion and Next Generation Search","volume":"251 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115013349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}