KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254739
Yufei Tao
{"title":"Theoretical results on keyword search and related problems","authors":"Yufei Tao","doi":"10.1145/2254736.2254739","DOIUrl":"https://doi.org/10.1145/2254736.2254739","url":null,"abstract":"Keyword search has been extensively studied in the database area in the past decade. Indeed, scientists in this community have produced highly efficient systems to support keyword search in various applications, e.g., those on relational data, XML documents, spatial objects, to name only a few. Just like every other science subject, however, system work is no substitute for theoretical studies that aim at understanding the nature and hardness of keyword search, as well as their inter-connections to other relevant problems. In this talk, we will discuss theoretical results on several keyword search problems. As we will see, not surprisingly, progress in the theory field has lagged significantly behind the system frontier. The talk will also present some interesting open problems.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121160789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254742
Rajvardhan Patil, Zhengxin Chen
{"title":"STRUCT: incorporating contextual information for English query search on relational databases","authors":"Rajvardhan Patil, Zhengxin Chen","doi":"10.1145/2254736.2254742","DOIUrl":"https://doi.org/10.1145/2254736.2254742","url":null,"abstract":"Research on keyword search in the database community has achieved a lot of success, and areas of interest have moved from keyword search in relational databases to various advanced issues such as keyword search in multimedia data and data streams. Yet, many fundamental issues on keyword search in traditional databases remain. One such issue is how to interpret users' information needs behind the keywords they provided. A common approach of many prototype systems is to make such interpretation a designer's choice (such as imposing AND or OR semantics, or a combination), leaving no choice to the users. A much more meaningful approach would be allowing users themselves to specify the required semantics through contextual information. So can we build a system which stays with the simplicity of keyword search, yet can incorporate the contextual information provided in the user query? In this paper we introduce STRUCT to explore this idea. STRUCT takes English language queries involving intended keywords; we refer to such search as English query search. Instead of resorting to full-fledged natural language processing, the unneeded words in the queries are discarded. Only the specific contextual information along with the keywords containing database contents will be used to construct SQL queries. The contextual information is used to interpret the meaning of the queries, including the semantics involving AND, OR and NOT.\nIn this paper we describe the architecture of STRUCT, the procedure of English query processing (parsing), the basic idea of the grouping algorithm, SQL query construction and sample results of experiments.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115939656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254746
H. Azzam, Sirvan Yahyaei, Marco Bonzanini, T. Roelleke
{"title":"A schema-driven approach for knowledge-oriented retrieval and query formulation","authors":"H. Azzam, Sirvan Yahyaei, Marco Bonzanini, T. Roelleke","doi":"10.1145/2254736.2254746","DOIUrl":"https://doi.org/10.1145/2254736.2254746","url":null,"abstract":"In order to search across factual knowledge and content explicated using different data formats this paper leverages a generic data model (schema) that transforms keyword-based retrieval models and queries to knowledge-oriented models and semantically-expressive queries. As each of the transformed retrieval models capitalises on a specific evidence space (term, classification, relationship and attribute), we demonstrate two possible combinations of these spaces, namely macro-based or micro-based. For bare keyword-based queries we demonstrate how the data model can be used to augment the queries with classifications, relationships, etc. that reflect the underlying constraints and objects found in the heterogeneous knowledge bases. Using the IMDb benchmark the results demonstrate the feasibility and effectiveness of the instantiated retrieval models and the query reformulation process.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"26 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130601442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254749
Aggeliki Dimitriou, D. Theodoratos
{"title":"Efficient keyword search on large tree structured datasets","authors":"Aggeliki Dimitriou, D. Theodoratos","doi":"10.1145/2254736.2254749","DOIUrl":"https://doi.org/10.1145/2254736.2254749","url":null,"abstract":"Keyword search is the most popular paradigm for querying XML data on the web. In this context, three challenging problems are (a) to avoid missing useful results in the answer set, (b) to rank the results with respect to some relevance criterion and (c) to design algorithms that can efficiently compute the results on large datasets.\nIn this paper, we present a novel multi-stack based algorithm that returns as an answer to a keyword query all the results ranked on their size. Our algorithm exploits a lattice of stacks each corresponding to a partition of the keyword set of the query. This feature empowers a linear time performance on the size of the input data for a given number of query keywords. As a result, our algorithm can run efficiently on large input data for several keywords. We also present a variation of our algorithm which accounts for infrequent keywords in the query and show that it can significantly improve the execution time. An extensive experimental evaluation of our approach confirms the theoretical analysis, and shows that it scales smoothly when the size of the input data and the number of input keywords increases.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116538449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254741
Zhong Zeng, Z. Bao, T. Ling, M. Lee
{"title":"iSearch: an interpretation based framework for keyword search in relational databases","authors":"Zhong Zeng, Z. Bao, T. Ling, M. Lee","doi":"10.1145/2254736.2254741","DOIUrl":"https://doi.org/10.1145/2254736.2254741","url":null,"abstract":"Keyword search has become an effective information retrieval method for structured data. Existing works in relational database keyword search have addressed the problems of finding and evaluating candidate results. However, given that keyword queries are inherently ambiguous, it is often the case that candidate results do not match users' search intention. In this paper, we analyze the limitations of current keyword search techniques and introduce the problem of generating and ranking keyword query interpretations. We propose a novel 3-phase keyword search paradigm which consists of: (1) the ability to predict query interpretations; (2) incorporate user feedback to remove keyword ambiguities; (3) a ranking model to evaluate a query interpretation.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126042967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254744
Yong Zeng, Z. Bao, T. Ling, Luochen Li
{"title":"MALEX: a MAp-like exploration model on XML database","authors":"Yong Zeng, Z. Bao, T. Ling, Luochen Li","doi":"10.1145/2254736.2254744","DOIUrl":"https://doi.org/10.1145/2254736.2254744","url":null,"abstract":"Keyword search on XML data has been a hot research issue recently. Towards the ultimate goal of retrieving results that match user's search intention, existing methods keep improving the matching semantics and result retrieval methods. A list of query results in the form of subtrees will be returned to the users. However, we find that the traditional way of returning a list of subtrees to the users is still not sufficient to meet users' information needs because: (1) search intention of a keyword query can be different with different users issuing it; (2) amongst the query results, they may have sibling and containment relationships, which could be important for digesting the query results and should be displayed to the users as well. To address the problem, we propose a new exploration model MALEX to work as a complementary component of the XML keyword search engine, in order to enhance users' search experience. It can even work independently as a new way to explore the XML database.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"71 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114089933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254747
Carlos Garcia-Alvarado, C. Ordonez
{"title":"Integrating and querying source code of programs working on a database","authors":"Carlos Garcia-Alvarado, C. Ordonez","doi":"10.1145/2254736.2254747","DOIUrl":"https://doi.org/10.1145/2254736.2254747","url":null,"abstract":"Programs and a database's schema contain complex data and control dependencies that make it difficult to modify the schema along with multiple portions of the source code. In this paper, we address the problem of exploring and analyzing those dependencies that exist between a program and a database's schema using keyword search techniques inside a database management system (DBMS). As a result, we present QDPC, a novel system that allows the integration and flexible querying within a DBMS of source code and a database's schema. The integration focuses on obtaining the approximate matches that exist between source files (classes, function and variable names) and the database's schema (table names and column names), and then storing them in summarization tables inside a DBMS. These summarization tables are then analyzed with SQL queries to find matches that are related to a set of keywords provided by the user. It is possible to perform additional analysis of the discovered matches by computing aggregations over the obtained matches, and to perform sophisticated analysis by computing OLAP cubes. In our experiments, we show that we obtain an efficient integration and allow complex analysis of the dependencies inside the DBMS. Furthermore, we show that searching for data dependencies and building OLAP cubes can be obtained in an efficient manner.\nOur system opens up the possibility of using the keyword search for software engineering applications.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126329414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254738
Cong Yu
{"title":"Towards a high quality and web-scalable table search engine","authors":"Cong Yu","doi":"10.1145/2254736.2254738","DOIUrl":"https://doi.org/10.1145/2254736.2254738","url":null,"abstract":"For over a decade, a large number of studies have explored efficient mechanisms for finding relevant information from structured sources such as relational, semi-structured, and graph databases. While successful in its own right, keyword search over structured data has yet to gain widespread adoption on the Web, and it is not because of the lack of structured data on the Web. In fact, the Web offers orders of magnitude more structured data than any offline data source: 14 billion tables can be gathered just by considering page content between the table tags. The main reason is that keyword search over structured data on the Web presents a unique set of challenges that are quite different from its non-Web counterparts, as well as different from searching over documents (where the search engines have excelled). In this talk, I will discuss those challenges and our approaches in addressing them at Google's WebTables project. In particular, I will present the table search engine, our initial effort toward building a high quality and scalable structured data search engine, along with our other efforts on managing structured data on the Web.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131141369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KEYS '12 | Pub Date: 2012-05-20 | DOI: 10.1145/2254736.2254748
Sina Fakhraee, F. Fotouhi
{"title":"DBSemSXplorer: semantic-based keyword search system over relational databases for knowledge discovery","authors":"Sina Fakhraee, F. Fotouhi","doi":"10.1145/2254736.2254748","DOIUrl":"https://doi.org/10.1145/2254736.2254748","url":null,"abstract":"Keyword search over relational databases has been broadly studied in recent years. Research works have been done to address both the efficiency and the effectiveness of the keyword search over relational databases. One issue with keyword search in general is its ambiguity which can ultimately impact the effectiveness of the search in terms of the quality of the search results. This ambiguity is primarily due to the ambiguity of the contextual meaning of each term in the query (e.g. each query term can be mapped to different schema terms with the same name or their synonyms). In addition to the query ambiguity itself, the relationships between the keywords in the search results are crucial for the proper interpretation of the search results by the user and should be clearly presented in the search results.\nTo address these issues we have designed and implemented a prototype system DBSemSXplorer which can answer the traditional keyword search over relational databases in a more effective way with a better presentation of search results. We address the keyword search ambiguity issue by adapting some of the existing approaches for keyword mapping from the query terms to the schema terms/instances. The approaches we have adapted for term mapping capture both the syntactic similarity between the query keywords and the schema terms as well as the semantic similarity (e.g. definition of the keywords) of the two and give better mappings and ultimately more accurate results.\nFinally, to address the last issue of lacking clear relationships among the terms appearing in the search results, our system has leveraged semantic web technologies in order to enrich the knowledgebase and to discover the relationships between the keywords.\nOur experiments show that our system is more effective than the traditional keyword search approaches by enabling the users to find the search results which are more relevant to their keyword queries.","PeriodicalId":170987,"journal":{"name":"KEYS '12","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131714910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}