BEWEB '11, Pub Date: 2011-03-25, DOI: 10.1145/1966883.1966891
Lisette García-Moya, Shahad Kudama, M. Cabo, Rafael Berlanga Llavori
{"title":"Integrating web feed opinions into a corporate data warehouse","authors":"Lisette García-Moya, Shahad Kudama, M. Cabo, Rafael Berlanga Llavori","doi":"10.1145/1966883.1966891","DOIUrl":"https://doi.org/10.1145/1966883.1966891","url":null,"abstract":"Web opinion feeds have become one of the most popular information sources users consult before buying products or contracting services. Negative opinions about a product can have a high impact on its sales figures. As a consequence, companies are increasingly concerned about how to integrate this information into their Business Intelligence (BI) models so that they can predict sales figures or define new strategic goals. In this paper, we present an approach to integrate sentiment data extracted from web feeds into the corporate warehouse where company analytical data and models are stored. Such an integration allows users to perform new analysis tasks by using the traditional OLAP-based data warehouse operators. We have developed a case study over a set of real opinions about digital devices which are offered by a wholesaler company. Over this case study, the quality of the extracted sentiment data is evaluated, and some query examples that illustrate the potential uses of the integrated model are presented.","PeriodicalId":238578,"journal":{"name":"BEWEB '11","volume":"122 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132483242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BEWEB '11, Pub Date: 2011-03-25, DOI: 10.1145/1966883.1966890
Byung-Kwon Park, I. Song
{"title":"Toward total business intelligence incorporating structured and unstructured data","authors":"Byung-Kwon Park, I. Song","doi":"10.1145/1966883.1966890","DOIUrl":"https://doi.org/10.1145/1966883.1966890","url":null,"abstract":"As the amount of data inside and outside an enterprise grows very fast, it is becoming important to analyze both seamlessly to obtain total business intelligence. The data can be classified into two categories: structured and unstructured. In particular, as most of the valuable business information is encoded in unstructured text documents, including Web pages on the Internet, we need a specialized Text OLAP solution to perform multi-dimensional analysis on text documents in the same way as on structured relational data. Since text mining and information retrieval are the major technologies for handling text data, we first review representative works selected to demonstrate how they can be applied to Text OLAP. We then survey representative works selected to demonstrate how unstructured text documents and structured relational data can be associated and consolidated to obtain total business intelligence. Finally, we present an architecture for a total business intelligence platform incorporating structured and unstructured data. We expect that the proposed architecture, which integrates information retrieval, text mining, and information extraction technologies together with relational OLAP technologies, will make an effective platform for total business intelligence.","PeriodicalId":238578,"journal":{"name":"BEWEB '11","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132130305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BEWEB '11, Pub Date: 2011-03-25, DOI: 10.1145/1966883.1966889
Alexander Löser, Christoph Nagel, Stephan Pieper, Christoph Boden
{"title":"Self-supervised web search for any-k complete tuples","authors":"Alexander Löser, Christoph Nagel, Stephan Pieper, Christoph Boden","doi":"10.1145/1966883.1966889","DOIUrl":"https://doi.org/10.1145/1966883.1966889","url":null,"abstract":"A common task of Web users is querying structured information from Web pages. In this paper we propose a novel query processor for systematically discovering any-k relations from Web search results with conjunctive queries. The 'any-k' phrase denotes that retrieved tuples are not ranked by the system. For realizing this interesting scenario the query processor transfers a structured query into keyword queries that are submitted to a search engine, forwards search results to relation extractors, and then combines relations into result tuples. Unfortunately, relation extractors may fail to return a relation for a result tuple. We propose a solid information theory-based approach for retrieving missing attribute values of partially retrieved relations. Moreover, user-defined data sources may not return at least k complete result tuples. To solve this problem, we extend the Eddy query processing mechanism [14] for our 'querying the Web' scenario with a continuous, adaptive routing model. The model determines the most promising next incomplete row for returning any-k complete result tuples at any point during the query execution process. We report a thorough experimental evaluation over multiple relation extractors. Our experiments demonstrate that our query processor returns complete result tuples while processing only very few Web pages.","PeriodicalId":238578,"journal":{"name":"BEWEB '11","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125853858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BEWEB '11, Pub Date: 2011-03-25, DOI: 10.1145/1966883.1966893
Katia Vila, A. F. Rodríguez
{"title":"Model-driven restricted-domain adaptation of question answering systems for business intelligence","authors":"Katia Vila, A. F. Rodríguez","doi":"10.1145/1966883.1966893","DOIUrl":"https://doi.org/10.1145/1966883.1966893","url":null,"abstract":"Business Intelligence (BI) applications no longer limit their analysis to structured databases, but they also need to obtain actionable information from unstructured sources (e.g. data from the Web, etc.). Interestingly, Question Answering (QA) systems are good candidates for these purposes, since they allow users to obtain concise answers to questions stated in natural language from a collection of text documents. Traditionally, QA systems include patterns for dealing with a large spectrum of general questions, namely open-domain question answering (ODQA). However, BI users should be aware of asking questions related to a specific activity of the business (e.g. healthcare, agricultural, transportation, etc.). Therefore, adapting ODQA systems to new restricted domains is an increasing necessity for these systems to be precisely used in BI. Unfortunately, research addressing this topic has two main drawbacks: (i) patterns are manually tuned, which requires a huge effort in time and cost, and (ii) tuning of patterns is based on analyzing potential questions to be answered, which is not a realistic situation since, in restricted domains, questions are highly complex and difficult to acquire. To overcome these drawbacks, this paper presents a novel approach based on model-driven development in order to use knowledge resources to automatically and effortlessly adapt patterns of ODQA systems to be useful for restricted-domain BI scenarios.","PeriodicalId":238578,"journal":{"name":"BEWEB '11","volume":"160 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132739111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BEWEB '11, Pub Date: 2011-03-25, DOI: 10.1145/1966883.1966894
Emilian Pascalau
{"title":"Towards TomTom like systems for the web: a novel architecture for browser-based mashups","authors":"Emilian Pascalau","doi":"10.1145/1966883.1966894","DOIUrl":"https://doi.org/10.1145/1966883.1966894","url":null,"abstract":"Business Intelligence (BI) for the new economy requires people to take an active part and carry out BI development tasks themselves from within their browsers. With the great progress of Web 2.0 into the mainstream, the perspective of BI development has widened and a new set of characteristics drives it: new business models, changing customer relationships, software on demand, instant use, and deep architectural impact. Mashups that emerge from cloud computing are available in various vendors' BI environments and allow users (although restricted by the features these systems support) to develop their BI tasks. This paper introduces a new architecture for browser-based mashups that inherits ideas from TomTom systems. This architecture is capable of addressing issues such as business intelligence on demand and instant use, and of offering the same degree of generality as the browser.","PeriodicalId":238578,"journal":{"name":"BEWEB '11","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132721916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BEWEB '11, Pub Date: 2011-03-25, DOI: 10.1145/1966883.1966885
S. Amer-Yahia
{"title":"I am complex: cluster me, don't just rank me","authors":"S. Amer-Yahia","doi":"10.1145/1966883.1966885","DOIUrl":"https://doi.org/10.1145/1966883.1966885","url":null,"abstract":"A large number of online applications are built over high dimensional data. That is the case for shopping where products have several features (e.g., price and color), dating where personal profiles are described using several dimensions (e.g., physical features and political views), and entertainment (e.g., movie genre and director, restaurant ambiance and location). In addition, in some applications, items may be accompanied by qualitative data such as movie and restaurant reviews. The typical way users find items in those applications is by entering a keyword query and receiving a ranked list of relevant results. Ideally, just like in Web search, users would want to spend little time before finding a satisfactory item. In practice, due to the query output size, the high dimensionality of items, and in some cases, the presence of qualitative data, users tend to spend a lot of time trying to understand correlations between item features and item quality. In this talk, I will argue that the 10-blue links experience we are used to in Web search, keywords as input - ranked list as output, is inappropriate when querying and ranking high dimensional data. I will describe two applications: exploring qualitative data and ranked querying of structured data.","PeriodicalId":238578,"journal":{"name":"BEWEB '11","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131300174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}