2014 IEEE 30th International Conference on Data Engineering Workshops: Latest Articles

Reconciling malware labeling discrepancy via consensus learning
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818308
Ting Wang, Xin Hu, S. Meng, R. Sailer
{"title":"Reconciling malware labeling discrepancy via consensus learning","authors":"Ting Wang, Xin Hu, S. Meng, R. Sailer","doi":"10.1109/ICDEW.2014.6818308","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818308","url":null,"abstract":"Anti-virus systems developed by different vendors often demonstrate strong discrepancy in the labels they assign to given malware, which significantly hinders threat intelligence sharing. The key challenge of addressing this discrepancy stems from the difficulty of re-standardizing already-in-use systems. In this paper we explore a non-intrusive alternative. We propose to leverage the correlation between the malware labels of different anti-virus systems to create a “consensus” classification system, through which different systems can share information without modifying their own labeling conventions. To this end, we present a novel classification integration framework Latin which exploits the correspondence between participating anti-virus systems as reflected in heterogeneous information at instance-instance, instance-class, and class-class levels. We provide results from extensive experimental studies using real datasets and concrete use cases to verify the efficacy of Latin in reconciling the malware labeling discrepancy.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"60 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123230335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Balloon Fusion: SPARQL rewriting based on unified co-reference information
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818335
K. Schlegel, F. Stegmaier, Sebastian P. Bayerl, M. Granitzer, H. Kosch
{"title":"Balloon Fusion: SPARQL rewriting based on unified co-reference information","authors":"K. Schlegel, F. Stegmaier, Sebastian P. Bayerl, M. Granitzer, H. Kosch","doi":"10.1109/ICDEW.2014.6818335","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818335","url":null,"abstract":"While Linked Open Data showed enormous increase in volume, yet there is no single point of access for querying the over 200 SPARQL repositories. In this paper we present Balloon Fusion, a SPARQL 1.1 rewriting and query federation service build on crawling and consolidating co-reference relationships in over 100 reachable Linked Data SPARQL Endpoints. The results of this process are 17.6M co-reference statements that have been clustered to 8.4M distinct semantic entities and are now accessible as download for further analysis. The proposed SPARQL rewriting performs a substitution of all URI occurrences with their synonyms combined with an automatic endpoint selection based on URI origin for a comprehensive query federation. While we show the technical feasibility, we also critically reflect the current status of the Linked Open Data cloud: although it is huge in size, access via SPARQL Endpoints is complicated in most cases due to missing quality of service.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115549774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
BIIIG: Enabling business intelligence with integrated instance graphs
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818294
André Petermann, Martin Junghanns, R. Müller, E. Rahm
{"title":"BIIIG: Enabling business intelligence with integrated instance graphs","authors":"André Petermann, Martin Junghanns, R. Müller, E. Rahm","doi":"10.1109/ICDEW.2014.6818294","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818294","url":null,"abstract":"We propose a new graph-based framework for business intelligence called BIIIG supporting the flexible evaluation of relationships between data instances. It builds on the broad availability of interconnected objects in existing business information systems. Our approach extracts such interconnected data from multiple sources and integrates them into an integrated instance graph. To support specific analytic goals, we extract subgraphs from this integrated instance graph representing executed business activities with all their data traces and involved master data. We provide an overview of the BIIIG approach and describe its main steps. We also present initial results from an evaluation with real ERP data.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130544812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
Trip similarity computation for context-aware travel recommendation exploiting geotagged photos
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818350
Zhenxing Xu
{"title":"Trip similarity computation for context-aware travel recommendation exploiting geotagged photos","authors":"Zhenxing Xu","doi":"10.1109/ICDEW.2014.6818350","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818350","url":null,"abstract":"The popularity of GPS-enabled digital cameras, smart phones, and photo sharing web sites, e.g. Flickr and Panoramio, has led to huge volumes of community-contributed geotagged photos (CCGPs) available on the Internet, which could be regarded as digital footprints of photo takers. In this paper, we propose a method to make context-aware and trip similarity based travel recommendations by mining CCGPs. We obtain user-specific travel preferences from the travel history of user in one city, and use these to recommend tourist locations in another city. The season and weather context are considered during the mining and the recommendation processes. The similarity of users is computed by the modified longest common subsequence and a user-location graph is built from their travel histories in one city, which is then exploited to make travel recommendations. Our method is evaluated on a Flickr dataset, which contains photos taken in four cities of China. Experimental results show the effectiveness of the proposed method.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123389776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Bootstrapping Wikipedia to answer ambiguous person name queries
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818303
Toni Grütze, G. Kasneci, Zhe Zuo, Felix Naumann
{"title":"Bootstrapping Wikipedia to answer ambiguous person name queries","authors":"Toni Grütze, G. Kasneci, Zhe Zuo, Felix Naumann","doi":"10.1109/ICDEW.2014.6818303","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818303","url":null,"abstract":"Some of the main ranking features of today's search engines reflect result popularity and are based on ranking models, such as PageRank, implicit feedback aggregation, and more. While such features yield satisfactory results for a wide range of queries, they aggravate the problem of search for ambiguous entities: Searching for a person yields satisfactory results only if the person in question is represented by a high-ranked Web page and all required information are contained in this page. Otherwise, the user has to either reformulate/refine the query or manually inspect low-ranked results to find the person in question. A possible approach to solve this problem is to cluster the results, so that each cluster represents one of the persons occurring in the answer set. However clustering search results has proven to be a difficult endeavor by itself, where the clusters are typically of moderate quality. A wealth of useful information about persons occurs in Web 2.0 platforms, such as Wikipedia, LinkedIn, Facebook, etc. Being human-generated, the information on these platforms is clean, focused, and already disambiguated. We show that when searching with ambiguous person names the information from Wikipedia can be bootstrapped to group the results according to the individuals occurring in them. We have evaluated our methods on a hand-labeled dataset of around 5,000 Web pages retrieved from Google queries on 50 ambiguous person names.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133989306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Execution and optimization of continuous windowed aggregation queries
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818345
Harold Lim, S. Babu
{"title":"Execution and optimization of continuous windowed aggregation queries","authors":"Harold Lim, S. Babu","doi":"10.1109/ICDEW.2014.6818345","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818345","url":null,"abstract":"The desire of companies to analyze web-site activity data quickly in order to show personalized content and advertisements to users has led to renewed interest in continuous query processing. One important query class here is windowed aggregation which does time-based windowing followed by grouping and aggregation over a data stream. An example query may aggregate each user's activity over a recent one hour window, and update the result every five minutes. In this paper, we characterize the rich execution plan space for windowed aggregation queries. No such attempt has been made previously to the best of our knowledge. Our second contribution is in developing a cost-based optimizer to pick a good plan from this space for a given query. Finally, we show the effectiveness of the cost-based optimizer.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124670975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Analysis and detection of low quality information in social networks
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818354
De Wang
{"title":"Analysis and detection of low quality information in social networks","authors":"De Wang","doi":"10.1109/ICDEW.2014.6818354","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818354","url":null,"abstract":"With social networks like Facebook, Twitter and Google+ attracting audiences of millions of users, they have been an important communication platform in daily life. This in turn attracts malicious users to the social networks as well, causing an increase in the incidence of low quality information. Low quality information such as spam and rumors is a nuisance to people and hinders them from consuming information that is pertinent to them or that they are looking for. Although individual social networks are capable of filtering a significant amount of low quality information they receive, they usually require large amounts of resources (e.g, personnel) and incur a delay before detecting new types of low quality information. Also the evolution of various low quality information posts lots of challenges to defensive techniques. My PhD thesis work focuses on the analysis and detection of low quality information in social networks. We introduce social spam analytics and detection framework SPADE across multiple social networks showing the efficiency and flexibility of cross-domain classification and associative classification. For evolutionary study of low quality information, we present the results on large-scale study on Web spam and email spam over a long period of time. Furthermore, we provide activity-based detection approaches to filter out low quality information in social networks: click traffic analysis of short URL spam, behavior analysis of URL spam and information diffusion analysis of rumor. Our framework and detection techniques show promising results in analyzing and detecting low quality information in social networks.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133367849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
A hashtags dictionary from crowdsourced definitions
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818300
Mérième Ghenname, Julien Subercaze, C. Gravier, F. Laforest, Mounia Abik, R. Ajhoun
{"title":"A hashtags dictionary from crowdsourced definitions","authors":"Mérième Ghenname, Julien Subercaze, C. Gravier, F. Laforest, Mounia Abik, R. Ajhoun","doi":"10.1109/ICDEW.2014.6818300","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818300","url":null,"abstract":"Hashtags are user-defined terms used on the Web to tag messages like microposts, as featured on Twitter. Because a hashtag is a textual word, its representation does not convey all the concepts it embodies. Several online dictionaries have been manually and collaboratively built to provide natural language definitions of hashtags. Unfortunately, these dictionaries in their rough form are inefficient for their inclusion in automatic text processing systems. As hashtags can be polysemic, dictionaries are also agnostic to collision of hashtags. This paper presents our approach for the automatic structuration of hashtags definitions into synonym rings. We present the output as a so-called folksionary, i.e. a single integrated dictionary built from everybody's definitions. For this purpose, we achieved a semantic-relatedness clustering to group definitions that share the same meaning.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131237138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Semantic management of Enterprise Integration Patterns: A use case in Smart Grids
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818302
O. Patri, A. Panangadan, V. Sorathia, V. Prasanna
{"title":"Semantic management of Enterprise Integration Patterns: A use case in Smart Grids","authors":"O. Patri, A. Panangadan, V. Sorathia, V. Prasanna","doi":"10.1109/ICDEW.2014.6818302","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818302","url":null,"abstract":"Enterprise Integration Patterns are a set of design patterns for linking multiple systems using asynchronous messaging interfaces. This approach to system integration is increasingly popular due to its relatively simple loose coupling requirement. Implementations of these patterns are available in current integration frameworks but these are not semantic in nature. This paper introduces the concept of automatic management of messaging resources in an integration application via the use of a semantic representation of the Enterprise Integration Patterns. We have developed semantic representations of some of the commonly used integration patterns, which include a description of the expected resource requirements for each pattern. We then demonstrate this approach by considering the design of an application to connect mobile customers to Smart Power Grid companies (for the purpose of near real-time regulation of electricity usage). We illustrate potential savings in messaging resources and automatic lifecycle management using real-world sensor data collected in a Smart Grid project.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129304323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Neighbor-base similarity matching for graphs
2014 IEEE 30th International Conference on Data Engineering Workshops Pub Date : 2014-05-19 DOI: 10.1109/ICDEW.2014.6818326
Hang Zhang, Hongzhi Wang, Jianzhong Li, Hong Gao
{"title":"Neighbor-base similarity matching for graphs","authors":"Hang Zhang, Hongzhi Wang, Jianzhong Li, Hong Gao","doi":"10.1109/ICDEW.2014.6818326","DOIUrl":"https://doi.org/10.1109/ICDEW.2014.6818326","url":null,"abstract":"The rapid development of internet and data centers has made cloud data management a major issue in database management system. Various cloud data management related applications require the basic operation of graph pattern matching. Exact matching method for graph pattern matching is too restrictive and it incurs very high computational cost as an NP-complete problem. Thus, it cannot be applied to most cloud applications. So several approximate notions are proposed. However, traditional approximate matching methods are still too restrictive in some situations, and some of them may neglect important nodes in the pattern. To address these problems, we propose a novel notion for graph pattern matching, and show that it can be processed in polynomial time. In addition, our method is flexible, free of thresholds and does not leave out any node in the pattern.","PeriodicalId":302600,"journal":{"name":"2014 IEEE 30th International Conference on Data Engineering Workshops","volume":"125 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117340896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0