22nd International Conference on Data Engineering Workshops (ICDEW'06): Latest Papers

Category-based Functional Information Modeling for eChronicles
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.38
Pilho Kim, R. Jain
Abstract: In this paper, a category-based information model is introduced for eChronicles. It features the use of an e-node to represent the identity of information and uses categorized relationships to represent the relations of grouped information sets while preserving their internal data set structures. Our approach separates a data set and its symbolic objects by introducing an e-node between them and merging those pairs through categorical transformation. Our model also supports a functional system representation using functors and natural transformations from category theory to handle complex information processing and the relationships between functions in a canonical way. We demonstrate our theory by converting scattered heterogeneous information into structured data usable by eChronicles. Our focus in this paper is on presenting the theoretical framework that we are developing to represent heterogeneous data in a way that preserves the essential characteristics of the data and of the processes used to extract symbols from the data.
Citations: 0
Clustering Multidimensional Trajectories based on Shape and Velocity
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.39
Y. Yanagisawa, T. Satoh
Abstract: Recently, the analysis of moving objects has become one of the most important technologies used in applications such as GIS, navigation systems, and location-based information systems. Existing geographic analysis approaches are based on the points where each object is located at a certain time. These techniques can extract interesting motion patterns from each moving object, but they cannot extract relative motion patterns from many moving objects. Therefore, to retrieve moving objects whose trajectory shape is similar to that of a given moving object, we propose queries based on the similarity between the shapes of moving object trajectories. We define shape-based similarity queries on trajectories as an extension of similarity queries for time series data, and then we propose a new clustering technique based on similarity that combines the velocities of moving objects with the shapes of their trajectories. Moreover, we show the effectiveness of our proposed clustering method through a performance study using moving rickshaw data.
Citations: 25
Estimating Top N Hosts in Cardinality Using Small Memory Resources
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.56
K. Ishibashi, Tatsuya Mori, R. Kawahara, Yutaka Hirokawa, A. Kobayashi, K. Yamamoto, H. Sakamoto
Abstract: We propose a method to find the N hosts that have the N highest cardinalities, where cardinality is the number of distinct items such as flows, ports, or peer hosts. The method also estimates their cardinalities. Existing algorithms for finding the top N frequent items can be applied directly to find the N hosts that send the N largest numbers of packets in a packet data stream. Finding the hosts that have the N highest cardinalities, however, requires a table of previously seen items for each host in order to check whether the item in an arriving packet is new, which requires a lot of memory. Even if we use existing cardinality estimation methods, we still need to keep cardinality information for each host. In this paper, we use a property of cardinality estimation: the cardinality of the intersection of multiple data sets can be estimated from the cardinality information of each data set. Using this property, we propose an algorithm that does not need to maintain tables for each host, but only for the partitioned addresses of a host, and that estimates the cardinality of a host as the intersection of the cardinalities of its partitioned addresses. We also propose a method to find the top N hosts in cardinality, which are to be monitored to detect anomalous behavior in networks. We evaluate our algorithm using actual backbone traffic data. While the estimation accuracy of our scheme degrades for small cardinalities, for the top 100 hosts the accuracy of our algorithm with 4,096 tables is almost the same as that obtained with tables for every host.
Citations: 6
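The partitioned-address idea in the abstract of "Estimating Top N Hosts in Cardinality Using Small Memory Resources" can be illustrated with a minimal sketch, under stated assumptions. The split of an IPv4 address into two halves, the `split_address` helper, and the `PartitionedCardinality` class are hypothetical names chosen for illustration, and exact Python sets stand in for whatever memory-bounded estimator the paper actually uses. The sketch only shows how an intersection size can be derived from per-partition cardinalities through inclusion-exclusion, |A ∩ B| = |A| + |B| - |A ∪ B|.

```python
from collections import defaultdict


def split_address(host: str) -> tuple[str, str]:
    """Split a dotted IPv4 string into (upper, lower) halves.

    Hypothetical partitioning; the abstract does not say how addresses are split.
    """
    a, b, c, d = host.split(".")
    return f"{a}.{b}", f"{c}.{d}"


class PartitionedCardinality:
    """Keeps one item table per address half instead of one per host."""

    def __init__(self) -> None:
        # Exact sets are a stand-in for a compact cardinality estimator.
        self.upper = defaultdict(set)
        self.lower = defaultdict(set)

    def observe(self, host: str, item: str) -> None:
        """Record that `host` was seen with `item` (e.g. a peer or a port)."""
        up, lo = split_address(host)
        self.upper[up].add(item)
        self.lower[lo].add(item)

    def estimate(self, host: str) -> int:
        """Estimate the host's cardinality as |U ∩ L| = |U| + |L| - |U ∪ L|."""
        up, lo = split_address(host)
        u = self.upper.get(up, set())
        l = self.lower.get(lo, set())
        return len(u) + len(l) - len(u | l)


if __name__ == "__main__":
    pc = PartitionedCardinality()
    traffic = [
        ("10.0.0.1", "p1"), ("10.0.0.1", "p2"), ("10.0.0.1", "p3"),
        ("10.0.0.2", "p3"), ("10.0.0.2", "p4"),
        ("10.1.0.1", "p5"),
    ]
    for host, peer in traffic:
        pc.observe(host, peer)
    print(pc.estimate("10.0.0.1"))  # 3
```

In this exact-set version the estimate is an upper bound on a host's true cardinality, because every item the host produced appears in both of its partition tables, and items from other hosts that share both halves' tables can only add to the intersection.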
Mining Executive Compensation Data from SEC Filings
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.89
Chengmin Ding, Ping Chen
Abstract: In recent years, corporate governance has become an important concern in investment decision-making. As one of the most important factors in evaluating corporate governance, the study of executive compensation has drawn a lot of attention. Most companies with excessive executive pay are linked with scandals or corporate failures. This paper presents a text mining system, ECRS (Executive Compensation Retrieval System), that automatically extracts executive compensation data from SEC (http://www.sec.gov) proxy filings. An analysis based on the extracted data is provided, and some samples of using the raw data to derive useful information for financial analysts are also presented.
Citations: 4
Pragmatics and Open Problems for Inter-schema Constraint Theory
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.111
A. Rosenthal, Leonard J. Seligman
Abstract: We consider pragmatic issues in applying constraint-based theories (such as that developed for data exchange) to a variety of problems. We identify disconnects between theoreticians and tool developers, and propose principles for creating problem units that are appropriate for tools. Our Downstream Principle then explains why automated schema mapping is a prerequisite to transitioning schema-matching prototypes. Next, we compare the concerns, strengths, and weaknesses of database and AI approaches to data exchange. Finally, we discuss how constrained update by business processes is a central difficulty in maintaining n-tier applications, and compare the challenges with those of data exchange and conventional view update.
Citations: 2
A Self-Organizing Search Engine for RSS Syndicated Web Contents
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.19
Ying Zhou, Xin Chen, Chen Wang
Abstract: The exponentially growing body of information published on the Web now reaches the public largely through a few major search engines such as Google. This raises questions such as: (1) what percentage of the content shared over the Internet do these search engines cover? (2) how easy is it to find less popular content on the Web through the page ranking systems of these search engines? In fact, the increasing dynamics of the information distributed on the Internet challenge the flexibility of these centralized search engines. As the amount of structured and semi-structured data on the Internet increases, self-organizing search engines that can provide sufficient coverage for data following certain structures become increasingly attractive. In this paper, we propose soSpace, a self-organizing search engine for RSS syndicated web data. soSpace is built on structured peer-to-peer technology. It enables indexing and searching of frequently updated web information described by RSS feeds. Our experimental results show that it scales well as the content grows. The recall and precision of the results are satisfactory as well.
Citations: 10
Semantic Model to Integrate Biological Resources
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.133
Z. Lacroix, L. Raschid, Maria-Esther Vidal
Abstract: We present a framework that uses semantic modeling to represent biological data sources, the multiple links that capture relationships among them, and the various applications that transform or analyze biological data. We introduce a data model that encompasses three layers: the ontological layer, composed of an ontology that represents the scientific concepts and their relationships; the physical layer, composed of the physical resources made available to scientists; and the data layer, composed of the entries accessible through the different resources.
Citations: 16
Quality Estimation of Local Contents Based on PageRank Values of Web Pages
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.121
Y. Kabutoya, T. Yumoto, S. Oyama, Keishi Tajima, Katsumi Tanaka
Abstract: Recently, searching local content rather than Web content, e.g., with Google Desktop Search, has become more common. Google succeeded in Web search because of its PageRank algorithm for ranking search results. PageRank estimates the quality of Web pages based on their popularity, which in turn is estimated by the number and the quality of the pages referring to them through hyperlinks. This algorithm, however, is not applicable when we search local content without link structure, such as text data. In this research, we propose a method to estimate the quality of local content without link structure by using the PageRank values of Web content similar to it. Based on this estimation, we can rank desktop search results. Furthermore, this method enables us to search content across different resources, such as Web content and local content. In this paper, we applied this method to Web content, calculated the scores that estimate its quality, and compared them with the page quality scores given by PageRank.
Citations: 3
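The idea in the abstract of "Quality Estimation of Local Contents Based on PageRank Values of Web Pages" (scoring a link-free local document through the PageRank values of similar Web pages) can be sketched as follows. This is only an illustration under assumptions: the abstract does not specify the similarity measure or the aggregation rule, so bag-of-words cosine similarity and a similarity-weighted average are swapped in here, and `cosine_similarity` and `estimate_quality` are hypothetical names. The PageRank values are taken as given inputs.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def estimate_quality(local_text: str, web_pages: list[tuple[str, float]]) -> float:
    """Similarity-weighted average of the PageRank values of Web pages.

    `web_pages` holds (page text, PageRank value) pairs. The weighting scheme
    is an assumption made for this sketch, not the paper's formula.
    """
    local_vec = Counter(local_text.lower().split())
    weighted_sum = 0.0
    similarity_sum = 0.0
    for text, pagerank in web_pages:
        sim = cosine_similarity(local_vec, Counter(text.lower().split()))
        weighted_sum += sim * pagerank
        similarity_sum += sim
    # With no similar page at all there is no evidence, so return 0.0.
    return weighted_sum / similarity_sum if similarity_sum else 0.0


if __name__ == "__main__":
    pages = [("data engineering", 0.8), ("cooking recipes", 0.2)]
    print(estimate_quality("data engineering workshop", pages))
```

Here only the first page shares terms with the local document, so the score is driven entirely by that page's PageRank value; pages with zero similarity contribute nothing.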
Toward a Query Language for Network Attack Data
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.149
Bee-Chung Chen, V. Yegneswaran, P. Barford, R. Ramakrishnan
Abstract: The growing sophistication and diversity of malicious activity on the Internet presents a serious challenge for network security analysts. In this paper, we describe our efforts to develop a database and query language for network attack data from firewalls, intrusion detection systems, and honeynets. Our first step toward this objective is to develop a prototype database and query interface to identify coordinated scanning activity in network attack data. We have created a set of aggregate views and templatized SQL queries that consider timing, persistence, targeted services, spatial dispersion, and temporal dispersion, thereby enabling us to evaluate coordinated scanning along these dimensions. We demonstrate the utility of the interface by conducting a case study on a set of firewall and intrusion detection system logs from Dshield.org. We show that the interface is able to identify general characteristics of coordinated activity as well as instances of unusual activity that would otherwise be difficult to mine from the data. These results highlight the potential for developing a more generalized query language for a broad class of network intrusion data. The case study also exposes some of the challenges we face in extending our system to more generalized queries over potentially vast quantities of data.
Citations: 14
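As a rough illustration of what a templatized SQL query over attack logs might look like, in the spirit of the abstract of "Toward a Query Language for Network Attack Data", the sketch below flags time windows in which many distinct sources hit the same destination port. Everything specific here is an assumption: the `events` table, its columns (`ts`, `src`, `dst_port`), the `coordinated_scan_candidates` function, and the thresholds are hypothetical and are not the paper's schema or views. It uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3


def coordinated_scan_candidates(rows, window_seconds=60, min_sources=3):
    """Return (dst_port, time bucket, distinct source count) for busy windows.

    `rows` is an iterable of (timestamp in seconds, source address, destination
    port). The window size and source threshold are the template parameters.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (ts INTEGER, src TEXT, dst_port INTEGER)")
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        query = """
            SELECT dst_port,
                   ts / ? AS bucket,              -- integer division: window index
                   COUNT(DISTINCT src) AS n_sources
            FROM events
            GROUP BY dst_port, bucket
            HAVING COUNT(DISTINCT src) >= ?
            ORDER BY dst_port, bucket
        """
        return conn.execute(query, (window_seconds, min_sources)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    log = [
        (0, "a", 22), (10, "b", 22), (20, "c", 22),  # three sources, one window
        (70, "a", 22),                               # later window, one source
        (5, "a", 80),                                # different port
    ]
    print(coordinated_scan_candidates(log))  # [(22, 0, 3)]
```

This covers only two of the dimensions the abstract lists (timing and targeted services); persistence and dispersion measures would need further aggregates that the abstract does not detail.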
Towards a Quality Model for Effective Data Selection in Collaboratories
22nd International Conference on Data Engineering Workshops (ICDEW'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDEW.2006.150
Yogesh L. Simmhan, Beth Plale, Dennis Gannon
Abstract: Data-driven scientific applications use workflow frameworks to execute complex dataflows, resulting in derived data products of unknown quality. We discuss our ongoing research on a quality model that provides users with an integrated estimate of data quality, tuned to their application needs and available as a numerical quality score. The score enables uniform comparison of datasets and gives the community a way to trust derived data.
Citations: 31