Adaptive clustering of scientific data

Proceeding of 13th IEEE Annual International Phoenix Conference on Computers and Communications; Pub Date: 1994-04-12; DOI: 10.1109/PCCC.1994.504121

A. Johnson, F. Fotouhi, N. Goel

Citations: 13

Abstract

Scientific databases contain large amounts of interrelated information. This information is often stored in relational databases with hundreds of tables and thousands of rows per table. Clustering is an effective way to reduce the information overhead associated with finding information among these tables, allowing the user to browse through the clusters as well as the individual tables. In this paper, we compare the use of two adaptive algorithms (genetic algorithms and neural networks) in clustering the tables of a scientific database. These clusters allow the user to index into this overwhelming number of tables and find the needed information quickly. We cluster the tables based on the user's queries and not on the content of the tables; thus the clustering reflects the unique relationships each user sees among the tables. The original database remains untouched; however, each user will now have a personalized index into this database.
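To make the idea of query-based clustering concrete, the following is a minimal sketch, not the method from the paper. It assumes a hypothetical per-user query log in which each entry lists the tables one query touched, and it groups tables that are accessed together at least `min_shared` times. A plain union-find over the co-access graph stands in for the genetic-algorithm and neural-network clusterers the abstract compares; the table names, the `min_shared` threshold, and both function names are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def co_access_counts(queries):
    """Count how often each pair of tables appears in the same query."""
    counts = defaultdict(int)
    for tables in queries:
        for a, b in combinations(sorted(set(tables)), 2):
            counts[(a, b)] += 1
    return counts


def cluster_tables(queries, min_shared=2):
    """Group tables linked by at least `min_shared` shared queries.

    Assumption: union-find over the co-access graph is a simple stand-in,
    not the genetic-algorithm or neural-network approach from the paper.
    """
    parent = {}

    def find(t):
        # Register unseen tables as their own cluster, then walk to the root.
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    # Make sure every table appears, even if it never joins a cluster.
    for tables in queries:
        for t in tables:
            find(t)

    # Merge tables that co-occur often enough in this user's queries.
    for (a, b), n in co_access_counts(queries).items():
        if n >= min_shared:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for t in list(parent):
        clusters[find(t)].add(t)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical query log for one user; table names are invented.
user_queries = [
    ["stars", "spectra"],
    ["stars", "spectra", "observations"],
    ["instruments", "calibration"],
    ["instruments", "calibration"],
    ["observations", "instruments"],
]

print(cluster_tables(user_queries))
```

Because the clusters are derived only from the query log, a different user with a different log would get a different grouping over the same, unmodified database, which is the personalized-index property the abstract describes.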


