An application of hash codes for disk file access, using multiple buckets

ACM-SE 14. Pub Date: 1976-04-22. DOI: 10.1145/503561.503598

William L. Flack

Citations: 0

Abstract

Hash code methods are widely used for retrieval of information from tables in memory and from direct access storage devices. A key is input to an algorithm which calculates the memory location or the disk address wanted. This paper explains hash code methods for direct disk access by way of a particular application example.

The application includes the use of multiple buckets, where each bucket is a separate disk file. Synonym overflow is handled by trying to place a record only once in each bucket (file), and finally placing the record in an overflow bucket (file) if no place could be found in the primary files.

The main goal of this design was to utilize 90-95% of the allocated disk space before the average access time became significantly degraded. This is in contrast with the usual requirement for hash code disk access in a single large file that there be about 20% excess space over the amount actually needed.

The application was first implemented on an IBM 1130 and was originally conceived to overcome limitations on the size of a single physical file on that machine. It is now running on a Hewlett Packard 3000. The file capacity is 18,000 optometric clinic patient records.
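The placement rule in the abstract (one attempt per bucket file, then the overflow file) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how a slot is computed within each file, so the salted SHA-256 hash, the class name `MultiBucketStore`, its method names, and the use of in-memory lists in place of disk files are all assumptions made for illustration. The sketch also assumes records are never deleted.

```python
import hashlib


class MultiBucketStore:
    """Illustrative stand-in: each inner list models one fixed-size disk file."""

    def __init__(self, num_buckets, slots_per_bucket):
        self.slots_per_bucket = slots_per_bucket
        self.buckets = [[None] * slots_per_bucket for _ in range(num_buckets)]
        self.overflow = []  # models the overflow file, searched sequentially

    def _slot(self, key, bucket_index):
        # Assumption: salt the key with the bucket number so that keys which
        # collide in one file are unlikely to collide again in the next.
        digest = hashlib.sha256(f"{bucket_index}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.slots_per_bucket

    def insert(self, key, record):
        """Try exactly one slot per primary bucket, then fall back to overflow.

        Returns the index of the bucket used, or -1 for the overflow bucket.
        """
        for i, bucket in enumerate(self.buckets):
            s = self._slot(key, i)
            if bucket[s] is None or bucket[s][0] == key:
                bucket[s] = (key, record)
                return i
        for j, (existing_key, _) in enumerate(self.overflow):
            if existing_key == key:
                self.overflow[j] = (key, record)
                return -1
        self.overflow.append((key, record))
        return -1

    def lookup(self, key):
        """Probe one slot per primary bucket, then scan the overflow bucket."""
        for i, bucket in enumerate(self.buckets):
            entry = bucket[self._slot(key, i)]
            if entry is not None and entry[0] == key:
                return entry[1]
        for existing_key, record in self.overflow:
            if existing_key == key:
                return record
        return None

    def utilization(self):
        """Fraction of primary slots currently occupied."""
        used = sum(1 for b in self.buckets for e in b if e is not None)
        return used / (len(self.buckets) * self.slots_per_bucket)


if __name__ == "__main__":
    store = MultiBucketStore(num_buckets=4, slots_per_bucket=50)
    for n in range(190):  # 95% of the 200 primary slots
        store.insert(f"patient-{n}", {"id": n})
    print(f"primary utilization: {store.utilization():.2f}")
    print(f"records in overflow: {len(store.overflow)}")
```

Under these assumptions a lookup costs at most one probe per primary file before the overflow file is consulted, which is the property that lets the primary files fill up without an unbounded probe sequence. How close such a scheme gets to the 90-95% utilization goal depends on the number of buckets and the hash functions, which the abstract does not specify.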
