Rapid protein fragment search using hash functions based on the Fourier transform.
T Akutsu, K Onizuka, M Ishikawa
Computer applications in the biosciences : CABIOS, 13(4), 357-64. Published 1997-08-01.
https://doi.org/10.1093/bioinformatics/13.4.357
Motivation: Because the protein structure database has been growing very rapidly in recent years, efficient methods for searching for similar structures have become very important.
Results: This paper presents a novel method for searching for similar fragments of proteins. In this method, a hash vector (a vector of real numbers) is associated with each fixed-length fragment of a three-dimensional protein structure. Each vector consists of the low-frequency components of a Fourier-like spectrum of the distances between the Cα atoms and the centroid of the fragment. The similarity between two fragments can then be analyzed by evaluating the difference between their hash vectors. The novel aspect of the method is that the following property is proved theoretically: if the root mean square distance between two fragments is small, then the distance between their hash vectors is small. Several variants of the method were compared with a naive method and a previous method using PDB data. The results show that the fastest variant is 18-80 times faster than the naive method and 3-10 times faster than the previous method.
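The hashing idea described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact "Fourier-like" transform, so an orthonormal discrete Fourier transform is assumed here, and the function names (`hash_vector`, `candidate_starts`), the parameter `num_freqs`, and the filter threshold are all hypothetical. The threshold relies on a simple bound that holds for this particular sketch: after optimal superposition the centroids coincide, so each centroid distance changes by at most the displacement of its atom, and an orthonormal transform cannot increase Euclidean length. For fragments of n atoms, an RMSD of at most r therefore implies a hash distance of at most sqrt(n) * r.

```python
import math


def centroid_distances(points):
    """Distance of each 3D point (e.g. a C-alpha atom) from the fragment centroid."""
    n = len(points)
    c = tuple(sum(p[k] for p in points) / n for k in range(3))
    return [math.dist(p, c) for p in points]


def hash_vector(points, num_freqs=3):
    """Real and imaginary parts of the num_freqs lowest-frequency components
    of an orthonormal DFT of the centroid-distance profile.

    Assumption: the DFT stands in for the paper's "Fourier-like" spectrum.
    The result is unchanged by rotating or translating the fragment.
    """
    d = centroid_distances(points)
    n = len(d)
    scale = 1.0 / math.sqrt(n)  # orthonormal scaling
    vec = []
    for k in range(num_freqs):
        re = sum(d[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = -sum(d[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        vec.extend((re * scale, im * scale))
    return vec


def candidate_starts(chain, query, rmsd_cutoff, num_freqs=3):
    """Start indices of chain windows whose hash vector is close to the query's.

    Windows that fail this test cannot be within rmsd_cutoff of the query
    (see the bound in the lead-in); windows that pass are only candidates
    and would still need a full RMSD check.
    """
    n = len(query)
    q_hash = hash_vector(query, num_freqs)
    limit = math.sqrt(n) * rmsd_cutoff
    return [
        s
        for s in range(len(chain) - n + 1)
        if math.dist(hash_vector(chain[s:s + n], num_freqs), q_hash) <= limit
    ]
```

In this sketch the hash comparison acts only as a cheap prefilter: comparing two short real vectors is much cheaper than superposing two fragments, which is the kind of saving the abstract reports against the naive method. A practical version would precompute and index the hash vectors of all database fragments rather than recomputing them per query.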