Zina Ben-Miled, Yang Liu, D. Powers, O. Bukhres, Michael Bem, Robert Jones, Robert J. Oppelt, Sam A. Milosevich
{"title":"大型动态候选药物数据库中的数据访问性能","authors":"Zina Ben-Miled, Yang Liu, D. Powers, O. Bukhres, Michael Bem, Robert Jones, Robert J. Oppelt, Sam A. Milosevich","doi":"10.1109/SC.2000.10049","DOIUrl":null,"url":null,"abstract":"An explosion in the amount of data generated through chemical and biological experimentation has been observed in recent years. This rapid proliferation of vast amounts of data has led to a set of cheminformatics and bioinformatics applications that manipulate dynamic, heterogeneous, and massive data. An example of such applications in the pharmaceutical industry is the computational process involved in the early discovery of lead drug candidates for a given target disease. This computational process includes repeated sequential and random accesses to a drug candidate database. Using the above pharmaceutical application, an experimental study was conducted in this paper that shows that for optimal performance, the degree of parallelism exploited in the application should be adjusted according to the drug candidate database instance size and the machine size. Additionally, different degrees of parallelism should be used depending on whether the access to the drug candidate database is random or sequential.","PeriodicalId":228250,"journal":{"name":"ACM/IEEE SC 2000 Conference (SC'00)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Data Access Performance in a Large and Dynamic Pharmaceutical Drug Candidate Database\",\"authors\":\"Zina Ben-Miled, Yang Liu, D. Powers, O. Bukhres, Michael Bem, Robert Jones, Robert J. Oppelt, Sam A. Milosevich\",\"doi\":\"10.1109/SC.2000.10049\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"An explosion in the amount of data generated through chemical and biological experimentation has been observed in recent years. This rapid proliferation of vast amounts of data has led to a set of cheminformatics and bioinformatics applications that manipulate dynamic, heterogeneous, and massive data. An example of such applications in the pharmaceutical industry is the computational process involved in the early discovery of lead drug candidates for a given target disease. This computational process includes repeated sequential and random accesses to a drug candidate database. Using the above pharmaceutical application, an experimental study was conducted in this paper that shows that for optimal performance, the degree of parallelism exploited in the application should be adjusted according to the drug candidate database instance size and the machine size. Additionally, different degrees of parallelism should be used depending on whether the access to the drug candidate database is random or sequential.\",\"PeriodicalId\":228250,\"journal\":{\"name\":\"ACM/IEEE SC 2000 Conference (SC'00)\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2000-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM/IEEE SC 2000 Conference (SC'00)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SC.2000.10049\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM/IEEE SC 2000 Conference (SC'00)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SC.2000.10049","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Data Access Performance in a Large and Dynamic Pharmaceutical Drug Candidate Database
An explosion in the amount of data generated through chemical and biological experimentation has been observed in recent years. This rapid proliferation of vast amounts of data has led to a set of cheminformatics and bioinformatics applications that manipulate dynamic, heterogeneous, and massive data. An example of such applications in the pharmaceutical industry is the computational process involved in the early discovery of lead drug candidates for a given target disease. This computational process includes repeated sequential and random accesses to a drug candidate database. Using the above pharmaceutical application, an experimental study was conducted in this paper that shows that for optimal performance, the degree of parallelism exploited in the application should be adjusted according to the drug candidate database instance size and the machine size. Additionally, different degrees of parallelism should be used depending on whether the access to the drug candidate database is random or sequential.