Daniel Lichtnow, Ronnie Alves, J. Oliveira, A. M. Levin, Ó. Pastor, Ignacio Medina, J. Dopazo
Title: Using Papers Citations for Selecting the Best Genomic Databases
Venue: 2011 30th International Conference of the Chilean Computer Science Society
Published: 2011-11-09
DOI: 10.1109/SCCC.2011.6 (https://doi.org/10.1109/SCCC.2011.6)
Citations: 2
Abstract
Selecting the right data is an essential activity in Genomic-related Information Systems. This work analyzes whether the best genomic databases can be selected from a catalog using information about the citations of papers related to these databases. Citation information is used because proper metadata about these databases is not easy to obtain. Thus, in this work, citation information is used to measure three distinct data quality dimensions: believability, timeliness, and relevancy. Believability is evaluated by inspecting the number of citations. The variation in the number of citations over time helps determine the recency of a database and is related to the timeliness dimension. Regarding relevancy, the keywords of the papers indicate the main application context of these databases.
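The three dimensions described in the abstract could be approximated along the following lines. This is only a minimal sketch, not the authors' method: the `Paper` record, the function names, the three-year recency window, and the keyword-overlap rule are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """A paper citing a genomic database (hypothetical record layout)."""
    year: int
    keywords: list[str]


def believability(papers):
    """Believability proxy: the number of citing papers."""
    return len(papers)


def timeliness(papers, current_year, window=3):
    """Timeliness proxy (assumed): share of citations from the last `window` years."""
    if not papers:
        return 0.0
    recent = sum(1 for p in papers if current_year - p.year < window)
    return recent / len(papers)


def relevancy(papers, query_keywords):
    """Relevancy proxy (assumed): share of papers sharing a keyword with the query."""
    if not papers:
        return 0.0
    query = {k.lower() for k in query_keywords}
    hits = sum(1 for p in papers if query & {k.lower() for k in p.keywords})
    return hits / len(papers)


# Illustrative data only; these are not real citation records.
papers = [
    Paper(2005, ["proteomics"]),
    Paper(2010, ["cancer", "mutation"]),
    Paper(2011, ["mutation"]),
]
print(believability(papers))
print(timeliness(papers, current_year=2011))
print(relevancy(papers, ["Mutation"]))
```

Keeping each dimension as a separate score, rather than folding them into one number, lets a catalog rank databases differently depending on which dimension matters for a given selection task.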