Samuel Galvao Elias;Debora Cervieri Guterres;Robert Weingart Barreto;Helson Mario Martins do Vale
IEEE Latin America Transactions. Journal Article. Published 2024-01-23. DOI: 10.1109/TLA.2024.10412034. https://ieeexplore.ieee.org/document/10412034/
GeneConnector: Unlocking the full potential of Genbank metadata
Genbank is one of the most significant global repositories of genetic information. Despite the quantity and diversity of its data, a considerable portion of existing records have fragmented or missing metadata and so fail to provide the context in which the sequences were acquired. We propose GeneConnector, a tool that uses information shared among multiple Genbank records of the same specimen to improve the completeness of poorly annotated nodes across various information domains. To demonstrate the tool's capabilities, we reviewed and aggregated the available data using the Genbank database of Genera of Phytopathogenic Fungi (GOPHY). By analyzing the data shared among nodes that connect Genbank specimen records, we observed information gains ranging from 2% to 60%. Our approach lets users assess the context associated with results precisely and directly, supported by two metrics: one gauges the current level of data annotation, and the other gauges the potential information gain achievable after our evaluation.
About the journal:
IEEE Latin America Transactions (IEEE LATAM) is an interdisciplinary journal focused on the dissemination of original, high-quality research papers and review articles in Spanish and Portuguese on emerging topics in three main areas: Computing, Electric Energy and Electronics. Sub-areas of the journal include, but are not limited to: automatic control, communications, instrumentation, artificial intelligence, power and industrial electronics, fault diagnosis and detection, transportation electrification, internet of things, electrical machines, circuits and systems, biomedicine and biomedical / haptic applications, secure communications, robotics, sensors and actuators, computer networks, and smart grids.