Mapping Strategies for Declarative Queries over Online Heterogeneous Biological Databases for Intelligent Responses
Authors: H. Jamil, Kallol Naha
Journal: Applied Computing Review (Journal Article), published 2023-03-27
DOI: 10.1145/3555776.3577652 (https://doi.org/10.1145/3555776.3577652)
Impact factor: 0.4; JCR: Q4 (Computer Science, Information Systems)
Open access: no
Citations: 0
Abstract
The emergence of Alexa and Siri and, more recently, OpenAI's ChatGPT raises the question of whether ad hoc biological queries can also be computed without end users' active involvement in the code-writing process. While advances have been made, current querying architectures for biological databases still assume that biologists have some degree of computational competence and significant structural awareness of the underlying network of databases, if not active code writing. Because biological databases are highly distributed and heterogeneous, and most are not FAIR-compliant, significant expertise in data integration is essential for a query to be accurately crafted and meaningfully executed. In this paper, we introduce a flexible and intelligent query reformulation assistant, called Needle, as the back-end query execution engine of a natural language query interface to online biological databases. Needle relies on a data model called BioStar, which uses a meta-knowledgebase called the schema graph to map natural language queries to relevant databases and biological concepts. The implementation of Needle using BioStar is the focus of this article.
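
To make the idea of mapping a natural language query onto databases through a schema graph more concrete, here is a minimal sketch. It is not Needle's algorithm or BioStar's actual structure, neither of which the abstract specifies. The concept names, the placeholder database names (`GeneDB_A`, `ProteinDB_B`, `PathwayDB_C`), the keyword matching, and the breadth-first path search are all assumptions chosen for illustration.

```python
import re
from collections import deque

# Hypothetical toy schema graph: each concept lists the databases describing it.
CONCEPT_SOURCES = {
    "gene": ["GeneDB_A"],
    "protein": ["ProteinDB_B"],
    "pathway": ["PathwayDB_C"],
}

# Hypothetical undirected links between concepts (e.g. a gene encodes a protein).
CONCEPT_LINKS = {
    "gene": ["protein"],
    "protein": ["gene", "pathway"],
    "pathway": ["protein"],
}


def concepts_in(query):
    """Return the known concepts mentioned in the query, sorted by name."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    # Naive singularization (assumption): also try each word without a final "s".
    words |= {w[:-1] for w in words if w.endswith("s")}
    return sorted(c for c in CONCEPT_SOURCES if c in words)


def join_path(start, goal):
    """Breadth-first search for a shortest chain of concepts from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in CONCEPT_LINKS[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the two concepts are not connected


def plan(query):
    """Map a query to the concepts it touches and the databases to consult."""
    found = concepts_in(query)
    if not found:
        return {"concepts": [], "databases": []}
    path = [found[0]]
    for concept in found[1:]:
        leg = join_path(path[-1], concept)
        # Include intermediate concepts needed to link the two; if none exist,
        # just append the concept on its own.
        path += leg[1:] if leg else [concept]
    concepts = list(dict.fromkeys(path))  # drop repeats, keep order
    databases = [db for c in concepts for db in CONCEPT_SOURCES[c]]
    return {"concepts": concepts, "databases": databases}


if __name__ == "__main__":
    print(plan("Which genes take part in the glycolysis pathway?"))
```

In this toy setting, a query that mentions genes and pathways also pulls in the intermediate protein concept, because the graph only links genes to pathways through proteins. A real system would need far richer matching than keyword lookup.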