Bioinformatics Web Portals

Encyclopedia of Portal Technologies and Applications. Pub Date: 1900-01-01. DOI: 10.4018/978-1-59140-989-2.CH013

M. Cannataro


Citations: 2

Abstract



Bioinformatics involves the design and development of advanced algorithms and computational platforms to solve problems in biomedicine (Jones & Pevzner, 2004). It also covers methods for acquiring, storing, retrieving, and analysing biological data, whether obtained by querying biological databases or produced by experiments. Bioinformatics applications combine many datasets with many software tools and algorithms. Such applications need semantic models of their basic software components, together with advanced scientific portal services that can aggregate those components and hide their details and complexity from the end user. Proteomics applications, for instance, involve datasets that are either produced by experiments or available as public databases, as well as a very large number of software tools and algorithms. Using such applications requires knowledge of both the biological issues related to data generation and result interpretation and the informatics requirements related to data analysis. Bioinformatics applications also place demands on computing platforms that go beyond standard systems. Such applications (1) are naturally distributed, owing to the large number of datasets involved; (2) require high computing power, owing to the size of the datasets and the complexity of the basic computations; (3) access data that is heterogeneous in both format and structure; and (4) require reliability and security. Computationally expensive examples include the identification of proteins from spectra data (de Hoffmann & Stroobant, 2002), the querying of protein databases (Swiss-Prot), the prediction of protein structures (Guerra & Istrail, 2003), and string-based pattern extraction from large biological sequences. Moreover, expertise is required to choose the most appropriate tools. For instance, protein structure prediction depends on the protein family, so the choice of tool may strongly influence the experimental results.
Recently, bioinformatics has attracted much interest from the database and computer science communities. Nevertheless, what is still missing is a high-level environment able to classify tools and provide easy-to-use, Web-based application programming interfaces. With such an environment, users can concentrate on the logic of the application (i.e., its biological aspects) and leave to the platform the work of composing applications, formatting input data, providing options and parameters, and collecting results. Another important requirement is that the platform be accessible through a Web portal, that is, through the user interfaces and protocols of the World Wide Web. A bioinformatics Web portal is thus a Web portal that gives access to bioinformatics tools and databases through a Web browser. Moreover, because of the complexity, diversity, and sheer number of bioinformatics tools and databases, a bioinformatics Web portal should also support problem formulation, application composition and execution, and the visualisation and annotation of results. A possible approach to these two issues (high-level modelling and Web-based user interfaces) is to add semantic links between biological problems and bioinformatics resources through ontologies (Baker, 1998), and to decouple Web-based user interfaces from high-performance back-end platforms. In this article we review the main requirements of distributed bioinformatics applications and of the related bioinformatics Web portals, and we report a proposal for a grid-based bioinformatics portal that supports choosing and composing bioinformatics tools with the help of a domain ontology describing data and software resources.
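
The idea of using a domain ontology to choose and compose tools can be illustrated with a minimal Python sketch. It is not the portal described in the article: the `Tool` and `Ontology` classes, the `compose` function, and every tool name, problem label, and data format below are hypothetical, and a real ontology would be far richer than a flat list of annotated tools. The sketch only shows how semantic annotations (which problem a tool solves, which formats it reads and writes) could let a platform assemble a compatible chain of tools on the user's behalf.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Tool:
    """A software resource as it might be annotated in a domain ontology (hypothetical)."""
    name: str
    solves: str         # biological problem the tool addresses
    input_format: str   # data format the tool consumes
    output_format: str  # data format the tool produces


@dataclass
class Ontology:
    """Toy stand-in for an ontology describing data and software resources."""
    tools: List[Tool] = field(default_factory=list)

    def tools_for(self, problem: str) -> List[Tool]:
        """Return every tool annotated as solving the given problem."""
        return [t for t in self.tools if t.solves == problem]

    @staticmethod
    def can_follow(first: Tool, second: Tool) -> bool:
        """Two tools compose when the first's output format is the second's input format."""
        return first.output_format == second.input_format


def compose(ontology: Ontology, problems: List[str]) -> Optional[List[Tool]]:
    """Pick one tool per problem, in order, so that consecutive tools are format-compatible.

    Uses simple backtracking; returns None when no compatible chain exists.
    """
    def extend(chain: List[Tool], remaining: List[str]) -> Optional[List[Tool]]:
        if not remaining:
            return chain
        for tool in ontology.tools_for(remaining[0]):
            if not chain or Ontology.can_follow(chain[-1], tool):
                result = extend(chain + [tool], remaining[1:])
                if result is not None:
                    return result
        return None

    return extend([], problems)


# Hypothetical proteomics resources; names and formats are illustrative only.
ONTOLOGY = Ontology(tools=[
    Tool("preprocess_spectra", "spectra_preprocessing", "raw_spectra", "peak_list"),
    Tool("identify_from_fasta", "protein_identification", "fasta", "protein_list"),
    Tool("identify_from_peaks", "protein_identification", "peak_list", "protein_list"),
])

plan = compose(ONTOLOGY, ["spectra_preprocessing", "protein_identification"])
print([t.name for t in plan] if plan else "no compatible chain")
```

In this sketch the user states only the biological steps; the format matching that rules out `identify_from_fasta` is left to the platform, which mirrors the division of labour the abstract argues for. A Web front end could call `compose` without knowing anything about the back end that eventually runs the selected tools.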
