Data-oriented research for bioresource utilization: A case study to investigate water uptake in cellulose using Principal Components

L. Ling, C. Driemeier, R. M. C. Junior
2012 IEEE 8th International Conference on E-Science, pp. 1-7. Published 2012-10-08. DOI: 10.1109/eScience.2012.6404485 (https://doi.org/10.1109/eScience.2012.6404485)
Citations: 1

Abstract

Bioresource utilization is an important interdisciplinary research area that integrates academic and industrial expertise across diverse scientific domains, including physics, chemistry, biology, and engineering. This paper describes a cyber-infrastructure being created at the Brazilian Bioethanol Science and Technology Laboratory (CTBE) to assist scientists working in the field. One key element of the infrastructure is the LignoCel Platform, a tailor-made database for uploading, curating, and sharing lignocellulose data. In particular, LignoCel allows users to query the data and export subsets, which are then analyzed for knowledge extraction. The paper describes a case study in which scientists want to investigate the dimensions that relate cellulose structure to water uptake. Principal Component Analysis (PCA) is employed for data analysis and dimensionality reduction. Different PCA-based measurements are extracted and visualized through automatically generated HTML pages made available to the domain scientists. In this case study, the workflow successfully provided dimensionality reduction of a data matrix originating from a heterogeneous set of materials. PCA scores and loadings are explored for data analysis and visualization. PCA reduced the 11 measured features (obtained from three different experimental techniques, with 55 possible pairwise combinations) to a two-dimensional PC1-PC2 loadings plot representing 89% of the data variance. Examples of the output produced by the system are available at http://data.bioetanol.org.br/~liu.ling/pca-lignocel/.
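The dimensionality-reduction step described in the abstract can be illustrated with a minimal sketch. This is not the LignoCel workflow itself: the data are synthetic placeholders, the sample count of 30 is an assumption, and the function names (`make_synthetic`, `standardize`, `top_components`) are hypothetical. The sketch standardizes an 11-feature matrix, extracts the first two principal components by power iteration with deflation (a simple substitute for a library eigensolver), and reports the share of variance each one captures. It also enumerates the 55 feature pairs that would otherwise need to be inspected one scatter plot at a time.

```python
import math
import random
from itertools import combinations

N_SAMPLES, N_FEATURES = 30, 11  # 11 features as in the abstract; 30 samples is assumed


def make_synthetic(seed=0):
    """Placeholder data: 11 features driven by two hidden factors plus noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(N_SAMPLES):
        f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append([(f1 if j < 7 else f2) + 0.3 * rng.gauss(0, 1)
                     for j in range(N_FEATURES)])
    return rows


def standardize(rows):
    """Center each column on zero and scale it to unit sample variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(p)]
    return [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]


def covariance(z):
    """Sample covariance of centered data (the correlation matrix if standardized)."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_components(cov, k=2, iters=500):
    """Return the k largest (eigenvalue, unit eigenvector) pairs of a symmetric matrix."""
    p = len(cov)
    work = [row[:] for row in cov]
    found = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-zero start vector
        for _step in range(iters):
            w = matvec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(a * b for a, b in zip(v, matvec(work, v)))
        found.append((eigval, v))
        # Deflate: remove the component just found so the next pass finds the next one.
        work = [[work[i][j] - eigval * v[i] * v[j] for j in range(p)]
                for i in range(p)]
    return found


# Every unordered pair of features: the "combinations of size 2" from the abstract.
pairs = list(combinations(range(N_FEATURES), 2))

z = standardize(make_synthetic())
corr = covariance(z)
components = top_components(corr, k=2)

total_var = sum(corr[i][i] for i in range(N_FEATURES))
ratios = [eigval / total_var for eigval, _vec in components]
loadings = [vec for _eigval, vec in components]  # one weight per feature, per PC
scores = [[sum(a * b for a, b in zip(row, vec)) for vec in loadings] for row in z]

print("feature pairs:", len(pairs))
print("explained variance ratio (PC1, PC2):", [round(r, 3) for r in ratios])
```

Here `loadings` holds the weight of each feature on PC1 and PC2 (the quantities a loadings plot displays), and `scores` holds each sample's coordinates in the PC1-PC2 plane. The explained-variance figures printed for the synthetic data have no relation to the 89% reported in the paper.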