Informative community structure revealed using Arabidopsis time series transcriptome data via Partitioned Local Depth
Maleana G Khoury, Kenneth S Berenhaut, Katherine E Moore, Edward E Allen, Alexandria F Harkey, Joëlle K Mühlemann, Courtney N Craven, Jiayi Xu, Suchi S Jain, David J John, James L Norris, Gloria K Muday
in silico Plants, published 2023-11-08
DOI: 10.1093/insilicoplants/diad018 (https://doi.org/10.1093/insilicoplants/diad018)
Abstract
Transcriptome studies that provide temporal information about transcript abundance facilitate identification of gene regulatory networks (GRNs). Inferring GRNs from time series data using computational modeling remains a central challenge in systems biology. Commonly employed clustering algorithms identify modules of like-responding genes but do not provide information on how these modules are interconnected. These methods also require users to specify parameters such as cluster number and size, adding complexity to the analysis. To address these challenges, we employed a recently developed algorithm, Partitioned Local Depth (PaLD), to generate cohesive networks for 4 time series transcriptome datasets (3 hormone and 1 abiotic stress dataset) from the model plant Arabidopsis thaliana. PaLD provided a cohesive network representation of the data, revealing networks with distinct structures and varying numbers of connections between transcripts. We utilized the networks to make predictions about GRNs by examining local neighborhoods of transcripts with highly similar temporal responses. We also partitioned the networks into groups of like-responding transcripts and identified enriched functional and regulatory features in them. Comparison of groups to clusters generated by commonly used approaches indicated that these methods identified modules of transcripts that have similar temporal and biological features, but also identified unique groups, suggesting a PaLD-based approach (supplemented with a community detection algorithm) can complement existing methods. These results revealed that PaLD could sort like-responding transcripts into biologically meaningful neighborhoods and groups while requiring minimal user input and producing cohesive network structure, offering an additional tool to the systems biology community to predict GRNs.
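The abstract does not spell out how PaLD turns pairwise distances into a network. The following is a minimal sketch of that step, assuming the commonly published PaLD definition of cohesion (local foci with tie splitting) and a strong-tie threshold of half the mean self-cohesion. The function names, the Euclidean distance, and the toy profiles are hypothetical illustrations and are not taken from the study's pipeline; the community detection step mentioned above is not included.

```python
import numpy as np


def pald_cohesion(D):
    """Cohesion matrix from a symmetric distance matrix D (assumed PaLD definition)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    C = np.zeros((n, n))
    for x in range(n - 1):
        for y in range(x + 1, n):
            dxy = D[x, y]
            # Local focus of the pair: every point at least as close to x or to y
            # as x and y are to each other.
            focus = np.flatnonzero((D[x] <= dxy) | (D[y] <= dxy))
            weight = 1.0 / len(focus)
            closer_x = D[focus, x] < D[focus, y]
            closer_y = D[focus, y] < D[focus, x]
            tie = ~(closer_x | closer_y)  # equidistant points are split evenly
            C[x, focus] += weight * (closer_x + 0.5 * tie)
            C[y, focus] += weight * (closer_y + 0.5 * tie)
    return C / (n - 1)


def strong_ties(C):
    """Boolean adjacency matrix of strong ties (assumed threshold rule)."""
    threshold = np.mean(np.diag(C)) / 2  # half the average self-cohesion
    A = np.minimum(C, C.T) >= threshold  # require mutual cohesion
    np.fill_diagonal(A, False)
    return A


# Hypothetical toy data: rows are transcripts, columns are time points.
profiles = np.array([
    [0, 1, 2, 3], [0, 1, 2, 4], [0, 2, 3, 4],   # rising responses
    [3, 2, 1, 0], [4, 2, 1, 0], [4, 3, 2, 0],   # falling responses
], dtype=float)
# Euclidean distance between profiles (an assumption; other distances can be used).
dist = np.linalg.norm(profiles[:, None, :] - profiles[None, :, :], axis=-1)
adjacency = strong_ties(pald_cohesion(dist))
print(adjacency.astype(int))
```

Note that neither a cluster count nor a neighborhood size is passed in: the threshold is derived from the cohesion matrix itself, which is the sense in which the approach needs minimal user input.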