Prediction and functional characterization of transcriptional activation domains

Saloni Mahatma, Lisa Van den Broeck, Nicholas Morffy, M. Staller, L. Strader, Rosangela Sozzani
{"title":"Prediction and functional characterization of transcriptional activation domains","authors":"Saloni Mahatma, Lisa Van den Broeck, Nicholas Morffy, M. Staller, L. Strader, Rosangela Sozzani","doi":"10.1109/CISS56502.2023.10089768","DOIUrl":null,"url":null,"abstract":"Gene expression is induced by transcription factors (TFs) through their activation domains (ADs). However, ADs are unconserved, intrinsically disordered sequences without a secondary structure, making it challenging to recognize and predict these regions and limiting our ability to identify TFs. Here, we address this challenge by leveraging a neural network approach to systematically predict ADs. As input for our neural network, we used computed properties for amino acid (AA) side chain and secondary structure, rather than relying on the raw sequence. Moreover, to shed light on the features learned by our neural network and greatly increase interpretability, we computed the input properties most important for an accurate prediction. Our findings further highlight the importance of aromatic and negatively charged AA and reveal the importance of unknown AA properties. Taking advantage of these most important features, we used an unsupervised learning approach to classify the ADs into 10 subclasses, which can further be explored for AA specificity and AD functionality. Overall, our pipeline, relying on supervised and unsupervised machine learning, shed light on the non-linear properties of ADs.","PeriodicalId":243775,"journal":{"name":"2023 57th Annual Conference on Information Sciences and Systems (CISS)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 57th Annual Conference on Information Sciences and Systems (CISS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CISS56502.2023.10089768","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Gene expression is induced by transcription factors (TFs) through their activation domains (ADs). However, ADs are unconserved, intrinsically disordered sequences without a secondary structure, making it challenging to recognize and predict these regions and limiting our ability to identify TFs. Here, we address this challenge by leveraging a neural network approach to systematically predict ADs. As input for our neural network, we used computed properties for amino acid (AA) side chain and secondary structure, rather than relying on the raw sequence. Moreover, to shed light on the features learned by our neural network and greatly increase interpretability, we computed the input properties most important for an accurate prediction. Our findings further highlight the importance of aromatic and negatively charged AA and reveal the importance of unknown AA properties. Taking advantage of these most important features, we used an unsupervised learning approach to classify the ADs into 10 subclasses, which can further be explored for AA specificity and AD functionality. Overall, our pipeline, relying on supervised and unsupervised machine learning, shed light on the non-linear properties of ADs.
转录激活域的预测和功能表征
基因表达是由转录因子(TFs)通过其激活域(ADs)诱导的。然而,ADs是非保守的,本质上无序的序列,没有二级结构,这使得识别和预测这些区域具有挑战性,并限制了我们识别tf的能力。在这里,我们通过利用神经网络方法系统地预测ADs来解决这一挑战。作为神经网络的输入,我们使用了氨基酸(AA)侧链和二级结构的计算性质,而不是依赖于原始序列。此外,为了阐明我们的神经网络学习的特征并大大提高可解释性,我们计算了对准确预测最重要的输入属性。我们的发现进一步强调了芳香和带负电荷的AA的重要性,并揭示了未知AA性质的重要性。利用这些最重要的特征,我们使用无监督学习方法将AD分为10个子类,可以进一步探索AA特异性和AD功能。总的来说,我们的管道,依赖于监督和无监督的机器学习,揭示了ADs的非线性特性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信