DANCE: Deep Learning-Assisted Analysis of Protein Sequences Using Chaos Enhanced Kaleidoscopic Images

Taslim Murad, Prakash Chourasia, Sarwan Ali, Murray Patterson
{"title":"DANCE: Deep Learning-Assisted Analysis of Protein Sequences Using Chaos Enhanced Kaleidoscopic Images","authors":"Taslim Murad, Prakash Chourasia, Sarwan Ali, Murray Patterson","doi":"arxiv-2409.06694","DOIUrl":null,"url":null,"abstract":"Cancer is a complex disease characterized by uncontrolled cell growth. T cell\nreceptors (TCRs), crucial proteins in the immune system, play a key role in\nrecognizing antigens, including those associated with cancer. Recent\nadvancements in sequencing technologies have facilitated comprehensive\nprofiling of TCR repertoires, uncovering TCRs with potent anti-cancer activity\nand enabling TCR-based immunotherapies. However, analyzing these intricate\nbiomolecules necessitates efficient representations that capture their\nstructural and functional information. T-cell protein sequences pose unique\nchallenges due to their relatively smaller lengths compared to other\nbiomolecules. An image-based representation approach becomes a preferred choice\nfor efficient embeddings, allowing for the preservation of essential details\nand enabling comprehensive analysis of T-cell protein sequences. In this paper,\nwe propose to generate images from the protein sequences using the idea of\nChaos Game Representation (CGR) using the Kaleidoscopic images approach. This\nDeep Learning Assisted Analysis of Protein Sequences Using Chaos Enhanced\nKaleidoscopic Images (called DANCE) provides a unique way to visualize protein\nsequences by recursively applying chaos game rules around a central seed point.\nwe perform the classification of the T cell receptors (TCRs) protein sequences\nin terms of their respective target cancer cells, as TCRs are known for their\nimmune response against cancer disease. The TCR sequences are converted into\nimages using the DANCE method. We employ deep-learning vision models to perform\nthe classification to obtain insights into the relationship between the visual\npatterns observed in the generated kaleidoscopic images and the underlying\nprotein properties. By combining CGR-based image generation with deep learning\nclassification, this study opens novel possibilities in the protein analysis\ndomain.","PeriodicalId":501266,"journal":{"name":"arXiv - QuanBio - Quantitative Methods","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuanBio - Quantitative Methods","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.06694","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Cancer is a complex disease characterized by uncontrolled cell growth. T cell receptors (TCRs), crucial proteins in the immune system, play a key role in recognizing antigens, including those associated with cancer. Recent advancements in sequencing technologies have facilitated comprehensive profiling of TCR repertoires, uncovering TCRs with potent anti-cancer activity and enabling TCR-based immunotherapies. However, analyzing these intricate biomolecules necessitates efficient representations that capture their structural and functional information. T-cell protein sequences pose unique challenges due to their relatively smaller lengths compared to other biomolecules. An image-based representation approach becomes a preferred choice for efficient embeddings, allowing for the preservation of essential details and enabling comprehensive analysis of T-cell protein sequences. In this paper, we propose to generate images from the protein sequences using the idea of Chaos Game Representation (CGR) using the Kaleidoscopic images approach. This Deep Learning Assisted Analysis of Protein Sequences Using Chaos Enhanced Kaleidoscopic Images (called DANCE) provides a unique way to visualize protein sequences by recursively applying chaos game rules around a central seed point. we perform the classification of the T cell receptors (TCRs) protein sequences in terms of their respective target cancer cells, as TCRs are known for their immune response against cancer disease. The TCR sequences are converted into images using the DANCE method. We employ deep-learning vision models to perform the classification to obtain insights into the relationship between the visual patterns observed in the generated kaleidoscopic images and the underlying protein properties. By combining CGR-based image generation with deep learning classification, this study opens novel possibilities in the protein analysis domain.
DANCE:利用混沌增强万花筒图像进行深度学习辅助蛋白质序列分析
癌症是一种复杂的疾病,其特点是细胞生长失控。T 细胞受体(TCR)是免疫系统中的关键蛋白,在识别抗原(包括与癌症相关的抗原)方面发挥着关键作用。测序技术的最新进展促进了对 TCR 重排的全面分析,发现了具有强大抗癌活性的 TCR,并促成了基于 TCR 的免疫疗法。然而,分析这些错综复杂的生物大分子需要高效的表征方法来捕捉它们的结构和功能信息。与其他生物大分子相比,T 细胞蛋白质序列的长度相对较小,这给分析带来了独特的挑战。基于图像的表示方法成为高效嵌入的首选,它可以保留重要细节,并实现对 T 细胞蛋白质序列的全面分析。在本文中,我们提出利用万花筒图像方法,利用混沌博弈表示(CGR)的思想从蛋白质序列生成图像。这种使用混沌增强万花筒图像对蛋白质序列进行深度学习辅助分析(Deep Learning Assisted Analysis of Protein Sequences Using Chaos EnhancedKaleidoscopic Images,简称 DANCE)提供了一种独特的方法,通过围绕中心种子点递归应用混沌博弈规则,将蛋白质序列可视化。我们使用 DANCE 方法将 TCR 序列转换为图像。我们采用深度学习视觉模型进行分类,以便深入了解在生成的万花筒图像中观察到的视觉模式与潜在蛋白质特性之间的关系。通过将基于 CGR 的图像生成与深度学习分类相结合,这项研究为蛋白质分析领域开辟了新的可能性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信