Image analysis and length estimation of biomolecules using AFM.

Andrew Sundstrom, Silvio Cirrone, Salvatore Paxia, Carlin Hsueh, Rachel Kjolby, James K Gimzewski, Jason Reed, Bud Mishra
IEEE Transactions on Information Technology in Biomedicine, 16(6):1200-7. Published 2012-11-01 (Epub 2012-06-29). DOI: 10.1109/TITB.2012.2206819 (https://doi.org/10.1109/TITB.2012.2206819)
Citations: 12

Abstract

Many problems in pattern analysis admit systematic characterization when a small number of useful features or parameters of the image are known a priori or can be estimated reasonably well. Often the relevant features of a particular pattern analysis problem are easy to enumerate, as when the statistical structure of the patterns is well understood from domain knowledge. We study a problem from molecular image analysis in which such domain-dependent understanding may be lacking to some degree and the features must be inferred via machine-learning techniques. In this paper, we propose a rigorous, fully automated technique for this problem. We are motivated by an application of atomic force microscopy (AFM) image processing to a central problem in molecular biology: obtaining the complete transcription profile of a single cell, a snapshot that shows which genes are being expressed and to what degree. Reed et al. (Single molecule transcription profiling with AFM, Nanotechnology, 18:4, 2007) showed that the transcription profiling problem reduces to making high-precision measurements of biomolecule backbone lengths, correct to within 20-25 bp (6-7.5 nm). Here we present an image processing and length estimation pipeline using AFM that comes close to achieving these measurement tolerances. In particular, we develop a biased length estimator on trained coefficients of a simple linear regression model, biweighted by a Beaton-Tukey function, whose feature universe is constrained by James-Stein shrinkage to avoid overfitting. In terms of extensibility and addressing the model selection problem, this formulation subsumes the models we studied.
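
For illustration only, and not the authors' pipeline: a minimal sketch of a linear regression biweighted by a Beaton-Tukey function, fit by iteratively reweighted least squares. The function names, the tuning constant c = 4.685, the median-based residual scale, and the iteration count are all assumptions made for this sketch. The James-Stein shrinkage step and the image features named in the abstract are not shown.

```python
import numpy as np


def tukey_biweight(u, c=4.685):
    """Beaton-Tukey biweight: w = (1 - (u/c)^2)^2 if |u| < c, else 0."""
    t = np.asarray(u, dtype=float) / c
    return np.where(np.abs(t) < 1.0, (1.0 - t**2) ** 2, 0.0)


def robust_linear_fit(X, y, n_iter=20, c=4.685):
    """Fit y ~ intercept + X @ coef with biweighted least squares (IRLS).

    Returns the coefficient vector, intercept first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]    # ordinary least-squares start
    for _ in range(n_iter):
        r = y - X1 @ beta
        # Robust residual scale (assumed choice): median |r| / 0.6745.
        scale = np.median(np.abs(r)) / 0.6745
        if scale == 0.0:
            break  # at least half the residuals are exactly zero
        w = tukey_biweight(r / scale, c)
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by sqrt(weight).
        beta = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)[0]
    return beta
```

Points whose scaled residual exceeds c receive zero weight, so a few grossly mismeasured lengths do not pull the fitted coefficients the way they would in ordinary least squares.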
