Jeyashree Krishnan, Zeyu Lian, Pieter E. Oomen, Xiulan He, Soodabeh Majdi, Andreas Schuppert, Andrew Ewing
arXiv - QuanBio - Subcellular Processes, published 2023-02-06. DOI: arxiv-2302.02650 (https://doi.org/arxiv-2302.02650)
Tree-Based Learning on Amperometric Time Series Data Demonstrates High Accuracy for Classification
Elucidating exocytosis processes provides insights into cellular
neurotransmission mechanisms and may have potential in neurodegenerative
disease research. Amperometry is an established electrochemical method for the
detection of neurotransmitters released from and stored inside cells. An
important aspect of the amperometry method is the sub-millisecond temporal
resolution of the current recordings, which leads to several hundred
gigabytes of high-quality data. In this study, we present a universal method
for classifying diverse amperometric datasets using
data-driven approaches in computational science. We demonstrate a very high
prediction accuracy (greater than or equal to 95%). The method includes an end-to-end
systematic machine learning workflow for amperometric time series datasets
consisting of pre-processing, feature extraction, model identification,
training and testing, and feature importance evaluation, all
implemented. We tested the method on heterogeneous amperometric time series
datasets generated using different experimental approaches, chemical
stimulations, electrode types, and varying recording times. We identified an
overarching set of common features across these datasets that enables
accurate predictions. Further, we showed that the information relevant for the
classification of amperometric traces is neither contained in the spiky segments alone
nor retrievable from just the temporal structure of spikes. In fact,
the transients between spikes and the trace baselines carry essential
information for a successful classification, thereby strongly demonstrating
that an effective feature representation of amperometric time series requires
the full time series. To our knowledge, this is one of the first studies that
propose a scheme for machine learning, and in particular, supervised learning
on full amperometry time series data.
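
The workflow named in the abstract (feature extraction over the full trace, tree-based model training, testing, and a look at which features the model relies on) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the synthetic traces, the three feature names (`max`, `n_above_10`, `mad`), and the small hand-written CART-style tree are hypothetical stand-ins, not the authors' data, features, or model. The two synthetic classes are built to share the same spikes and to differ only in baseline noise, so only a feature computed over the whole trace can separate them.

```python
import random
import statistics


def make_trace(rng, label, n=400):
    """Synthetic trace: noisy baseline plus five sharp spikes.
    ASSUMPTION: the classes differ only in baseline noise level."""
    noise = 0.5 if label == 0 else 1.5
    trace = [rng.gauss(0.0, noise) for _ in range(n)]
    for _ in range(5):  # identical spike statistics in both classes
        trace[rng.randrange(n)] += 20.0
    return trace


def extract_features(trace):
    """Hypothetical summary features computed over the full trace."""
    med = statistics.median(trace)
    abs_dev = sorted(abs(x - med) for x in trace)
    return {
        "max": max(trace),                                 # spike height
        "n_above_10": sum(1 for x in trace if x > 10.0),   # spike count
        "mad": abs_dev[len(abs_dev) // 2],                 # baseline spread
    }


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (weighted_gini, feature, threshold) of the best binary split."""
    best = None
    for name in rows[0]:
        values = sorted(set(row[name] for row in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[name] <= thr]
            right = [y for r, y in zip(rows, labels) if r[name] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, name, thr)
    return best


def fit_tree(rows, labels, depth=2):
    """Tiny CART-style tree: a leaf is a class label, a node is a tuple
    (feature, threshold, left_subtree, right_subtree)."""
    majority = int(sum(labels) * 2 >= len(labels))
    if depth == 0 or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:  # no feature has two distinct values
        return majority
    _, name, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[name] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[name] > thr]
    return (name, thr,
            fit_tree([r for r, _ in left], [y for _, y in left], depth - 1),
            fit_tree([r for r, _ in right], [y for _, y in right], depth - 1))


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        name, thr, left, right = tree
        tree = left if row[name] <= thr else right
    return tree


def run_demo(seed=0):
    """Generate data, train on 80 traces, test on 40 held-out traces."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(120)]
    rows = [extract_features(make_trace(rng, lab)) for lab in labels]
    tree = fit_tree(rows[:80], labels[:80])
    hits = sum(predict(tree, r) == y for r, y in zip(rows[80:], labels[80:]))
    return tree, hits / 40


if __name__ == "__main__":
    tree, accuracy = run_demo()
    print("root split feature:", tree[0])
    print("held-out accuracy:", accuracy)
```

The feature chosen at the root serves here as a crude form of feature importance evaluation. On this constructed data, the spike-derived features (`max`, `n_above_10`) carry no class information by design, so the tree has to rely on the baseline-derived `mad`. This mirrors the abstract's argument in a toy setting only and says nothing about the reported accuracy on real recordings. A practical pipeline would more likely use an established library implementation of tree ensembles than a hand-written tree.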