Towards a minimum description length based stopping criterion for semi-supervised time series classification

2013 IEEE 14th International Conference on Information Reuse & Integration (IRI) Pub Date: 2013-10-24 DOI: 10.1109/IRI.2013.6642490

Nurjahan Begum, Bing Hu, T. Rakthanmanon, Eamonn J. Keogh


Citations: 21

Abstract

In the last decade, the plunging costs of sensors and storage have made it possible to obtain vast amounts of medical telemetry. However, for this data to be useful, it must be annotated. This annotation, which requires the attention of medical experts, is very expensive and time-consuming, and remains the critical bottleneck in medical analysis. Semi-supervised learning is an obvious way to mitigate the need for human labor. However, most such algorithms are designed for intrinsically discrete objects and do not work well in this domain, which requires the ability to deal with real-valued objects arriving in a streaming fashion. In this work we make two contributions. First, we demonstrate that in many cases just a handful of human-annotated examples are sufficient to perform accurate classification. Second, we devise a novel parameter-free stopping criterion for semi-supervised learning. We evaluate our work with a comprehensive set of experiments on diverse medical data sources, including electrocardiograms. Our experimental results show that our approach can construct accurate classifiers even if given only a single annotated instance.
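To make the setting concrete, here is a minimal sketch of generic self-training on time series starting from a single annotated instance. It is an illustration only and not the authors' method: the names `euclidean`, `self_train`, and `should_stop` are hypothetical, the Euclidean distance is an assumed choice, and the stopping rule shown is a placeholder distance threshold. The paper's actual criterion is described as parameter-free and based on minimum description length; its details are not given in this abstract and are not reproduced here.

```python
import math
from typing import Callable, List, Sequence

Series = Sequence[float]


def euclidean(a: Series, b: Series) -> float:
    """Euclidean distance between two equal-length series (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def self_train(
    seed: Series,
    unlabeled: List[Series],
    should_stop: Callable[[int, float], bool],
) -> List[int]:
    """Grow a labeled set from one seed by repeatedly absorbing the
    unlabeled series nearest to any already-labeled series.

    should_stop(n_added, next_distance) is a hypothetical hook for the
    stopping criterion. Returns indices into `unlabeled`, in the order
    they were added.
    """
    labeled = [seed]
    remaining = list(range(len(unlabeled)))
    added: List[int] = []
    while remaining:
        # Candidate: the unlabeled series closest to the labeled set.
        best_idx, best_dist = min(
            ((i, min(euclidean(unlabeled[i], p) for p in labeled))
             for i in remaining),
            key=lambda pair: pair[1],
        )
        if should_stop(len(added), best_dist):
            break
        labeled.append(unlabeled[best_idx])
        remaining.remove(best_idx)
        added.append(best_idx)
    return added


# Toy data (made up for illustration): two series resemble the seed, two do not.
seed = [0.0, 1.0, 0.0, -1.0]
unlabeled = [
    [0.0, 1.1, 0.0, -0.9],
    [5.0, 5.0, 5.0, 5.0],
    [0.1, 0.9, 0.0, -1.1],
    [5.0, 6.0, 5.0, 6.0],
]

# Placeholder rule: stop when the nearest candidate is farther than 1.0.
# This threshold is a tunable parameter, which is exactly what a
# parameter-free criterion would avoid.
added = self_train(seed, unlabeled, lambda n_added, dist: dist > 1.0)
print(added)
```

The point of the sketch is that the quality of the final labeled set depends almost entirely on when the loop stops: stopping too late absorbs instances of the other class, which is why a principled stopping criterion matters in this setting.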



