Classifying Eligibility Criteria in Clinical Trials Using Active Deep Learning
C. Chuan
2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 305-310, 2018-12-01
DOI: 10.1109/ICMLA.2018.00052
In this paper we propose an active deep learning approach to automatically classify eligibility criteria of clinical trials, an application that has not been explored in machine learning. We collected all clinical trial data from the National Cancer Institute website and applied word2vec to learn word embeddings for eligibility criteria. Criteria encoded with word embeddings were then fed into a multi-layer convolutional neural network (CNN) for classification. To overcome the absence of class labels, we designed an active learning algorithm that uses uncertainty cluster sampling to navigate the dataset and strategically propagate obtained labels to expand the training set for the CNN. Experimental results show that word2vec successfully learns meaningful embeddings in criteria data, and that the active deep learning approach achieves a significantly lower classification error rate than the baseline k-nearest neighbor method.
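The abstract does not spell out how uncertainty cluster sampling and label propagation work, so the following is only a minimal sketch of one plausible reading, not the paper's algorithm. It assumes that criteria have already been grouped into clusters, that a classifier supplies per-item class probabilities, and that uncertainty is measured by predictive entropy. All function names, the entropy measure, and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_uncertain_cluster(pred_probs, cluster_ids, labels):
    """Return the cluster whose unlabeled members have the highest mean entropy.

    Returns None when every item already has a label.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for probs, cid, lab in zip(pred_probs, cluster_ids, labels):
        if lab is None:
            totals[cid] += entropy(probs)
            counts[cid] += 1
    if not counts:
        return None
    return max(counts, key=lambda c: totals[c] / counts[c])


def pick_query(pred_probs, cluster_ids, labels, cluster):
    """Index of the most uncertain unlabeled item inside the given cluster."""
    candidates = [
        i
        for i, (cid, lab) in enumerate(zip(cluster_ids, labels))
        if cid == cluster and lab is None
    ]
    return max(candidates, key=lambda i: entropy(pred_probs[i]))


def propagate(labels, cluster_ids, cluster, label):
    """Copy the annotator's label to every unlabeled item in the cluster."""
    return [
        label if (lab is None and cid == cluster) else lab
        for lab, cid in zip(labels, cluster_ids)
    ]


# Toy data (assumed): four criteria, two clusters, two classes, no labels yet.
pred_probs = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]]
cluster_ids = [0, 1, 1, 0]
labels = [None, None, None, None]

cluster = most_uncertain_cluster(pred_probs, cluster_ids, labels)
query = pick_query(pred_probs, cluster_ids, labels, cluster)
# A human annotator would label item `query`; "age" is a made-up class name.
labels = propagate(labels, cluster_ids, cluster, "age")
```

In a full loop, the classifier would be retrained on the expanded label set and the selection repeated. Propagating one label to a whole cluster trades label noise for training-set size, so a real implementation would likely restrict propagation to items close to the queried one.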