Hany Al Ashwal, Areeg S. Abdalla, M. E. Halaby, A. Moustafa
Proceedings of the 3rd International Conference on Software Engineering and Information Management. Published 2020-01-12. DOI: 10.1145/3378936.3378982 (https://doi.org/10.1145/3378936.3378982)
Feature Selection for the Classification of Alzheimer's Disease Data
In this paper, we describe the features of our large dataset (6400+ rows and 400+ features), which covers Alzheimer's disease (AD) patients, individuals with mild cognitive impairment (MCI, the prodromal stage of Alzheimer's disease), and healthy individuals (without AD or MCI). We also present a feature selection method applied to the dataset. Unlike the data used by prior data mining models for AD, our dataset is large and includes genetic, neural, nutritional, and cognitive measures for every individual. Empirical studies have shown all of these measures to be related to the development of AD. We used a random forest classifier to discover which features best differentiate AD patients from healthy individuals. Identifying these features will likely provide evidence for protective factors against the development of AD.
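The abstract does not specify how the random forest was configured or how features were scored, so the following is only a minimal sketch of one common approach: ranking features by the impurity-based importances of a fitted scikit-learn `RandomForestClassifier`. The synthetic data, the helper name `rank_features`, the feature names, and all parameter values are assumptions made for illustration; they are not taken from the paper or its dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def rank_features(X, y, feature_names, top_k=5, seed=0):
    """Fit a random forest and return the top_k (name, importance) pairs.

    Hypothetical helper: the paper's actual selection procedure is not
    described in the abstract.
    """
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    # Impurity-based importances (mean decrease in impurity), one per feature.
    importances = forest.feature_importances_
    # Indices of the top_k features, highest importance first.
    order = np.argsort(importances)[::-1][:top_k]
    return [(feature_names[i], float(importances[i])) for i in order]


# Synthetic stand-in for a two-class (AD vs. healthy) table; the real
# dataset has 6400+ rows and 400+ features and is not reproduced here.
X, y = make_classification(
    n_samples=600,
    n_features=20,
    n_informative=4,
    n_redundant=0,
    random_state=0,
)
names = [f"feature_{i}" for i in range(X.shape[1])]

ranking = rank_features(X, y, names, top_k=5)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances are quick to compute but can favor features with many distinct values, so a ranking like this is usually cross-checked, for example with permutation importance on held-out data.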