Automated selection of particle-jet features for data analysis in High Energy Physics experiments
A. Luca, F. Follega, M. Cristoforetti, R. Iuppa
Proceedings of 40th International Conference on High Energy Physics, PoS(ICHEP2020). Published 2021-03-05. DOI: 10.22323/1.390.0907 (https://doi.org/10.22323/1.390.0907)
We show that it is possible to reduce the size of a classification problem by automatically ranking the relative importance of the available features. Variables are sorted by importance with a decision tree algorithm, and correlated ones are removed after ranking. The selected features can then be used as input quantities for the classification problem at hand. We tested the method on the case of highly boosted di-jet resonances decaying to two b-quarks, to be selected against an overwhelming QCD background with a deep neural network. We make explicit the relation between the importance rankings obtained with different algorithms. We also show how the signal-to-background ratio changes as the number of features fed to the neural network is varied.
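The rank-then-prune idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: as a stand-in for a full decision tree algorithm, it scores each feature by the largest Gini impurity decrease of a single-threshold split (a depth-one tree), then greedily drops any feature whose Pearson correlation with an already kept, higher-ranked feature exceeds a threshold. The feature names (`mass`, `mass_scaled`, `noise`), the toy data, and the `max_corr=0.9` cut are all assumptions made for illustration.

```python
import random
from statistics import mean


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def stump_importance(values, labels):
    """Largest Gini decrease achievable with one threshold on this feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini(labels)
    total_ones = sum(labels)
    ones_left = 0
    best = 0.0
    for i in range(1, n):
        ones_left += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        p_l = ones_left / i
        p_r = (total_ones - ones_left) / (n - i)
        child = (i / n) * 2 * p_l * (1 - p_l) + ((n - i) / n) * 2 * p_r * (1 - p_r)
        best = max(best, parent - child)
    return best


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_features(columns, labels, max_corr=0.9):
    """Rank features by importance, then drop those correlated with a kept one."""
    ranked = sorted(
        columns,
        key=lambda name: stump_importance(columns[name], labels),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) < max_corr for k in kept):
            kept.append(name)
    return ranked, kept


# Toy data (hypothetical): one informative feature, a near-duplicate of it,
# and one pure-noise feature.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(400)]
mass = [2.0 * y + rng.gauss(0, 0.5) for y in labels]
mass_scaled = [1.01 * m + rng.gauss(0, 0.01) for m in mass]
noise = [rng.gauss(0, 1) for _ in labels]

columns = {"mass": mass, "mass_scaled": mass_scaled, "noise": noise}
ranked, kept = select_features(columns, labels)
print("ranking:", ranked)
print("kept after correlation pruning:", kept)
```

In this toy setup the near-duplicate feature is removed because it carries almost the same information as its higher-ranked partner, while the noise feature survives the correlation cut and would only be discarded by truncating the ranked list. Scanning the number of kept features and retraining the classifier each time is one way to study how performance depends on the input size.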