Mammalian Species Detection Using a Cascade of Unet and SqueezeNet
Michael Njeru, C. Maina, K. Langat
2021 IEEE AFRICON, published 2021-09-13. DOI: 10.1109/africon51333.2021.9570950
Wild-animal monitoring has taken many forms, all aimed at providing the information needed to protect animals in their natural habitats. Recognizing animal species without human trackers requires machine learning models that extract species-specific features from an image. This project proposes a method for counting the animals in an image and identifying the species of each one using Unet and a variant of the SqueezeNet model. The Unet model is trained on images paired with their corresponding masks. Each model is trained with several different optimizers. During inference, Unet outputs a binary mask with ones where an animal is detected and zeros elsewhere. The SqueezeNet model is trained on images from six classes: bushbuck, impala, llama, warthog, waterbuck, and zebra. Three variants of the SqueezeNet model were trained. The first uses the original backbone, while the other two add an extra Fire module to it. In one of these, the extra Fire module matches the Fire modules of the original backbone; in the other, it contains batch normalization layers. Among the trained models, Unet trained with the Nadam optimizer achieves the highest Dice coefficient, while SqueezeNet with an extra Fire module containing batch normalization layers, trained with the RMSprop optimizer, achieves the highest training accuracy. The combined system takes an image and outputs it with a bounding box around each animal, labeled with the corresponding species. The system thus both counts the animals and recognizes their species for each input image.
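The abstract does not say how the binary mask is turned into a count and a set of bounding boxes, so the following is only a minimal sketch of one plausible approach: treat each 4-connected blob of ones as one animal and take its extent as the box. The function names `dice_coefficient` and `boxes_from_mask`, the choice of 4-connectivity, and the convention for two empty masks are all assumptions made for illustration, not the authors' implementation. The Dice coefficient is included because it is the metric reported for Unet; it is computed here with its standard definition, 2|A ∩ B| / (|A| + |B|).

```python
from collections import deque


def dice_coefficient(pred, target):
    """Dice = 2*|A & B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = sum(
        p & t for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def boxes_from_mask(mask):
    """Return one (row_min, col_min, row_max, col_max) box per blob of ones.

    Blobs are found with a breadth-first flood fill using 4-connectivity
    (an assumption; the source does not specify the grouping rule).
    """
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Start a new blob and grow it while tracking its extent.
            seen[r][c] = True
            queue = deque([(r, c)])
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


if __name__ == "__main__":
    # Toy mask standing in for a Unet output: two separate blobs.
    mask = [
        [0, 1, 1, 0, 0, 0],
        [0, 1, 1, 0, 0, 1],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 0, 0, 0],
    ]
    boxes = boxes_from_mask(mask)
    print("animal count:", len(boxes))
    print("boxes:", boxes)
    print("dice vs. itself:", dice_coefficient(mask, mask))
```

In a cascade like the one described, each box would presumably be used to crop the input image, with the crop passed to the SqueezeNet classifier to obtain the species label. A real mask would likely also need thresholding and removal of very small blobs before counting, which this sketch omits.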