Evaluation of metric and representation learning approaches: Effects of representations driven by relative distance on the performance.

Anthony B Garza, Rolando Garcia, Marc S Halfon, Hani Z Girgis

2023 Intelligent Methods, Systems, and Applications, July 2023 (Epub 24 August 2023). DOI: 10.1109/imsa58542.2023.10217475

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10566582/pdf/nihms-1935619.pdf
Several deep neural network architectures have emerged recently for metric learning. We asked which architecture is the most effective at measuring the similarity or dissimilarity among images. To this end, we evaluated six networks on a standard image set: variational autoencoders, Siamese networks, triplet networks, and variational autoencoders combined with Siamese or triplet networks. These networks were compared to a baseline network consisting of multiple separable convolutional layers. Our study revealed the following: (i) the triplet architecture proved the most effective because it learns a relative distance, not an absolute distance; (ii) combining autoencoders with networks that learn metrics (e.g., Siamese or triplet networks) is unwarranted; and (iii) an architecture based on separable convolutional layers is a reasonably simple alternative to triplet networks. These results can potentially impact our field by encouraging architects to develop advanced networks that take advantage of separable convolution and relative distance.
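To make the two ideas highlighted in the abstract concrete, the sketch below is a minimal, hypothetical illustration (not the authors' code): a triplet loss that constrains a relative distance between embeddings, where the embeddings come from a small network built from depthwise-separable convolutions. It assumes PyTorch; the layer widths, input shape, and margin value are illustrative choices, not settings from the paper.

```python
# Hypothetical sketch: triplet (relative-distance) learning over a
# separable-convolution embedding network. Not the authors' implementation.
import torch
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=padding, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class EmbeddingNet(nn.Module):
    """Maps an image to a fixed-length embedding vector."""
    def __init__(self, embedding_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            SeparableConv2d(1, 32), nn.ReLU(), nn.MaxPool2d(2),
            SeparableConv2d(32, 64), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Linear(64, embedding_dim)

    def forward(self, x):
        return self.fc(self.features(x))

net = EmbeddingNet()
# Triplet loss: the anchor must be closer to the positive than to the
# negative by at least the margin, a constraint on relative distances only.
criterion = nn.TripletMarginLoss(margin=1.0)

anchor   = net(torch.randn(8, 1, 28, 28))  # same class as `positive`
positive = net(torch.randn(8, 1, 28, 28))
negative = net(torch.randn(8, 1, 28, 28))  # different class
loss = criterion(anchor, positive, negative)
loss.backward()
```

The triplet objective constrains only the ordering of pairwise distances, not their absolute values, which is the relative-distance property the abstract credits for the triplet architecture's effectiveness.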