Shape Representations for Maya Codical Glyphs: Knowledge-driven or Deep?
G. Can, J. Odobez, D. Gática-Pérez
Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, 2017-06-19
https://doi.org/10.1145/3095713.3095746
This paper investigates two types of shape representation for individual Maya codical glyphs: a traditional bag-of-words representation built on knowledge-driven local shape descriptors (HOOSC), and representations based on Convolutional Neural Networks (CNNs), which are learned from data. For the CNN representations, we first evaluate the activations of typical CNNs pretrained on large-scale image datasets; second, we train a CNN from scratch on all the available individual segments. One of the main challenges in training CNNs is the limited amount of available data, together with the imbalance between classes. Here, we attempt to address this imbalance by introducing class weights into the loss computation during training. Another possibility is to oversample the minority-class samples during batch selection. We show that the deep representations outperform the bag-of-words representation, but that CNN training requires special care for small-scale unbalanced data, which is usually the case in the cultural heritage domain.
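The two imbalance remedies mentioned in the abstract, class weights in the loss and oversampling of minority classes during batch selection, can be illustrated with a minimal sketch. This is not the authors' implementation: the inverse-frequency weighting formula, the function names, and the toy labels `common` and `rare` are all assumptions made for illustration, and the sketch uses only the Python standard library rather than a deep learning framework.

```python
import math
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class) over a batch.

    probs: list of dicts mapping class -> predicted probability.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


def oversampled_batch(labels, batch_size, rng):
    """Draw sample indices so that each class is equally likely per draw."""
    counts = Counter(labels)
    per_sample = [1.0 / counts[y] for y in labels]
    return rng.choices(range(len(labels)), weights=per_sample, k=batch_size)


# Toy, hypothetical data: 8 samples of a frequent class, 2 of a rare one.
labels = ["common"] * 8 + ["rare"] * 2
weights = inverse_frequency_weights(labels)

# A classifier that always predicts 50/50, used only to exercise the loss.
uniform = [{"common": 0.5, "rare": 0.5} for _ in labels]
loss = weighted_cross_entropy(uniform, labels, weights)

# Oversampling: indices of rare samples are drawn more often than 2/10.
batch = oversampled_batch(labels, 1000, random.Random(0))
rare_fraction = sum(labels[i] == "rare" for i in batch) / len(batch)
```

With class weights, every sample is still seen once per epoch but errors on rare glyph classes cost more; with oversampling, the loss is unchanged but rare samples are repeated within batches. Which works better on a given dataset is an empirical question that this sketch does not answer.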