L. Weigang, L. Martins, Nikson Ferreira, Christian Miranda, Lucas S. Althoff, Walner Pessoa, Mylène C. Q. Farias, Ricardo Jacobi, Mauricio Rincon
Title: Heuristic Once Learning for Image & Text Duality Information Processing
DOI: 10.1109/SmartWorld-UIC-ATC-ScalCom-DigitalTwin-PriComp-Metaverse56740.2022.00195
Journal: Scalable Computing-Practice and Experience
Publication date: 2022-12-01
Citations: 0
Abstract
Few-shot learning is an important mechanism for reducing the need to label large amounts of data and for taking advantage of transfer learning. To identify image/text input that has a duality property, this research proposes a “Heuristic once learning (HOL)” mechanism for investigating multi-modal input processing that resembles human behavior. First, we create an image/text dataset of large Latin letters composed of small letters, and another dataset composed of Arabic, Chinese, and Roman numerals. Second, we pre-train Convolutional Neural Networks (CNN) on the letter dataset to obtain structural features. Third, using the acquired knowledge, a Self-organizing Map (SOM) and Contrastive Language-Image Pretraining (CLIP) are tested separately using zero-shot learning. Siamese Networks and a Vision Transformer (ViT) are also tested using one-shot learning with knowledge transfer to identify the features of unknown characters. The results show both the potential and the challenges of realizing HOL, and represent a useful step toward the development of general agents.
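The "duality" of a large letter composed of small letters can be illustrated with a minimal text-only sketch. This is not the authors' dataset or code: the 5x5 bitmaps in `GLYPHS` and the functions `render_dual`, `read_as_text`, and `read_as_image` are hypothetical names introduced here, and the exact-match lookup merely stands in for a learned image model. The sketch shows that one input yields two different answers depending on whether it is read as text (the small characters) or as an image (the global shape).

```python
# Hypothetical 5x5 bitmaps for two large letters; '#' marks a filled cell.
# These glyphs are illustrative assumptions, not the paper's data.
GLYPHS = {
    "H": ["#...#",
          "#...#",
          "#####",
          "#...#",
          "#...#"],
    "T": ["#####",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
}


def render_dual(big: str, small: str) -> str:
    """Draw the shape of the large letter using copies of the small letter."""
    if len(small) != 1 or small.isspace():
        raise ValueError("small must be a single non-space character")
    rows = GLYPHS[big]
    return "\n".join(
        "".join(small if cell == "#" else " " for cell in row) for row in rows
    )


def read_as_text(figure: str) -> set:
    """'Text' reading: the set of non-whitespace characters that appear."""
    return {ch for ch in figure if not ch.isspace()}


def read_as_image(figure: str) -> str:
    """'Image' reading: compare the filled-cell mask with the known glyphs."""
    mask = [
        "".join("#" if ch != " " else "." for ch in row)
        for row in figure.split("\n")
    ]
    for name, rows in GLYPHS.items():
        if rows == mask:
            return name
    return "?"  # shape not recognised


if __name__ == "__main__":
    fig = render_dual("H", "s")
    print(fig)
    print("text reading :", read_as_text(fig))
    print("image reading:", read_as_image(fig))
```

Here the same figure reads as `s` at the character level and as `H` at the shape level, which is the ambiguity the abstract refers to as the duality property.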
About the journal:
The area of scalable computing has matured and reached a point where new issues and trends require a professional forum. SCPE will provide this avenue by publishing original refereed papers that address the present as well as the future of parallel and distributed computing. The journal will focus on algorithm development, implementation and execution on real-world parallel architectures, and application of parallel and distributed computing to the solution of real-life problems.