Using the 5th dimensions of human visual perception to inspire automated edge and texture segmentation: A fuzzy spatial-taxon approach

2016 IEEE 15th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC). Pub Date: 2016-08-01. DOI: 10.1109/ICCI-CC.2016.7862073

L. Barghout


Citations: 1

Abstract

With the recent stunning success of machine learning, artificially intelligent machine vision research falls (roughly) into two camps: the big data camp and the cognitive informatics camp. Big data methods use statistics to discover latent structures that emerge from the co-occurrence of relevant features when sampling over enormous quantities of data. Cognitive informatics methods design computer vision systems to mimic human cognition. Though some visual latent features that emerge from deep learning networks mimic mammalian visual detectors, the information processing mechanisms (analogous to human psychophysical mechanisms) remain, as yet, hidden within the complexity of the deep nets. Furthermore, the sampling requirements of big data systems force samples to be limited to pre-processed sets, such as SIFT (scale-invariant feature transform; Lowe 1999). Techniques such as the ones introduced in this paper provide fast, cognitively relevant methods for selecting samples and reducing the number of candidate features. The approach described in this paper lives squarely in the camp of designing computer vision A.I. to mimic human cognitive processes. I introduce a novel definition of edges, based on human hierarchical scene perception. Hierarchical scene perception views vision within the 5 dimensions of horizontal position, vertical position, depth, time, and scene abstraction level (spatial-taxon). Fuzzy inference selects candidate edge elements using the Gestalt psychology principles of good curvilinear continuation, proximity, and edge attachment. Spatial-taxon inference infers an edge outline for each level of abstraction within the scene architecture. The system was tested on 60 natural images, and the results provide edges more closely aligned with human intuition of what edges should look like. ROC plots indicate solid performance, with the majority of human subjects rating the edge detection as high quality. The inferred edges are consistent with the finding of neurons responsive to proto-object boundaries in the visual cortex.
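
The abstract does not give the membership functions or rule base used for fuzzy selection of candidate edge elements, so the following is only a minimal sketch of the general idea under stated assumptions. Each edge element is assumed to be a tuple (x, y, orientation in radians). The function names, the linear and Gaussian membership shapes, the parameters `tolerance_rad`, `scale`, and `threshold`, and the use of `min` as the fuzzy AND are all hypothetical choices for illustration, not the paper's method. Continuation is reduced here to orientation agreement, and edge attachment is omitted.

```python
import math


def continuation_membership(angle_diff_rad, tolerance_rad=math.pi / 6):
    """Degree in [0, 1] to which two edge orientations agree.

    Assumed shape: 1 when orientations match, falling linearly to 0
    at tolerance_rad. This is an illustrative choice only.
    """
    # Edge orientation is defined modulo pi, so fold the difference.
    d = abs(angle_diff_rad) % math.pi
    d = min(d, math.pi - d)
    return max(0.0, 1.0 - d / tolerance_rad)


def proximity_membership(distance, scale=5.0):
    """Degree in [0, 1] to which two elements are near (assumed Gaussian falloff)."""
    return math.exp(-((distance / scale) ** 2))


def candidate_score(elem_a, elem_b):
    """Fuzzy AND (minimum) of the continuation and proximity memberships."""
    (xa, ya, ta), (xb, yb, tb) = elem_a, elem_b
    dist = math.hypot(xb - xa, yb - ya)
    return min(continuation_membership(ta - tb), proximity_membership(dist))


def select_candidates(elements, threshold=0.5):
    """Keep elements supported by at least one other element above threshold."""
    kept = []
    for i, a in enumerate(elements):
        support = max(
            (candidate_score(a, b) for j, b in enumerate(elements) if j != i),
            default=0.0,
        )
        if support >= threshold:
            kept.append(a)
    return kept


if __name__ == "__main__":
    # Two nearby, nearly parallel elements and one distant, differently oriented one.
    elems = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.05), (50.0, 50.0, 1.5)]
    print(select_candidates(elems))
```

In this toy example the two nearby, nearly parallel elements support each other and are kept, while the distant element receives almost no support and is dropped. Taking the minimum makes the grouping cues conjunctive: an element pair must be both close and well aligned to score highly.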

