Text Recognition and Machine Learning: For Impaired Robots and Humans

Brandi S. Goddard, Nadia Gifford, Rafiq Ahmad, Mario Soriano Morales
Alberta Academic Review, vol. 3, no. 1. Published 2019-09-10. DOI: 10.29173/aar42 (https://doi.org/10.29173/aar42)

Abstract

As robots and machines become more reliable, developing tools that use their potential in manufacturing and beyond is an important step being addressed by many groups, including the LIMDA team at the University of Alberta. A common and effective means of improving machine performance is optical character recognition. Within artificial intelligence, under classification machine learning, research has focused on the benefits of convolutional neural networks (CNNs) and the improvement they provide over their parent method, neural networks. The serious flaw of plain neural networks is that they memorize images rather than learn what the images contain; CNNs counter both issues. CNNs are designed to connect the pieces of information the network receives, and so begin to mimic closely how humans learn from experience. Using the programming language Python and machine learning libraries such as TensorFlow and Keras, different versions of CNNs were tested against datasets of low-resolution images of handwritten characters. The first two CNNs were trained on the MNIST database of handwritten digits 0 through 9. The results of these tests illustrated the benefits of elements such as max-pooling and additional convolutional layers. Building on that knowledge, a final CNN was written to test the accuracy of the algorithm on alphabet characters. After training and testing were complete, the network showed an average accuracy of 99.34% and a loss-function value of 2.23%. Time-consuming training epochs that do not yield higher or more impressive results could also be eliminated. These and similar CNNs have proven to yield positive results and, in future research, could be implemented in the laboratory to improve safety. Continuing to develop this work will lead to better language translators, conversion of solid text to digital text, and aids for the visually and speech impaired.
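The abstract credits convolutional layers and max-pooling for the improvement but does not describe the networks themselves. As a rough illustration only, the sketch below shows what those two operations compute on a tiny image, in plain Python. It is not the authors' code or architecture: the function names `conv2d_valid` and `max_pool2x2`, the 5x5 toy image, and the 2x2 kernel are all assumptions made for this example. As in most CNN libraries, the "convolution" here is implemented as a sliding dot product without flipping the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical 5x5 "image": a single bright vertical stroke in the middle column.
image = [[0, 0, 1, 0, 0] for _ in range(5)]

# Hypothetical 2x2 kernel that responds to a dark-to-bright step from left to right.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)  # 4x4 map, strongest at the stroke's left edge
pooled = max_pool2x2(feature_map)          # 2x2 summary that keeps the strongest responses

print(feature_map)
print(pooled)
```

The pooled map is a quarter of the size of the feature map yet still records where the stroke's edge was detected, which is the kind of compact, position-tolerant summary that max-pooling contributes. In a trained CNN the kernel values are learned rather than written by hand.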