{"title":"使用YOLO和(Darknet-19)卷积神经网络的实时手势识别","authors":"Raad Ahmed Mohamed, Karim Q Hussein","doi":"10.11113/ijic.v13n1-2.422","DOIUrl":null,"url":null,"abstract":"There are at least three hundred and fifty million people in the world that cannot hear or speak. These are what are called deaf and dumb. Often this segment of society is partially isolated from the rest of society due to the difficulty of dealing, communicating and understanding between this segment and the rest of the healthy society. As a result of this problem, a number of solutions have been proposed that attempt to bridge this gap between this segment and the rest of society. The main reason for this is to simplify the understanding of sign language. The basic idea is building program to recognize the hand movement of the interlocutor and convert it from images to symbols or letters found in the dictionary of the deaf and dumb. This process itself follows mainly the applications of artificial intelligence, where it is important to distinguish, identify, and extract the palm of the hand from the regular images received by the camera device, and then convert this image of the movement of the paws or hands into understandable symbols. In this paper, the method of image processing and artificial intelligence, represented by the use of artificial neural networks after synthesizing the problem under research was used. Scanning the image to determine the areas of the right and left palm. Non-traditional methods that use artificial intelligence like Convolutional Neural Networks are used to fulfill this part. YOLO V-2 specifically was used in the current research with excellent results. Part Two: Building a pictorial dictionary of the letters used in teaching the deaf and dumb, after generating the image database for the dictionary, neural network Dark NET-19 were used to identify (classification) the images of characters extracted from the first part of the program. The results obtained from the research show that the use of neural networks, especially convolution neural networks, is very suitable in terms of accuracy, speed of performance, and generality in processing the previously unused input data. Many of the limitations associated with using such a program without specifying specific shapes (general shape) and templates, hand shape, hand speed, hand color and other physical expressions and without using any other physical aids were overcome through the optimal use of artificial convolution neural networks.","PeriodicalId":50314,"journal":{"name":"International Journal of Innovative Computing Information and Control","volume":null,"pages":null},"PeriodicalIF":1.3000,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Real-Time Hand Gesture Recognition Using YOLO and (Darknet-19) Convolution Neural Networks\",\"authors\":\"Raad Ahmed Mohamed, Karim Q Hussein\",\"doi\":\"10.11113/ijic.v13n1-2.422\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"There are at least three hundred and fifty million people in the world that cannot hear or speak. These are what are called deaf and dumb. Often this segment of society is partially isolated from the rest of society due to the difficulty of dealing, communicating and understanding between this segment and the rest of the healthy society. 
As a result of this problem, a number of solutions have been proposed that attempt to bridge this gap between this segment and the rest of society. The main reason for this is to simplify the understanding of sign language. The basic idea is building program to recognize the hand movement of the interlocutor and convert it from images to symbols or letters found in the dictionary of the deaf and dumb. This process itself follows mainly the applications of artificial intelligence, where it is important to distinguish, identify, and extract the palm of the hand from the regular images received by the camera device, and then convert this image of the movement of the paws or hands into understandable symbols. In this paper, the method of image processing and artificial intelligence, represented by the use of artificial neural networks after synthesizing the problem under research was used. Scanning the image to determine the areas of the right and left palm. Non-traditional methods that use artificial intelligence like Convolutional Neural Networks are used to fulfill this part. YOLO V-2 specifically was used in the current research with excellent results. Part Two: Building a pictorial dictionary of the letters used in teaching the deaf and dumb, after generating the image database for the dictionary, neural network Dark NET-19 were used to identify (classification) the images of characters extracted from the first part of the program. The results obtained from the research show that the use of neural networks, especially convolution neural networks, is very suitable in terms of accuracy, speed of performance, and generality in processing the previously unused input data. Many of the limitations associated with using such a program without specifying specific shapes (general shape) and templates, hand shape, hand speed, hand color and other physical expressions and without using any other physical aids were overcome through the optimal use of artificial convolution neural networks.\",\"PeriodicalId\":50314,\"journal\":{\"name\":\"International Journal of Innovative Computing Information and Control\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":1.3000,\"publicationDate\":\"2023-09-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Innovative Computing Information and Control\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.11113/ijic.v13n1-2.422\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Innovative Computing Information and Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.11113/ijic.v13n1-2.422","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Real-Time Hand Gesture Recognition Using YOLO and (Darknet-19) Convolution Neural Networks
Abstract:
There are at least three hundred and fifty million people in the world who cannot hear or speak, commonly referred to as deaf and mute. This segment of society is often partially isolated from the rest of the community because of the difficulty of interacting, communicating, and reaching mutual understanding with hearing people. A number of solutions have been proposed to bridge this gap, chiefly by simplifying the interpretation of sign language. The basic idea is to build a program that recognizes the hand movements of the signer and converts them from images into the symbols or letters found in the sign-language dictionary. This task belongs mainly to the field of artificial intelligence: the palm of the hand must be distinguished, identified, and extracted from the ordinary images captured by the camera, and the resulting image of the palm or hand movement must then be converted into understandable symbols. In this paper, image processing and artificial intelligence methods, represented by artificial neural networks, were applied after formulating the problem under study. The first part scans the image to locate the regions of the right and left palms; non-traditional methods based on convolutional neural networks are used for this task, and YOLOv2 in particular was employed in the current research with excellent results. The second part builds a pictorial dictionary of the letters used in teaching the deaf; after generating the image database for this dictionary, the Darknet-19 neural network was used to classify the character images extracted by the first part of the program. The results show that neural networks, especially convolutional neural networks, are well suited to this problem in terms of accuracy, speed of performance, and generalization to previously unseen input data. Many of the limitations usually associated with such systems, such as dependence on specific hand shapes, templates, hand speed, hand color, and other physical characteristics, or the need for additional physical aids, were overcome through the effective use of convolutional neural networks.
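The two-stage pipeline described in the abstract (YOLOv2 palm detection followed by Darknet-19 letter classification) can be summarized in code. The following is a minimal sketch, not the authors' implementation: it uses OpenCV's DNN module, which can load Darknet-format networks, and assumes hypothetical pretrained files ("yolov2-hand.cfg/.weights" for the palm detector, "darknet19-letters.cfg/.weights" for the letter classifier) plus a placeholder A–Z label set.

```python
# Sketch of a two-stage detect-then-classify loop (assumed file names and labels).
import cv2
import numpy as np

detector = cv2.dnn.readNetFromDarknet("yolov2-hand.cfg", "yolov2-hand.weights")
classifier = cv2.dnn.readNetFromDarknet("darknet19-letters.cfg", "darknet19-letters.weights")
letters = [chr(c) for c in range(ord("A"), ord("Z") + 1)]  # placeholder label set

def detect_hands(frame, conf_threshold=0.5):
    """YOLOv2 stage: return bounding boxes (x, y, w, h) of detected palms."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    detector.setInput(blob)
    boxes = []
    for det in detector.forward():  # each row: cx, cy, bw, bh, objectness, class scores
        if float(det[4]) < conf_threshold:
            continue
        cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
        boxes.append((int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)))
    return boxes

def classify_letter(crop):
    """Darknet-19 stage: classify a cropped hand region into a dictionary letter."""
    blob = cv2.dnn.blobFromImage(crop, 1 / 255.0, (224, 224), swapRB=True, crop=False)
    classifier.setInput(blob)
    scores = classifier.forward().flatten()
    return letters[int(np.argmax(scores))]

cap = cv2.VideoCapture(0)  # live camera feed
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for (x, y, bw, bh) in detect_hands(frame):
        crop = frame[max(y, 0):y + bh, max(x, 0):x + bw]
        if crop.size == 0:
            continue
        letter = classify_letter(crop)
        cv2.rectangle(frame, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
        cv2.putText(frame, letter, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imshow("sign-letter recognition", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```

Each webcam frame is first passed through the YOLOv2 detector to obtain palm bounding boxes, and each cropped palm is then classified by Darknet-19 into a letter, mirroring the two-part structure the paper describes.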
Journal introduction:
The primary aim of the International Journal of Innovative Computing, Information and Control (IJICIC) is to publish high-quality papers on new developments and trends, novel techniques and approaches, and innovative methodologies and technologies in the theory and applications of intelligent systems, information, and control. The IJICIC is a peer-reviewed English-language journal published bimonthly.