{"title":"ML-Driven Facial Synthesis from Spoken Words Using Conditional GANs","authors":"Vaishnavi Srivastava, Sakshi Srivastava, Sakshi Chauhan, Divyakshi Yadav","doi":"10.59256/ijire.20240501004","DOIUrl":"https://doi.org/10.59256/ijire.20240501004","url":null,"abstract":"A Human Brain may translate a person's voice to its corresponding face image even if never seen before. Training adeep learning network to do the same can be used in detecting human faces based on their voice, which may be used in findinga criminal that we only have a voice recording for. The goal in this paper is to build a Conditional Generative Adversarial Network that produces face images from human speeches which can then be recognized by a face recognition model to identifythe owner of the speech. The model was trained, and the face recognition model gave an accuracy of 80.08% in training and 56.2% in testing. Compared to the basic GAN model, this model has improved the results by about 30%. Key Word: Face image synthesis, Generative adversarial network, Face Recognition","PeriodicalId":516932,"journal":{"name":"International Journal of Innovative Research in Engineering","volume":"78 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140495923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BLYNK RFID and Retinal Lock Access System","authors":"Yoheswari S, Adhithyaram L, Gokulesh S, Harish Raj K.B, Jivithesh Harshaa R D","doi":"10.59256/ijire.20240501003","DOIUrl":"https://doi.org/10.59256/ijire.20240501003","url":null,"abstract":"The BLYNK RFID AND RETINAL LOCKACCESS SYSTEM describes a digital door lock system that uses an ESP32-CAM module, which is a budget friendly development board with a very small size camera and a micro-SD card slot. The system uses retinal recognition technology to detect the retinal of the person who wants to access the door. The AI-Thinker ESP32-CAM module takes pictures of the person and sends them to the owner via the BLYNK application installed on their mobile phone. The owner can then grant permission to access the door based on the person’s identity. When deploying your BLYNK RFID and retinal scanner project, it's important to consider scalability and maintenance. As your user base and access requirements may change over time, plan for future expansion and updates. Regularly review and update your system's firmware, libraries, and security measures to stay ahead of potential vulnerabilities and evolving best practices in access control. Monitoring and auditing your system's usage is crucial. The Blynk platform can help you gather data on access attempts and system performance, allowing you to analyze the data for any anomalies and potential security breaches. This data can be valuable for compliance, troubleshooting, and performance optimization. Key Word: retinal and RFID scanning for lock to authentic users, using an ESP32-CAM and RFID reader controlling through BLYNK.","PeriodicalId":516932,"journal":{"name":"International Journal of Innovative Research in Engineering","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140497224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Custom Voice Cloner","authors":"Usharani K, Nandha kumaran H, Nikhilesh Pranav M.S, Nithish kumar K.K, Prasanna Krishna A.S","doi":"10.59256/ijire.20240501002","DOIUrl":"https://doi.org/10.59256/ijire.20240501002","url":null,"abstract":"The Custom Voice Cloner is based on voice signal speech synthesizer. It is a technology that converts text into audible speech, simulating human speech characteristics like pitch and tone. It finds applications in virtual assistants, navigation systems, and accessibility tools. Building one in Python typically involves Text-to-Speech (TTS) libraries such as gTTS, pyttsx3, or platform-specific options for Windows and macOS, offering easy text-to-speech conversion.However, TTS libraries might lack customization and voice quality needed for advanced projects. For more sophisticated applications, custom voice synthesizers can be built using deep learning techniques like Tacotron and WaveNet. These models learn speech nuances for more natural output.Creating a custom voice synthesizer is challenging, requiring high-quality training data, machine learning expertise, and substantial computational resources. It goes beyond generating speech to convey emotions and nuances in pronunciation for natural and expressive voices. Key Word: Voice signal speech synthesizer,text-to-speech conversion, deep learning,TTS, gTTS, pyttsx3,etc.","PeriodicalId":516932,"journal":{"name":"International Journal of Innovative Research in Engineering","volume":"427 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140502508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedding Artificial Intelligence for Personal Voice Assistant Using NLP","authors":"Maria Sobana S, R. M, Rajkumar R K, Rajkumar M, Siddarthan S","doi":"10.59256/ijire.20240501001","DOIUrl":"https://doi.org/10.59256/ijire.20240501001","url":null,"abstract":"The voice assistance is an software which is able to provide a detailed response as a voice based output according to an instruction in a prompt. To seamless integration of quick responses to queries and up-to-date weather information enhances daily routines, promoting efficiency and convenience. To achieve these capabilities, technologies like NLTK, pyttsx3, and speech recognition libraries play a pivotal role. To summarize, the convergence of these tools is gradually transforming the futuristic concept of an indispensable personal assistant into an attainable reality. AI technologies have revolutionized digital assistant interactions, but as they integrate into daily life, addressing bias, ambiguity, and ethics becomes crucial. Key Word: Integration; Convergence; Futuristic; Indispensable;","PeriodicalId":516932,"journal":{"name":"International Journal of Innovative Research in Engineering","volume":"17 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140513048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}