Efficient Protocol for Gender Identification using Machine Learning
Anupama Mishra, A. K. Daniel
EngRN: Engineering Design Process (Topic), published 2020-03-28
DOI: 10.2139/ssrn.3562888 (https://doi.org/10.2139/ssrn.3562888)
Abstract
Identifying gender from a person's name is an important task in person identification. Gender identification can draw on many attributes, such as voice and facial features. Natural Language Processing (NLP) techniques can identify gender from names easily and accurately. Such identification problems can be addressed with various classification techniques; this work treats gender as a binary classification problem. The proposed model uses Naive Bayes (NB), Decision Tree (DT), and Support Vector Machine (SVM) classifiers to identify gender. The name serves as the key input: gender is identified from features of its last character(s), including whether they are consonants or vowels. The model supports unigram, bigram, trigram, four-gram, and vowel-suffix features. Simulation results show better performance for the unigram and bigram models than for the others.
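
The suffix features named in the abstract (the last one to four characters of a name, plus whether the name ends in a vowel) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `suffix_features`, the class `SuffixNaiveBayes`, the vowel set, the add-one smoothing, and the eight toy names are all assumptions made for illustration, and the toy data says nothing about the accuracy reported in the paper.

```python
import math
from collections import Counter, defaultdict

VOWELS = set("aeiou")  # assumption: plain English vowels only


def suffix_features(name, max_n=4):
    """Suffix n-grams (n = 1..max_n) of a name plus a vowel-ending flag."""
    s = name.strip().lower()
    feats = {}
    for n in range(1, max_n + 1):
        if len(s) >= n:
            feats[f"suffix_{n}"] = s[-n:]
    feats["ends_in_vowel"] = bool(s) and s[-1] in VOWELS
    return feats


class SuffixNaiveBayes:
    """Hypothetical categorical Naive Bayes with add-one smoothing."""

    def fit(self, names, labels, max_n=2):
        self.max_n = max_n
        self.label_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.vocab = defaultdict(set)             # feature -> values seen in training
        for name, label in zip(names, labels):
            for feat, value in suffix_features(name, max_n).items():
                self.value_counts[(label, feat)][value] += 1
                self.vocab[feat].add(value)
        return self

    def predict(self, name):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            for feat, value in suffix_features(name, self.max_n).items():
                seen = self.value_counts[(label, feat)]
                # The extra +1 reserves probability mass for unseen values.
                denom = sum(seen.values()) + len(self.vocab[feat]) + 1
                score += math.log((seen[value] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data invented for this sketch; not taken from the paper.
TOY_NAMES = ["anna", "maria", "sofia", "julia", "john", "mark", "peter", "david"]
TOY_LABELS = ["F", "F", "F", "F", "M", "M", "M", "M"]

if __name__ == "__main__":
    model = SuffixNaiveBayes().fit(TOY_NAMES, TOY_LABELS, max_n=2)
    for candidate in ["olivia", "robert"]:
        print(candidate, model.predict(candidate))
```

Setting `max_n=2` restricts the sketch to unigram and bigram suffixes, the two settings the abstract reports as performing best; raising it to 3 or 4 adds the trigram and four-gram suffixes. A Decision Tree or SVM could consume the same feature dictionaries after one-hot encoding, which is omitted here.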