Mahak Goindani, Qiaoling Liu, Josh Chao, V. Jijkoun
Published in: 2017 IEEE International Conference on Data Mining Workshops (ICDMW), 2017-11-01. DOI: 10.1109/ICDMW.2017.30 (https://doi.org/10.1109/ICDMW.2017.30)
Employer Industry Classification Using Job Postings
In the recruitment domain, knowing the employer industry of each job is important for gaining insight into the demand in each industry. The existing system at CareerBuilder uses an employer name normalization system and an employer knowledge base to infer the employer industry of a job. However, errors may occur both when the job employer is computed and when the employer knowledge base is populated with industry attributes. Because the knowledge base is huge, detecting these errors manually is not feasible. In this paper, we therefore use machine learning techniques to detect them automatically. Based on the observation that the main jobs posted by an employer often relate to the employer's industry (e.g., truck driver jobs often correspond to employers in the transportation industry), we develop a system that classifies the industry of an employer using job posting data. We aggregate job postings from an employer and use job titles and employer names as features for predicting the employer's industry. We use two classification models, (1) Support Vector Machine (SVM) and (2) Gradient Boosted Decision Trees (GBDT), and observe that while both models perform similarly in classifying job employers that were correctly computed, GBDT is more effective than SVM in identifying job employers that were wrongly computed. We also show the utility of our system in detecting normalization errors and knowledge base errors.
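The pipeline described in the abstract (aggregate postings per employer, build features from job titles and employer names, predict an industry, and compare it with the knowledge base entry) can be illustrated with a minimal sketch. It is not the authors' implementation. To stay dependency-free, it swaps in a simple token-overlap centroid scorer in place of the SVM and GBDT models. All employer names, postings, industry labels, and function names below are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the paper does not specify one)."""
    return text.lower().split()


def employer_features(postings):
    """Aggregate (employer, job_title) pairs into one bag of tokens per employer.

    Title tokens and employer-name tokens are prefixed so they stay distinct features.
    """
    feats = defaultdict(Counter)
    for employer, title in postings:
        feats[employer].update("title=" + t for t in tokenize(title))
    for employer in feats:
        feats[employer].update("name=" + t for t in tokenize(employer))
    return dict(feats)


def train_centroids(features, labels):
    """Sum the feature counts of all labeled employers in each industry."""
    centroids = defaultdict(Counter)
    for employer, industry in labels.items():
        centroids[industry].update(features[employer])
    return dict(centroids)


def predict(centroids, feats):
    """Pick the industry whose centroid overlaps most with the employer's tokens.

    Stand-in for the SVM / GBDT classifiers named in the abstract.
    """
    def score(industry):
        centroid = centroids[industry]
        total = sum(centroid.values())
        return sum(n * centroid[tok] for tok, n in feats.items()) / total

    return max(sorted(centroids), key=score)


def flag_suspects(centroids, features, kb_labels):
    """Return employers whose knowledge base industry disagrees with the prediction."""
    return [
        employer
        for employer, kb_industry in sorted(kb_labels.items())
        if predict(centroids, features[employer]) != kb_industry
    ]


# Hypothetical job postings: (employer name, job title).
postings = [
    ("Acme Trucking", "Truck Driver"),
    ("Acme Trucking", "CDL Truck Driver"),
    ("Swift Freight", "Truck Driver"),
    ("Swift Freight", "Dispatcher"),
    ("City Hospital", "Registered Nurse"),
    ("City Hospital", "Nurse Practitioner"),
    ("Mercy Clinic", "Registered Nurse"),
    ("Roadway Haulers", "Truck Driver"),
]

# Hypothetical trusted labels used for training.
train_labels = {
    "Acme Trucking": "Transportation",
    "Swift Freight": "Transportation",
    "City Hospital": "Healthcare",
    "Mercy Clinic": "Healthcare",
}

# Hypothetical knowledge base entry to audit (deliberately wrong here).
kb_labels = {"Roadway Haulers": "Healthcare"}

features = employer_features(postings)
centroids = train_centroids(features, train_labels)
predicted = predict(centroids, features["Roadway Haulers"])
suspects = flag_suspects(centroids, features, kb_labels)
print(predicted, suspects)
```

A disagreement between the predicted industry and the knowledge base entry is only a signal that the employer deserves review. In this sketch it cannot tell whether the error came from name normalization or from the knowledge base itself.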