Entity Resolution Using Logistic Regression as an Extension to the Rule-Based Oyster System
Fumiko Kobayashi, Aziz Eram, J. Talburt
2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), published 2018-04-10
DOI: 10.1109/MIPR.2018.00033 (https://doi.org/10.1109/MIPR.2018.00033)
This paper describes two experiments in entity resolution (ER). In both experiments, person references were classified as "linked" or "not linked" by two different methods. The first method used an ER system with standard "if-then" Boolean matching rules. The second method used logistic regression, a supervised machine learning technique, to perform the same classification. The objective was to compare the linking performance of the two methods and so evaluate the effectiveness of logistic regression as an extension to the existing match functions provided in the OYSTER ER System. One experiment used actual school enrollment data and the other used synthetic data. In both cases, the performance of the logistic regression classification compared favorably with the rule-based results.
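The contrast between the two methods can be illustrated with a minimal Python sketch. It is not the OYSTER implementation and does not reproduce the experiments: the toy records, the `rule_match` rule, the three similarity features, the `difflib` comparator, and the gradient-descent settings are all assumptions chosen for illustration.

```python
import math
from difflib import SequenceMatcher

# Hypothetical toy person references; NOT data from the paper.
def ref(first, last, dob):
    return {"first": first, "last": last, "dob": dob}

def sim(a, b):
    """String similarity in [0, 1] (difflib ratio, a stand-in comparator)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def rule_match(r1, r2):
    """Illustrative Boolean if-then rule: exact last name AND exact birth date."""
    return r1["last"].lower() == r2["last"].lower() and r1["dob"] == r2["dob"]

def features(r1, r2):
    """Assumed pairwise features: name similarities and a birth-date flag."""
    return [sim(r1["first"], r2["first"]),
            sim(r1["last"], r2["last"]),
            1.0 if r1["dob"] == r2["dob"] else 0.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def link_probability(w, b, r1, r2):
    """Estimated probability that two references are 'linked'."""
    x = features(r1, r2)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hand-labeled training pairs (1 = linked, 0 = not linked), invented for the sketch.
labeled_pairs = [
    (ref("Anna", "Lee", "2005-03-01"), ref("Anna", "Lee", "2005-03-01"), 1),
    (ref("Jon", "Smith", "2004-07-12"), ref("John", "Smith", "2004-07-12"), 1),
    (ref("Katherine", "Johnson", "2006-02-14"), ref("Katherine", "Jonson", "2006-02-14"), 1),
    (ref("Carlos", "Diaz", "2005-05-05"), ref("Carlos", "Diaz", "2005-05-05"), 1),
    (ref("Anna", "Lee", "2005-03-01"), ref("Robert", "Nguyen", "2004-11-30"), 0),
    (ref("John", "Smith", "2004-07-12"), ref("Jane", "Smith", "2003-02-02"), 0),
    (ref("Maria", "Garcia", "2006-01-20"), ref("Mario", "Garza", "2006-09-09"), 0),
    (ref("Carlos", "Diaz", "2005-05-05"), ref("Anna", "Lee", "2005-05-05"), 0),
]
X = [features(a, b) for a, b, _ in labeled_pairs]
y = [label for _, _, label in labeled_pairs]
weights, bias = train(X, y)

# A pair with typos in both names: the exact-match rule rejects it outright,
# while the regression returns a graded probability that can be thresholded.
typo_a = ref("Jon", "Smyth", "2004-07-12")
typo_b = ref("John", "Smith", "2004-07-12")
print("rule-based link:", rule_match(typo_a, typo_b))
print("logistic link probability:", round(link_probability(weights, bias, typo_a, typo_b), 3))
```

The design point is that a Boolean rule gives an all-or-nothing answer per pair, whereas the regression weighs several similarity signals into a single score, so the link threshold becomes a tunable parameter rather than a fixed rule condition.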