Hassan Shahmohammadi, M. Dezfoulian, Muharram Mansoorizadeh
2018 8th International Conference on Computer and Knowledge Engineering (ICCKE), published 2018-10-01. DOI: 10.1109/ICCKE.2018.8566303 (https://doi.org/10.1109/ICCKE.2018.8566303)
An Extensive Comparison of Feature Extraction Methods for Paraphrase Detection
Paraphrase detection is one of the fundamental tasks in natural language processing. Designing a system that detects paraphrase pairs requires a good understanding of the available feature extraction methods, and much work has gone into extracting various types of features. Knowing which types of features are discriminative for paraphrase identification saves researchers considerable time and helps them obtain better results. In this paper we compare various feature extraction methods that need neither prior knowledge nor external resources, so they can be applied to any language. Our experiments show that methods which specify the importance of each word in a document, or which break the document down into specific parts, achieve better results than methods that try to capture the meaning of a document as a whole and treat it as a single component.
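To make the idea of a feature that "specifies the importance of each word" concrete, the following is a minimal sketch using TF-IDF weighting and cosine similarity, written with the Python standard library only. The abstract does not name the specific methods compared, so TF-IDF is an assumption chosen for illustration, not necessarily one of the paper's methods. The function names (`tokenize`, `idf_weights`, `tfidf_vector`, `cosine`, `pair_similarity`), the smoothing formula, and the toy sentences are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; needs no language-specific resources."""
    return text.lower().split()


def idf_weights(corpus):
    """Smoothed inverse document frequency for every word seen in the corpus.

    Assumed formula: log((1 + N) / (1 + df)) + 1, where N is the number of
    documents and df is the number of documents containing the word.
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Sparse vector mapping each word to term frequency times IDF.

    Words absent from the IDF table fall back to a weight of 1.0
    (an arbitrary choice made for this sketch).
    """
    counts = Counter(tokenize(text))
    return {w: c * idf.get(w, 1.0) for w, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_similarity(a, b, idf):
    """Similarity score for a candidate paraphrase pair."""
    return cosine(tfidf_vector(a, idf), tfidf_vector(b, idf))


if __name__ == "__main__":
    # Toy sentences, invented for this example.
    corpus = [
        "the cat sat on the mat",
        "a cat was sitting on the mat",
        "stocks fell sharply on monday",
    ]
    idf = idf_weights(corpus)
    print("related pair:  ", pair_similarity(corpus[0], corpus[1], idf))
    print("unrelated pair:", pair_similarity(corpus[0], corpus[2], idf))
```

In a full system, a score like this would typically be one input feature to a classifier trained on labelled paraphrase pairs, rather than being thresholded directly.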