{"title":"半结构化文档的语义内核","authors":"S. Aseervatham, E. Viennet, Younès Bennani","doi":"10.1109/ICDM.2007.23","DOIUrl":null,"url":null,"abstract":"Natural Language Processing has emerged as an active field of research in the machine learning community. Several methods based on statistical information have been proposed. However, with the linguistic complexity of the texts, semantic-based approaches have been investigated. In this paper, we propose a Semantic Kernel for semi- structured biomedical documents. The semantic meanings of words are extracted using the UMLS framework. The kernel, with a SVM classifier, has been applied to a text categorization task on a medical corpus of free text documents. The results have shown that the Semantic Kernel outperforms the Linear Kernel and the Naive Bayes classifier. Moreover, this kernel was ranked in the top ten of the best algorithms among 44 classification methods at the 2007 CMC Medical NLP International Challenge.","PeriodicalId":233758,"journal":{"name":"Seventh IEEE International Conference on Data Mining (ICDM 2007)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"A Semantic Kernel for Semi-structured DocumentS\",\"authors\":\"S. Aseervatham, E. Viennet, Younès Bennani\",\"doi\":\"10.1109/ICDM.2007.23\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Natural Language Processing has emerged as an active field of research in the machine learning community. Several methods based on statistical information have been proposed. However, with the linguistic complexity of the texts, semantic-based approaches have been investigated. In this paper, we propose a Semantic Kernel for semi- structured biomedical documents. The semantic meanings of words are extracted using the UMLS framework. The kernel, with a SVM classifier, has been applied to a text categorization task on a medical corpus of free text documents. The results have shown that the Semantic Kernel outperforms the Linear Kernel and the Naive Bayes classifier. Moreover, this kernel was ranked in the top ten of the best algorithms among 44 classification methods at the 2007 CMC Medical NLP International Challenge.\",\"PeriodicalId\":233758,\"journal\":{\"name\":\"Seventh IEEE International Conference on Data Mining (ICDM 2007)\",\"volume\":\"19 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-10-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Seventh IEEE International Conference on Data Mining (ICDM 2007)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM.2007.23\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Seventh IEEE International Conference on Data Mining (ICDM 2007)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2007.23","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Natural Language Processing has emerged as an active field of research in the machine learning community. Several methods based on statistical information have been proposed. However, with the linguistic complexity of the texts, semantic-based approaches have been investigated. In this paper, we propose a Semantic Kernel for semi- structured biomedical documents. The semantic meanings of words are extracted using the UMLS framework. The kernel, with a SVM classifier, has been applied to a text categorization task on a medical corpus of free text documents. The results have shown that the Semantic Kernel outperforms the Linear Kernel and the Naive Bayes classifier. Moreover, this kernel was ranked in the top ten of the best algorithms among 44 classification methods at the 2007 CMC Medical NLP International Challenge.