Automatic Cluster Stopping with Criterion Functions and the Gap Statistic
Ted Pedersen, Anagha Kulkarni
Proceedings of the 2006 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, companion volume: demonstrations. Published 2006-06-04. DOI: 10.3115/1225785.1225792
SenseClusters is a freely available system that clusters similar contexts. It can be applied to a wide range of problems, although here we focus on word sense and name discrimination. It supports several different measures for automatically determining the number of clusters in which a collection of contexts should be grouped. These can be used to discover the number of senses in which a word is used in a large corpus of text, or the number of entities that share the same name. There are three measures based on clustering criterion functions, and another on the Gap Statistic.
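To make the idea of automatic cluster stopping concrete, here is a minimal Python sketch of the Gap Statistic as it is generally defined: compare the log within-cluster dispersion of the data against its average over uniformly drawn reference data, and pick the smallest k whose gap is within one standard error of the next. This is not the SenseClusters implementation, and it does not cover the three criterion-function measures. The plain NumPy k-means, the function names `kmeans_wss` and `gap_statistic`, the parameter defaults, and the synthetic 2-D data are all assumptions made for illustration; SenseClusters itself works on text contexts, not on toy points like these.

```python
import numpy as np


def kmeans_wss(X, k, rng, n_init=10, n_iter=50):
    """Lowest within-cluster sum of squares over several random k-means restarts."""
    best = np.inf
    for _ in range(n_init):
        # Initialise centers from k distinct data points.
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Squared Euclidean distance from every point to every center.
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            # Recompute centers; an empty cluster keeps its previous center.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        best = min(best, d2.min(axis=1).sum())
    return best


def gap_statistic(X, k_max=6, n_refs=10, seed=0):
    """Return (chosen k, list of gap values for k = 1..k_max)."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = np.log(kmeans_wss(X, k, rng))
        # Reference data: uniform over the bounding box of the observations.
        ref_log_w = np.array([
            np.log(kmeans_wss(rng.uniform(lo, hi, size=X.shape), k, rng))
            for _ in range(n_refs)
        ])
        gaps.append(ref_log_w.mean() - log_w)
        errs.append(ref_log_w.std() * np.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s_(k+1); lists are 0-indexed.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps


# Illustrative data: three well-separated synthetic blobs in 2-D.
data_rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * data_rng.standard_normal((50, 2)) for c in blob_centers])

k, gaps = gap_statistic(X)
print("estimated number of clusters:", k)
```

On data this cleanly separated, one would expect the rule to select three clusters. Real context vectors are far noisier and higher-dimensional, so the choice of clustering algorithm, reference distribution, and number of reference samples matters much more there.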