Workshop on Graph-based Methods for Natural Language Processing: Latest Publications

Bipartite spectral graph partitioning to co-cluster varieties and sound correspondences in dialectology
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708129
Martijn Wieling, J. Nerbonne
Abstract: In this study we used bipartite spectral graph partitioning to simultaneously cluster varieties and sound correspondences in Dutch dialect data. While clustering geographical varieties with respect to their pronunciation is not new, the simultaneous identification of the sound correspondences giving rise to the geographical clustering presents a novel opportunity in dialectometry. Earlier methods aggregated sound differences and clustered on the basis of aggregate differences. The determination of the significant sound correspondences which co-varied with cluster membership was carried out on a post hoc basis. Bipartite spectral graph clustering simultaneously seeks groups of individual sound correspondences which are associated, even while seeking groups of sites which share sound correspondences. We show that the application of this method results in clear and sensible geographical groupings and discuss the concomitant sound correspondences.
Citations: 18
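The co-clustering idea in this abstract can be illustrated with a small sketch. It assumes a Dhillon-style bipartite spectral partitioning (degree-normalise the site-by-correspondence matrix, take the second singular vector, split on sign); the abstract does not confirm that exact formulation. The function name `cocluster`, the toy matrix, and the sign split used in place of k-means are all illustrative assumptions.

```python
import math

def cocluster(A, iterations=200):
    """Split rows and columns of a 0/1 matrix into two co-clusters.

    Hypothetical helper: rows stand for sites, columns for sound
    correspondences. Every row and column must have at least one 1.
    """
    m, n = len(A), len(A[0])
    row_deg = [sum(row) for row in A]
    col_deg = [sum(A[i][j] for i in range(m)) for j in range(n)]
    # Degree-normalised matrix An = D1^{-1/2} A D2^{-1/2}.
    An = [[A[i][j] / math.sqrt(row_deg[i] * col_deg[j]) for j in range(n)]
          for i in range(m)]
    # The leading right singular vector of An is known in closed form.
    total = sum(col_deg)
    v1 = [math.sqrt(d / total) for d in col_deg]

    def step(v):
        # One power-iteration step on An^T An with v1 projected out,
        # so the iteration converges to the second singular vector.
        u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
        w = [sum(An[i][j] * u[i] for i in range(m)) for j in range(n)]
        dot = sum(a * b for a, b in zip(w, v1))
        w = [a - dot * b for a, b in zip(w, v1)]
        norm = math.sqrt(sum(a * a for a in w))
        return [a / norm for a in w]

    v = [float(j + 1) for j in range(n)]  # arbitrary start vector
    for _ in range(iterations):
        v = step(v)
    u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
    # Rescaling by D^{-1/2} keeps signs, so split on sign directly.
    return [int(x > 0) for x in u], [int(x > 0) for x in v]

# Toy data (invented): 4 sites x 4 sound correspondences.
A = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
site_labels, sound_labels = cocluster(A)
```

Sites and sound correspondences that receive the same label belong to the same co-cluster; which group is labelled 0 or 1 is arbitrary.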
Social (distributed) language modeling, clustering and dialectometry
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708126
David Ellis
Abstract: We present ongoing work in a scalable, distributed implementation of over 200 million individual language models, each capturing a single user's dialect in a given language (multilingual users have several models). These have a variety of practical applications, ranging from spam detection to speech recognition, and dialectometrical methods on the social graph. Users should be able to view any content in their language (even if it is spoken by a small population), and to browse our site with appropriately translated interface (automatically generated, for locales with little crowd-sourced community effort).
Citations: 4
A Cohesion Graph Based Approach for Unsupervised Recognition of Literal and Non-literal Use of Multiword Expressions
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708139
Linlin Li, C. Sporleder
Abstract: We present a graph-based model for representing the lexical cohesion of a discourse. In the graph structure, vertices correspond to the content words of a text and edges connecting pairs of words encode how closely the words are related semantically. We show that such a structure can be used to distinguish literal and non-literal usages of multi-word expressions.
Citations: 10
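A cohesion graph of the kind described here can be sketched in a few lines. The decision rule below (call a usage literal when adding the expression's words does not lower the average edge weight) is one plausible reading, not something the abstract states. The relatedness scores are invented toy values, and `cohesion`, `is_literal`, and `relatedness` are hypothetical names.

```python
from itertools import combinations

# Invented pairwise relatedness scores; unlisted pairs default to 0.1.
SCORES = {
    frozenset(("ice", "frozen")): 0.9,
    frozenset(("ice", "lake")): 0.7,
    frozenset(("ice", "winter")): 0.8,
    frozenset(("frozen", "lake")): 0.5,
    frozenset(("frozen", "winter")): 0.6,
    frozenset(("lake", "winter")): 0.4,
    frozenset(("break", "ice")): 0.4,
    frozenset(("break", "frozen")): 0.5,
    frozenset(("break", "lake")): 0.3,
    frozenset(("break", "winter")): 0.2,
    frozenset(("party", "conversation")): 0.6,
    frozenset(("party", "guests")): 0.8,
    frozenset(("conversation", "guests")): 0.6,
}

def relatedness(a, b):
    """Look up the toy edge weight between two content words."""
    return SCORES.get(frozenset((a, b)), 0.1)

def cohesion(words):
    """Average edge weight of the complete graph over the given words."""
    pairs = list(combinations(words, 2))
    if not pairs:
        return 0.0
    return sum(relatedness(a, b) for a, b in pairs) / len(pairs)

def is_literal(context_words, expression_words):
    """Literal if the expression's words do not reduce graph cohesion."""
    with_expr = cohesion(context_words + expression_words)
    without_expr = cohesion(context_words)
    return with_expr >= without_expr

expression = ["break", "ice"]
literal_case = is_literal(["frozen", "lake", "winter"], expression)
figurative_case = is_literal(["party", "conversation", "guests"], expression)
```

In the first context the words "break" and "ice" fit the surrounding vocabulary, so cohesion holds up; in the second they are weakly related to everything else, which drags the average down.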
Quantitative analysis of treebanks using frequent subtree mining methods
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708140
S. Martens
Abstract: The first task of statistical computational linguistics, or any other type of data-driven processing of language, is the extraction of counts and distributions of phenomena. This is much more difficult for the type of complex structured data found in treebanks and in corpora with sophisticated annotation than for tokenized texts. Recent developments in data mining, particularly in the extraction of frequent subtrees from treebanks, offer some solutions. We have applied a modified version of the TreeMiner algorithm to a small treebank and present some promising results.
Citations: 6
Network analysis reveals structure indicative of syntax in the corpus of undeciphered Indus civilization inscriptions
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708128
S. Sinha, R. K. Pan, N. Yadav, M. Vahia, I. Mahadevan
Abstract: Archaeological excavations in the sites of the Indus Valley civilization (2500-1900 BCE) in Pakistan and northwestern India have unearthed a large number of artifacts with inscriptions made up of hundreds of distinct signs. To date, there is no generally accepted decipherment of these sign sequences, and there have been suggestions that the signs could be non-linguistic. Here we apply complex network analysis techniques on the database of available Indus inscriptions, with the aim of detecting patterns indicative of syntactic structure in this sign system. Our results show the presence of regularities, e.g., in the segmentation trees of the sequences, that suggest the existence of a grammar underlying the construction of the sequences.
Citations: 6
Random Walks for Text Semantic Similarity
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708131
Daniel Ramage, Anna N. Rafferty, Christopher D. Manning
Abstract: Many tasks in NLP stand to benefit from robust measures of semantic similarity for units above the level of individual words. Rich semantic resources such as WordNet provide local semantic information at the lexical level. However, effectively combining this information to compute scores for phrases or sentences is an open problem. Our algorithm aggregates local relatedness information via a random walk over a graph constructed from an underlying lexical resource. The stationary distribution of the graph walk forms a "semantic signature" that can be compared to another such distribution to get a relatedness score for texts. On a paraphrase recognition task, the algorithm achieves an 18.5% relative reduction in error rate over a vector-space baseline. We also show that the graph walk similarity between texts has complementary value as a feature for recognizing textual entailment, improving on a competitive baseline system.
Citations: 104
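The "semantic signature" idea can be sketched as a random walk with restart over a small word graph, with two signatures compared by cosine similarity. This is a minimal illustration under assumptions: the restart probability, the cosine comparison, the toy graph, and the names `signature` and `cosine` are not taken from the paper, whose exact walk and comparison measure the abstract does not specify.

```python
import math

# Invented toy lexical graph: word -> list of related words (undirected).
GRAPH = {
    "car": ["automobile", "vehicle"],
    "automobile": ["car", "vehicle"],
    "vehicle": ["car", "automobile", "truck"],
    "truck": ["vehicle"],
    "banana": ["fruit"],
    "fruit": ["banana", "apple"],
    "apple": ["fruit"],
}

def signature(graph, words, restart=0.15, iterations=100):
    """Approximate the stationary distribution of a walk restarting at `words`.

    Assumes at least one word is in the graph and every node has a neighbor.
    """
    nodes = list(graph)
    seeds = [w for w in words if w in graph]
    start = {n: seeds.count(n) / len(seeds) for n in nodes}
    dist = dict(start)
    for _ in range(iterations):
        # With probability `restart` jump back to the text's own words,
        # otherwise move to a uniformly chosen neighbor.
        nxt = {n: restart * start[n] for n in nodes}
        for u in nodes:
            share = (1 - restart) * dist[u] / len(graph[u])
            for v in graph[u]:
                nxt[v] += share
        dist = nxt
    return dist

def cosine(p, q):
    """Cosine similarity between two distributions over the same nodes."""
    dot = sum(p[n] * q[n] for n in p)
    norm_p = math.sqrt(sum(x * x for x in p.values()))
    norm_q = math.sqrt(sum(x * x for x in q.values()))
    return dot / (norm_p * norm_q)

car_sig = signature(GRAPH, ["car", "automobile"])
truck_sig = signature(GRAPH, ["truck"])
fruit_sig = signature(GRAPH, ["banana", "apple"])
```

Here `cosine(car_sig, truck_sig)` is positive because both walks put mass on shared neighbors such as "vehicle", even though the texts share no words, while `cosine(car_sig, fruit_sig)` is zero because the two word groups are disconnected in the toy graph.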
Classifying Japanese Polysemous Verbs based on Fuzzy C-means Clustering
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708132
Yoshimi Suzuki, Fumiyo Fukumoto
Abstract: This paper presents a method for classifying Japanese polysemous verbs using an algorithm to identify overlapping nodes with more than one cluster. The algorithm is a graph-based unsupervised clustering algorithm, which combines a generalized modularity function, spectral mapping, and fuzzy clustering technique. The modularity function for measuring cluster structure is calculated based on the frequency distributions over verb frames with selectional preferences. Evaluations are made on two sets of verbs including polysemies.
Citations: 6
Measuring semantic relatedness with vector space models and random walks
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708134
Amac Herdagdelen, K. Erk, Marco Baroni
Abstract: Both vector space models and graph random walk models can be used to determine similarity between concepts. Noting that vectors can be regarded as local views of a graph, we directly compare vector space models and graph random walk models on standard tasks of predicting human similarity ratings, concept categorization, and semantic priming, varying the size of the dataset from which vector space and graph are extracted.
Citations: 21
Graph-based Event Coreference Resolution
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708135
Zheng Chen, Heng Ji
Abstract: In this paper, we address the problem of event coreference resolution as specified in the Automatic Content Extraction (ACE) program. In contrast to entity coreference resolution, event coreference resolution has not received great attention from researchers. In this paper, we first demonstrate the diverse scenarios of event coreference by an example. We then model event coreference resolution as a spectral graph clustering problem and evaluate the clustering algorithm on ground truth event mentions using ECM F-Measure. We obtain the ECM-F scores of 0.8363 and 0.8312 respectively by using two methods for computing coreference matrices.
Citations: 103
Ranking and Semi-supervised Classification on Large Scale Graphs Using Map-Reduce
Workshop on Graph-based Methods for Natural Language Processing Pub Date: 2009-08-07 DOI: 10.3115/1708124.1708137
D. Rao, David Yarowsky
Abstract: Label Propagation, a standard algorithm for semi-supervised classification, suffers from scalability issues involving memory and computation when used with large-scale graphs from real-world datasets. In this paper we approach Label Propagation as solution to a system of linear equations which can be implemented as a scalable parallel algorithm using the map-reduce framework. In addition to semi-supervised classification, this approach to Label Propagation allows us to adapt the algorithm to make it usable for ranking on graphs and derive the theoretical connection between Label Propagation and PageRank. We provide empirical evidence to that effect using two natural language tasks -- lexical relatedness and polarity induction. The version of the Label Propagation algorithm presented here scales linearly in the size of the data with a constant main memory requirement, in contrast to the quadratic cost of both in traditional approaches.
Citations: 42
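Treating Label Propagation as a linear system means each unlabeled node's score must equal the weighted mean of its neighbors' scores, which a Jacobi-style iteration can solve one node at a time. The sketch below runs in a single process and only mimics the map and reduce phases; the function name `propagate`, the toy polarity graph, and the seed scores are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

def propagate(edges, seeds, iterations=50):
    """Jacobi-style label propagation on an undirected weighted graph.

    edges: dict mapping (u, v) -> weight; seeds: dict node -> fixed score.
    """
    # Build a symmetric adjacency list.
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = w
        adj[v][u] = w

    scores = {n: seeds.get(n, 0.0) for n in adj}
    for _ in range(iterations):
        # "Map": every node emits weight * score to each neighbor.
        inbox = defaultdict(float)
        for u, nbrs in adj.items():
            for v, w in nbrs.items():
                inbox[v] += w * scores[u]
        # "Reduce": unlabeled nodes take the weighted mean of what they
        # received; seed nodes stay clamped to their known score.
        new_scores = {}
        for n, nbrs in adj.items():
            if n in seeds:
                new_scores[n] = seeds[n]
            else:
                new_scores[n] = inbox[n] / sum(nbrs.values())
        scores = new_scores
    return scores

# Toy polarity graph (invented): a chain with one seed at each end.
edges = {("good", "nice"): 1.0, ("nice", "poor"): 1.0, ("poor", "bad"): 1.0}
seeds = {"good": 1.0, "bad": -1.0}
polarity = propagate(edges, seeds)
```

Each pass only needs a node's current score and its outgoing edges, which is what makes the update easy to distribute: per-node state is small and messages are summed by key.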