Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019): Latest Papers

Gradient constraints on the use of Estonian possessive reflexives
S. Lesage, Olivier Bonami
{"title":"Gradient constraints on the use of Estonian possessive reflexives","authors":"S. Lesage, Olivier Bonami","doi":"10.18653/v1/W19-7914","DOIUrl":"https://doi.org/10.18653/v1/W19-7914","url":null,"abstract":"We report on a corpus study of the use of reflexive vs. nonreflexive possessives in Estonian sentences headed by verbs taking an allative argument. We parsed the Estonian National Corpus using UDPipe trained with the Estonian Dependency Corpus, extracted relevant data automatically, eliminated false positives and annotated the data by hand. This allowed us to document effects of grammatical functions, word order and person on the choice of a reflexive vs. nonreflexive, using generalized linear mixed models. We hypothesize that the documented effects are due to the combined effects of grammatical relations, information structure, and ambiguity avoidance.","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131644434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
What can we learn from natural and artificial dependency trees
M. Courtin, Chunxiao Yan
{"title":"What can we learn from natural and artificial dependency trees","authors":"M. Courtin, Chunxiao Yan","doi":"10.18653/v1/W19-7915","DOIUrl":"https://doi.org/10.18653/v1/W19-7915","url":null,"abstract":"This paper makes two main contributions. First, we introduce several procedures for generating random dependency trees with constraints. Second, we compare the properties of these artificial trees with those of natural trees (i.e., trees extracted from treebanks) and analyze the relationships between these properties in natural and artificial settings, in order to find out which relationships are formally constrained and which are linguistically motivated. We take into consideration five metrics: tree length, height, maximum arity, mean dependency distance and mean flux weight, and also look into the distribution of local configurations of nodes. This analysis is based on UD treebanks (version 2.3, Nivre et al. 2018) for four languages: Chinese, English, French and Japanese.","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125809871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
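Four of the five metrics named in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the head-list encoding, the function name `tree_metrics`, the choice to exclude the root arc from mean dependency distance, and the five-word example parse are all assumptions. Mean flux weight is left out because the abstract does not define it.

```python
from collections import defaultdict

def tree_metrics(heads):
    """Compute simple metrics for one dependency tree.

    heads[i] is the 1-based position of the head of word i+1; 0 marks the root.
    (Hypothetical encoding chosen for this sketch.)
    """
    children = defaultdict(list)
    for dep, head in enumerate(heads, start=1):
        children[head].append(dep)
    root = children[0][0]  # assumes exactly one root

    def height(node):
        # Number of edges on the longest path from node down to a leaf.
        kids = children.get(node, [])
        return 0 if not kids else 1 + max(height(k) for k in kids)

    # Linear distance between each dependent and its head (root arc excluded).
    distances = [abs(h - d) for d, h in enumerate(heads, start=1) if h != 0]
    return {
        "length": len(heads),                      # number of words
        "height": height(root),
        "max_arity": max(len(children.get(w, [])) for w in range(1, len(heads) + 1)),
        "mdd": sum(distances) / len(distances),    # mean dependency distance
    }

# Illustrative parse of "John threw out the trash" (an assumed analysis).
metrics = tree_metrics([2, 0, 2, 5, 2])
```

Definitions of height and mean dependency distance vary across the literature, so the values should be read relative to the conventions stated in the comments.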
The relation between dependency distance and frequency
Xinying Chen, Kim Gerdes
{"title":"The relation between dependency distance and frequency","authors":"Xinying Chen, Kim Gerdes","doi":"10.18653/v1/W19-7909","DOIUrl":"https://doi.org/10.18653/v1/W19-7909","url":null,"abstract":"This pilot study investigates the relationship between dependency distance and frequency based on the analysis of an English dependency treebank. The preliminary result shows that there is a non-linear relation between dependency distance and frequency. This relation can be further formalized as a power-law function, which can be used to predict the distribution of dependency distance in a treebank.","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126053334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
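A power-law relation of the kind described in the abstract above, f(d) = a · d^(-b), is commonly estimated by fitting a straight line in log-log space. The sketch below assumes that "frequency" means how often each dependency distance occurs; the function name `fit_power_law` and the data are hypothetical, and the counts are synthetic rather than taken from any treebank.

```python
import math

def fit_power_law(distances, frequencies):
    """Least-squares fit of log f = log a - b * log d; returns (a, b)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic counts generated from f(d) = 1000 * d**-2 (not treebank data).
distances = [1, 2, 3, 4, 5, 6]
frequencies = [1000 * d ** -2 for d in distances]
a, b = fit_power_law(distances, frequencies)
```

Log-log regression is only one way to estimate the exponent; the paper may use a different fitting procedure.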
SyntaxFest 2019 Invited talk - Dependency distance minimization: facts, theory and predictions
R. Ferrer-i-Cancho
{"title":"SyntaxFest 2019 Invited talk - Dependency distance minimization: facts, theory and predictions","authors":"R. Ferrer-i-Cancho","doi":"10.18653/v1/W19-7901","DOIUrl":"https://doi.org/10.18653/v1/W19-7901","url":null,"abstract":"Quantitative linguistics is a branch of linguistics concerned with the study of statistical facts about languages and their explanation, aiming at constructing a general theory of language. The quantitative study of syntax has become central to this branch of linguistics. The fact that the distance between syntactically related words is smaller than expected by chance in many languages led to the formulation of a dependency distance minimization (DDm) principle. From a theoretical standpoint, DDm is in conflict with another word order principle: surprisal minimization (Sm). In single head structures, DDm predicts that the head should be put at the center of the linear arrangement, while Sm predicts that it should be put at one of the ends. In spite of the massive evidence of the action of DDm and the trendy claim that languages are optimized, attempts to quantify the degree of optimization of languages according to DDm have been rather scarce. Here we present a new optimality measure indicating that languages are optimized to a 70%. We confirm two old theoretical predictions: that the action of DDm is stronger in longer sentences and that DDm is more likely to be beaten by Sm in short sequences (resulting in an anti-DDm effect), while shedding new light on the kind of tree structures where DDm is more likely to be shadowed. Finally, we review various theoretical predictions of DDm focusing on the scarcity of crossing dependencies. We challenge the belief that formal constraints on dependency trees (e.g., projectivity or relaxed versions) are real rather than epiphenomenal. The talk is a summary of joint work with Carlos Gomez-Rodriguez, Juan Luis Esteban, Morten Christiansen, Lluis Alemany-Puig and Xinying Chen.","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117114564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
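The DDm prediction for single head structures in the abstract above can be checked with a small calculation. The sketch assumes a "single head structure" is a star tree in which one head governs every other word, and that the cost is the sum of linear distances between the head and its dependents; `star_cost` is a hypothetical name used only here.

```python
def star_cost(n, head_pos):
    """Sum of distances from a head at head_pos to every other position 1..n."""
    return sum(abs(head_pos - i) for i in range(1, n + 1))

# Cost of each possible head position in a 5-word star tree.
costs = {pos: star_cost(5, pos) for pos in range(1, 6)}
best = min(costs, key=costs.get)
# costs == {1: 10, 2: 7, 3: 6, 4: 7, 5: 10}; the minimum is at the center, position 3.
```

Placing the head at either end gives the highest cost, which is the arrangement the abstract attributes to surprisal minimization.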
Length of non-projective sentences: A pilot study using a Czech UD treebank
Ján Mačutek, Radek Čech, Jiří Milička
{"title":"Length of non-projective sentences: A pilot study using a Czech UD treebank","authors":"Ján Mačutek, Radek Čech, Jiří Milička","doi":"10.18653/v1/W19-7913","DOIUrl":"https://doi.org/10.18653/v1/W19-7913","url":null,"abstract":"Lengths (in words) of projective and non-projective sentences from a Czech UD dependency treebank are compared. It is shown that non-projective sentences are significantly longer (the same result was also obtained in this study for Arabic, Polish, Russian, and Slovak). The hyperpascal distribution, which was suggested as the model for the frequency distribution of sentence length measured in words, fits the data from both projective and non-projective sentences well; however, its parameters attain different values for the two groups. Proportions of non-projective sentences in the treebanks used are presented, together with a discussion on factors which can influence them.","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114381826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
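Splitting a treebank into projective and non-projective sentences, as the study above does, requires a projectivity test. One common formulation, sketched here under assumptions, treats a sentence as non-projective when two of its arcs cross, with the root attached to an artificial node at position 0. The head-list encoding and the name `is_projective` are hypothetical, and the paper may operationalize projectivity differently.

```python
from itertools import combinations

def is_projective(heads):
    """Return True if no two dependency arcs cross.

    heads[i] is the 1-based head of word i+1; 0 is an artificial root node
    placed before the first word (an assumed convention).
    """
    # Store each arc as (left end, right end), including the arc from node 0.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for (a, b), (c, d) in combinations(arcs, 2):
        # Two arcs cross when exactly one endpoint of one lies strictly inside the other.
        if a < c < b < d or c < a < d < b:
            return False
    return True

projective_example = is_projective([2, 0, 2, 5, 2])   # no crossing arcs
crossing_example = is_projective([3, 4, 0, 3])        # arcs 1-3 and 2-4 cross
```

The pairwise check is quadratic in sentence length, which is adequate for a sketch; faster algorithms exist for large treebanks.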
Information-theoretic locality properties of natural language
Richard Futrell
{"title":"Information-theoretic locality properties of natural language","authors":"Richard Futrell","doi":"10.18653/v1/W19-7902","DOIUrl":"https://doi.org/10.18653/v1/W19-7902","url":null,"abstract":"I present theoretical arguments and new empirical evidence for an information-theoretic principle of word order: information locality, the idea that words that strongly predict each other should be close to each other in linear order. I show that information locality can be derived under the assumption that natural language is a code that enables efficient communication while minimizing information-processing costs involved in online language comprehension, using recent psycholinguistic theories to characterize those processing costs information-theoretically. I argue that information locality subsumes and extends the previously-proposed principle of dependency length minimization (DLM), which has shown great explanatory power for predicting word order in many languages. Finally, I show corpus evidence that information locality has improved explanatory power over DLM in two domains: in predicting which dependencies will have shorter and longer lengths across 50 languages, and in predicting the preferred order of adjectives in English.","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131359762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
An explanation of the decisive role of function words in driving syntactic development
A. Ninio
{"title":"An explanation of the decisive role of function words in driving syntactic development","authors":"A. Ninio","doi":"10.18653/v1/W19-7907","DOIUrl":"https://doi.org/10.18653/v1/W19-7907","url":null,"abstract":"The early mastery of function words (FWs) better predicts children’s concurrent and subsequent syntactic development than their acquisition of content words (CWs). To understand why early mastery of an FW vocabulary confers this advantage, we tested the hypothesis that the learning of FWs involves learning their syntax to a higher degree than is the case for CWs. English-language parental (N=506) and young children’s speech samples (N=350) were taken from the CHILDES archive. We mapped the use of words of different form-classes in parental speech, comparing the words’ occurrence as single-word utterances and as the heads of two-word-long syntactically structured sentences. The distributions showed a dramatic effect of form-class: the four FW categories (subordinators, determiners, prepositions and auxiliary verbs) are used by parents almost exclusively in multiword utterances. By contrast, words in the four CW categories (verbs, nouns, adjectives and adverbs) appear both as single-word utterances and as roots of two-word sentences. Analysis of children’s talk had similar results, the proportions correlating very highly with parents’. Acquisition of FWs predicts syntactic development because they must be learned as combining words, whereas CWs can be learned as stand-alone lexemes, without mastering their syntax. 1. The research question. 1.1 FWs predict syntactic development better than CWs. Grammatical words such as determiners, auxiliary verbs and prepositions had long been considered marginal for the early stages of syntactic development. Many authorities such as Radford (1990) believed that such ‘function words’ (FWs) are acquired late by typically-developing children, and that, at the early stages of acquisition, syntactic development relies on ‘content words’ (CWs) or ‘lexical words’ (nouns, verbs, adjectives and adverbs), that carry semantic relations which can be expressed as patterned speech (Brown, 1973). In the last few years the trend has turned, as recent developmental studies have been offering some new evidence for the importance of the early mastery of FWs for syntactic development. In several studies it was found that in children acquiring various languages, the early mastery of FWs such as subordinators, auxiliary verbs, prepositions and determiners strongly predicts children’s concurrent and especially subsequent syntactic development. Kedar et al. (2006) found that 18- and 24-month-old infants acquiring English oriented faster and more accurately to a visual target following sentences in which the referential expression included determiners. They concluded that by 18 months of age, infants use their knowledge of determiners when they process sentences and establish reference. Le Normand et al. (2013) examined the speech of French-speaking children aged 2-4 years and correlated the diversity of word types in various form-classes with the children’s mean length of utterance in words (MLU). They found that the diversity of word t","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122904748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A quantitative probe into the hierarchical structure of written Chinese
Heng Chen, Haitao Liu
{"title":"A quantitative probe into the hierarchical structure of written Chinese","authors":"Heng Chen, Haitao Liu","doi":"10.18653/v1/w19-7904","DOIUrl":"https://doi.org/10.18653/v1/w19-7904","url":null,"abstract":"","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121057051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Dependency Length Minimization vs. Word Order Constraints: An Empirical Study On 55 Treebanks
Xiang Yu, Agnieszka Falenska, Jonas Kuhn
{"title":"Dependency Length Minimization vs. Word Order Constraints: An Empirical Study On 55 Treebanks","authors":"Xiang Yu, Agnieszka Falenska, Jonas Kuhn","doi":"10.18653/v1/W19-7911","DOIUrl":"https://doi.org/10.18653/v1/W19-7911","url":null,"abstract":"This paper expands on recent studies of very large treebank collections aiming to find empirical evidence for language universals, specifically for the functionally motivated Dependency Length Minimization (DLM) hypothesis. According to DLM, grammars are set up to support the expression of utterances in a way that minimizes the distance between heads and dependents. We construct several incremental baselines that lead from the random free order linearization to the real language by adding various word order constraints. We conduct detailed analyses on 55 treebanks and find that all of the constraints contribute to DLM. We show that DLM on the one hand shapes the regularity and on the other motivates the attested exceptions from canonical word order. The findings contribute to a more fine-grained, differentiated picture of the role of DLM in the interaction of competing constraints on grammar and language use. 1 Motivation and Background. The recent development of comparable dependency treebanks for a considerable number of languages across the typological spectrum (Nivre et al., 2016) has made it possible to address some long-standing hypotheses regarding a functional explanation of linguistic universals. A number of recent papers (Liu, 2008; Futrell et al., 2015, a.o.) have used evidence from treebanks across languages to address what is arguably the most prominent hypothesis of a functionally motivated universal constraint, the Dependency Length Minimization (DLM) hypothesis, which can be traced back to Behaghel (1932). Phrased as a language typological universal, the DLM hypothesis states that the evolution of languages is driven by the constraint that grammars should allow dependents to be realized as closely as possible to their heads, which is known to reduce the cognitive burden in processing (Gibson, 1998; Gibson, 2000). The Dependency Length (DL) of a sentence is defined as the sum of the distances between the head and dependent of all the dependency arcs in the sentence (see the example in Figure 1). [Figure 1: dependency tree for the sentence 'John threw out the trash sitting in the kitchen', with POS tags and the length of each arc.]","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121392244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
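The definition of Dependency Length (DL) in the entry above, the sum of head-dependent distances over all arcs of a sentence, can be sketched directly. The head-list encoding, the function name `dependency_length`, and both example parses are assumptions for illustration; they are not the tree shown in the paper's Figure 1, and the root arc is excluded by choice.

```python
def dependency_length(heads):
    """Sum of |head position - dependent position| over all non-root arcs.

    heads[i] is the 1-based head of word i+1; 0 marks the root.
    """
    return sum(abs(h - d) for d, h in enumerate(heads, start=1) if h != 0)

# "John threw out the trash": particle next to the verb (assumed parse).
dl_particle_first = dependency_length([2, 0, 2, 5, 2])
# "John threw the trash out": particle after the object (assumed parse).
dl_particle_last = dependency_length([2, 0, 4, 2, 2])
# dl_particle_first == 6 and dl_particle_last == 7
```

Under these assumed parses the order with the particle adjacent to the verb has the shorter total length, which is the kind of preference the DLM hypothesis predicts.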
A Comparative Corpus Analysis of PP Ordering in English and Chinese
Zoey Liu
{"title":"A Comparative Corpus Analysis of PP Ordering in English and Chinese","authors":"Zoey Liu","doi":"10.18653/v1/W19-7905","DOIUrl":"https://doi.org/10.18653/v1/W19-7905","url":null,"abstract":"We present a comparative analysis of PP ordering in English and (Mandarin) Chinese, two languages with distinct typological word order characteristics. Previous work on PP ordering has mainly focused on English using data of relatively small size. Here we leverage corpora of much larger scale with straightforward annotations. We use the Penn Treebank for English, which includes three corpora that cover both written and spoken domains, and the Chinese Penn Treebank for Chinese. We explore the individual effect of dependency length, the argument status of the PP (argument or adjunct) and the traditional adverbial ordering rule, Manner before Place before Time. In addition, we evaluate the predictive power of dependency length and argument status with weights estimated from logistic regression models. We show that while dependency length plays a strong role across genres for English, it only exerts a mild effect in Chinese. On the other hand, the argument status of the PP has a pronounced role in both languages, that is, there exists a strong tendency for the argument-like PP to appear closer to the head verb than the adjunct-like PP. Our work contributes empirically to the long-standing proposal in linguistic typology that crosslinguistic word ordering preference is driven by cooperating and competing principles.","PeriodicalId":196648,"journal":{"name":"Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121594592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4