Dynamic Lexical Features of PhD Theses across Disciplines: A Text Mining Approach
Authors: Wei Xiao, S. Sun
Journal: Journal of Quantitative Linguistics
Published: 2018-10-15
DOI: 10.1080/09296174.2018.1531618 (https://doi.org/10.1080/09296174.2018.1531618)
Abstract: This study used a text mining method to investigate the lexical features of PhD theses, and how those features change within a thesis, across the natural sciences, social sciences and humanities. Four quantitative indices (TTR, h-point, R1 and writer’s view) were applied to 150 PhD theses, 50 from each discipline. Counter-intuitively, h-point and writer’s view showed no significant variation across disciplines, whereas TTR and R1 revealed sharp contrasts between theses in the humanities and those in the natural sciences. The second half of humanities theses showed significantly higher lexical diversity, indicated by a higher TTR, while natural science theses tended to be richer in content words in the first half, indicated by a higher R1. Social science theses were more moderate, with values lying between the other two groups. The study has implications both for widening the application of quantitative linguistic methods and for the teaching and practice of academic writing, especially PhD thesis writing.
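To make two of the indices named in the abstract concrete, the following is a minimal Python sketch of TTR and the h-point computed separately for the first and second half of a text. It is not the authors' pipeline: the tokenizer, the equal-length split into halves, and the function names `tokenize`, `ttr`, `h_point` and `halves` are assumptions made for illustration, and the h-point follows the common rank-equals-frequency definition with linear interpolation. R1 and writer’s view are left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def ttr(tokens):
    """Type-token ratio: number of distinct word forms divided by number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def h_point(tokens):
    """Rank r at which the frequency f(r) equals r in the rank-frequency list.

    When no rank matches exactly, interpolate linearly between the two
    neighbouring ranks where f(r) crosses r.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    for i, f in enumerate(freqs):
        r = i + 1
        if f == r:
            return float(r)
        if f < r:
            # f(1) >= 1 always holds, so a crossing can only occur at r >= 2.
            r1, f1 = r - 1, freqs[i - 1]
            r2, f2 = r, f
            return (f1 * r2 - f2 * r1) / (r2 - r1 + f1 - f2)
    # Every frequency exceeds its rank (or the text is empty):
    # fall back to the number of types.
    return float(len(freqs))


def halves(tokens):
    """Split a token list into a first and a second half by token count."""
    mid = len(tokens) // 2
    return tokens[:mid], tokens[mid:]


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the bird near the tree"
    first, second = halves(tokenize(sample))
    for name, part in (("first half", first), ("second half", second)):
        print(name, "TTR =", round(ttr(part), 3), "h-point =", round(h_point(part), 3))
```

Note that raw TTR falls as texts get longer, so comparing halves of equal token count, as done here, is one simple way to keep the values comparable; the abstract does not say how the study handled text length.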
About the journal:
The Journal of Quantitative Linguistics is an international forum for the publication and discussion of research on the quantitative characteristics of language and text in an exact mathematical form. This approach, which is of growing interest, opens up important and exciting theoretical perspectives, as well as solutions for a wide range of practical problems such as machine learning or statistical parsing, by introducing into linguistics the methods and models of advanced scientific disciplines such as the natural sciences, economics, and psychology.