Building custom NLP tools to annotate discourse-functional features for second language writing research: A tutorial
Masaki Eguchi, Kristopher Kyle
Research Methods in Applied Linguistics, 3(3), Article 100153. Published 2024-09-18. DOI: 10.1016/j.rmal.2024.100153
https://www.sciencedirect.com/science/article/pii/S2772766124000594
This tutorial paper describes the process of developing a custom natural language processing (NLP) model, with a particular focus on a discourse annotation task. After an overview of recent developments in NLP, the paper discusses the development of the Engagement Analyzer (Eguchi & Kyle, 2023), covering corpus annotation, the machine learning model, model training, evaluation, and dissemination. It then provides a step-by-step tutorial of this process using the spaCy Python package. The paper highlights the feasibility of developing custom NLP tools to make the annotation of context-sensitive linguistic features in L2 writing research more scalable and replicable.
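The evaluation step mentioned in the abstract can be illustrated with a minimal sketch of exact-match, span-level precision, recall, and F1. It is written in plain Python and is not taken from the paper or from the Engagement Analyzer code. The `Span` type, the `span_prf` function, the token offsets, and the labels ENTERTAIN, ATTRIBUTE, and DENY are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, NamedTuple, Tuple


class Span(NamedTuple):
    """A labeled token span (hypothetical structure, for illustration only)."""
    start: int   # index of the first token in the span
    end: int     # index one past the last token in the span
    label: str   # discourse-functional category assigned to the span


def span_prf(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over labeled spans.

    A predicted span counts as correct only if its start, end, and label
    all match a gold span. This is an assumed scoring scheme, not
    necessarily the one used in the paper.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Illustrative data: the second prediction has the wrong end boundary,
# so it matches no gold span under exact-match scoring.
gold = [Span(0, 2, "ENTERTAIN"), Span(5, 9, "ATTRIBUTE"), Span(12, 14, "DENY")]
pred = [Span(0, 2, "ENTERTAIN"), Span(5, 8, "ATTRIBUTE"), Span(12, 14, "DENY")]

precision, recall, f1 = span_prf(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Exact matching is strict: a span that is off by one token earns no credit. A project that wants to reward near misses would need a partial-overlap metric instead, which this sketch does not attempt.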