Vaibhav Sharma, Benjamin Shpringer, S. Yang, M. Bolger, Sodiq Adewole, D. Brown, Erfaneh Gharavi
2019 Systems and Information Engineering Design Symposium (SIEDS), published 2019-04-01. DOI: 10.1109/SIEDS.2019.8735621 (https://doi.org/10.1109/SIEDS.2019.8735621)
Data Collection Methods for Building a Free Response Training Simulation
Most past research on serious games for simulation has focused on games with constrained, multiple-choice dialogue systems. Recent advances in natural language processing make dialogue systems based on free-input text classification more feasible, but an effective framework for collecting training data for such systems has not yet been developed. This paper presents methods for collecting and generating data to train a free-input, classification-based system. It describes several types of crowdsourcing prompts, a binary category system that increases the fidelity of the labeling to make free-input classification more effective, and a data generation algorithm built on that binary labeling system. Future work will use these crowdsourcing and generation methods to implement a free-input dialogue system in a virtual reality (VR) simulation designed for cultural competency training.
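The abstract does not spell out the binary category system or the generation algorithm, so the following is only an illustrative sketch of one way such a scheme could look, not the authors' method. It assumes each response carries one yes/no flag per category, and that new training examples are generated by joining two seed responses and taking the union of their flags. The category names, the `LabeledResponse` type, and the `label` and `generate_pairs` helpers are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical category names; the abstract does not list the real ones.
CATEGORIES = ("greeting", "apology", "question")


@dataclass(frozen=True)
class LabeledResponse:
    text: str
    labels: tuple  # one 0/1 flag per entry in CATEGORIES


def label(text, present):
    """Build a binary label vector from the set of categories present."""
    return LabeledResponse(text, tuple(int(c in present) for c in CATEGORIES))


def generate_pairs(seed):
    """Assumed generation rule: join each pair of seed responses and
    switch a category on if either member of the pair has it."""
    generated = []
    for a, b in combinations(seed, 2):
        merged = tuple(x | y for x, y in zip(a.labels, b.labels))
        generated.append(LabeledResponse(f"{a.text} {b.text}", merged))
    return generated


seed = [
    label("Hello, nice to meet you.", {"greeting"}),
    label("I am sorry for the delay.", {"apology"}),
    label("How is your family?", {"question"}),
]
generated = generate_pairs(seed)
for item in generated:
    print(item.labels, item.text)
```

Labeling each category independently, rather than assigning one class per response, is what lets a combined response inherit the flags of both of its parts in this sketch.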