Feature Words of Moves in Scientific Abstracts
Kiyota Hashimoto, T. Soonklang, S. Hirokawa
2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), 2016-07-01
DOI: https://doi.org/10.1109/IIAI-AAI.2016.38

Extracting structure from text is a key problem in text mining. The rhetorical structure of moves in scientific articles is useful for supporting both reading and writing. In this paper, we classify the move structure of research article abstracts using a small number of characteristic words that identify five moves: background (B), purpose (P), method (M), result (R), and discussion (D). Eleven measures were introduced and used to select the feature words of each move. An exhaustive parameter search was conducted to find the optimal combination of measure and number of features. We applied a support vector machine and evaluated it with 10-fold cross-validation. The accuracies with optimal feature selection are 0.9022, 0.8322, 0.8442, 0.8820, and 0.8354 for B, P, M, R, and D, respectively. These are 10% better than the baseline that uses all keywords. Surprisingly, this study found that negative feature words play a central role in improving prediction performance.
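
The idea of selecting positive and negative feature words for one move can be illustrated with a minimal sketch. The abstract does not name the eleven measures, so the score used here (the difference in document frequency between sentences of the target move and all other sentences) is an assumption chosen only for illustration. The function names `score_words` and `select_features` and the toy sentences are hypothetical, and the support vector machine and cross-validation steps are not reproduced.

```python
from collections import Counter

def score_words(sentences, labels, target):
    """Score each word for one move.

    Assumed measure (not taken from the paper): the fraction of target-move
    sentences containing the word minus the fraction of other sentences
    containing it. Positive scores favour the move, negative scores oppose it.
    """
    pos_docs = [set(s.lower().split()) for s, l in zip(sentences, labels) if l == target]
    neg_docs = [set(s.lower().split()) for s, l in zip(sentences, labels) if l != target]
    pos_df = Counter(w for doc in pos_docs for w in doc)
    neg_df = Counter(w for doc in neg_docs for w in doc)
    vocab = set(pos_df) | set(neg_df)
    return {
        w: pos_df[w] / max(len(pos_docs), 1) - neg_df[w] / max(len(neg_docs), 1)
        for w in vocab
    }

def select_features(scores, n):
    """Return the n highest-scoring (positive) and n lowest-scoring (negative) words."""
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    positive = ranked[:n]
    negative = ranked[::-1][:n]
    return positive, negative

# Hypothetical toy data: two purpose (P) sentences and two result (R) sentences.
sentences = [
    "we propose a new method",
    "we propose a classifier",
    "the results show high accuracy",
    "the results show improvement",
]
labels = ["P", "P", "R", "R"]

scores = score_words(sentences, labels, target="P")
positive, negative = select_features(scores, n=3)
print("positive feature words for P:", positive)
print("negative feature words for P:", negative)
```

In a fuller pipeline, the selected words would define the feature vector given to the classifier, and the measure and `n` would be the two parameters varied in the exhaustive search.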