Sentence simplification with core vocabulary
T. Maruyama, Kazuhide Yamamoto
2017 International Conference on Asian Language Processing (IALP), published 2017-12-01
DOI: 10.1109/IALP.2017.8300618 (https://doi.org/10.1109/IALP.2017.8300618)
We attempt automatic text simplification with a vocabulary restriction on the output side, using a machine translation approach based on a simplified corpus that we built. This is the first machine translation approach to simplification for Japanese, because no Japanese simplification corpus had been created before. The corpus focuses only on paraphrases at the sentence and phrase level, and this is the first time that this type of simplification has been used with such a corpus. The approach simplifies better than existing systems do. We also compared models trained with different quantities and qualities of training and development data. The results show that data with a medium S-BLEU score between the original sentence and its simple counterpart is most effective for automatic text simplification by a machine translation approach.
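The data-selection idea in the last sentence can be illustrated with a minimal sketch, under several assumptions. The abstract does not specify the S-BLEU variant, the tokenizer, or the score boundaries, so everything here is hypothetical: the function names `sentence_bleu` and `bucket_pairs`, the add-one smoothing, the choice to treat the simple sentence as the hypothesis, and the `low`/`high` thresholds. Sentences are assumed to be already tokenized (for Japanese, by a morphological analyzer of your choice).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hyp, ref, max_n=4):
    """Sentence-level BLEU with add-one smoothing on every n-gram order.

    This is one common smoothing choice, assumed here for illustration;
    it is not necessarily the S-BLEU variant used in the paper.
    """
    if not hyp or not ref:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        log_precision_sum += math.log((overlap + 1) / (total + 1))
    # Brevity penalty for hypotheses that are not longer than the reference.
    if len(hyp) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


def bucket_pairs(pairs, low=0.3, high=0.7):
    """Split (original, simple) token-list pairs into score buckets.

    The thresholds are placeholders, not values from the paper.
    """
    buckets = {"low": [], "medium": [], "high": []}
    for original, simple in pairs:
        score = sentence_bleu(simple, original)
        if score < low:
            buckets["low"].append((original, simple))
        elif score > high:
            buckets["high"].append((original, simple))
        else:
            buckets["medium"].append((original, simple))
    return buckets
```

Under this sketch, a pair whose simple side is a near copy of the original lands in the high bucket and a pair with almost no shared n-grams lands in the low bucket; the medium bucket holds pairs that share some wording but differ substantially, which is the kind of data the abstract reports as most effective.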