Sentence simplification with core vocabulary

T. Maruyama, Kazuhide Yamamoto
Published in: 2017 International Conference on Asian Language Processing (IALP), 2017-12-01
DOI: 10.1109/IALP.2017.8300618 (https://doi.org/10.1109/IALP.2017.8300618)
Citations: 8

Abstract

We attempt automatic text simplification with a vocabulary restriction on the output side, using a machine translation approach trained on a simplified corpus that we built. This is the first machine translation approach to simplification in Japanese, because no Japanese simplification corpus had been created before. The corpus focuses only on paraphrases at the sentence and phrase level, and this is the first time this type of simplification has been attempted with such a corpus. The approach simplifies better than existing systems do. We also compared models trained with different quantities and qualities of training and development data. The results show that sentence pairs with a medium S-BLEU score between the original sentence and its simplified counterpart are the most effective for automatic text simplification with a machine translation approach.
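The finding about medium S-BLEU pairs can be illustrated with a minimal sketch of how training pairs might be filtered by sentence-level BLEU. Everything here is an assumption for illustration: the add-one smoothing is one simple variant and not necessarily the S-BLEU formulation the authors used, the names `sentence_bleu` and `select_medium_pairs` are hypothetical, and the `low`/`high` thresholds are placeholders, not values from the paper. Sentences are assumed to be tokenized already (for Japanese, by a morphological analyzer).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU in [0, 1] with add-one smoothing on every n-gram order.

    Illustrative variant only; the paper's exact S-BLEU definition may differ.
    """
    if not reference or not hypothesis:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp = ngrams(hypothesis, n)
        ref = ngrams(reference, n)
        # Clipped n-gram matches between hypothesis and reference.
        overlap = sum(min(count, ref[gram]) for gram, count in hyp.items())
        total = sum(hyp.values())
        # Add-one smoothing keeps the logarithm defined when overlap is zero.
        log_precision += math.log((overlap + 1) / (total + 1)) / max_n
    # Brevity penalty for hypotheses shorter than the reference.
    brevity = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return brevity * math.exp(log_precision)


def select_medium_pairs(pairs, low=0.3, high=0.7):
    """Keep (original, simple) pairs whose score lies in [low, high].

    The thresholds are hypothetical placeholders, not values from the paper.
    """
    return [(orig, simp) for orig, simp in pairs
            if low <= sentence_bleu(orig, simp) <= high]


if __name__ == "__main__":
    original = ["a", "b", "c", "d", "e", "f"]
    pairs = [
        (original, ["a", "b", "c", "d", "e", "f"]),  # unchanged copy
        (original, ["a", "b", "c", "x", "e", "f"]),  # one word replaced
        (original, ["u", "v", "w", "x", "y", "z"]),  # fully rewritten
    ]
    for orig, simp in select_medium_pairs(pairs):
        print(" ".join(simp))
```

The intuition behind such a filter is that a pair scoring near 1 barely changes the sentence and a pair scoring near 0 shares almost nothing with it, so a middle band keeps the pairs that contain a real but still alignable paraphrase.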