Interpretable Natural Language Segmentation Based on Link Grammar

Vignav Ramesh, A. Kolonin
Published in: 2020 Science and Artificial Intelligence conference (S.A.I.ence), 2020-11-14
DOI: 10.1109/S.A.I.ence50533.2020.9303220 (https://doi.org/10.1109/S.A.I.ence50533.2020.9303220)
Citations: 3

Abstract

Natural language segmentation (NLS), or text segmentation, is the process of dividing written text into meaningful units. Sentence segmentation, a subfield of text segmentation, is the problem of dividing a string of natural language text into its component sentences. Current methods of sentence segmentation are often either hardcoded (they require manual implementation of fixed grammar and segmentation rules) or require extensive training on labeled corpora and are not explainable (they are "black box" algorithms that cannot be understood by humans). In this paper, we present a novel explainable sentence segmentation method capable of separating bodies of text into grammatically valid sentences based solely on the grammatical relationships between individual words or tokens. The proposed NLS architecture can automate the input query parsing and semantic query execution components of voice-activated question answering and information retrieval systems, and can enable automatic summarization, entity extraction, sentiment identification, and a variety of other natural language processing (NLP) algorithms that operate at the sentential level.
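
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of splitting an unpunctuated token stream into segments that a grammar accepts as complete sentences. It is not the authors' method. The names `TOY_TAGS`, `TOY_PATTERNS`, `toy_is_valid`, and `segment` are hypothetical, and the toy tag-pattern check is a stand-in assumption for a real Link Grammar parse.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical toy lexicon: a stand-in for a real grammar dictionary.
TOY_TAGS = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN", "ball": "NOUN",
    "barks": "VERB", "sleeps": "VERB", "chases": "VERB",
}

# Tag sequences this toy grammar accepts as a complete sentence.
TOY_PATTERNS = {
    ("DET", "NOUN", "VERB"),
    ("DET", "NOUN", "VERB", "DET", "NOUN"),
}


def toy_is_valid(tokens: Sequence[str]) -> bool:
    """Assumed validity check; a real system would ask a grammar parser."""
    tags = tuple(TOY_TAGS.get(token) for token in tokens)
    return tags in TOY_PATTERNS


def segment(
    tokens: Sequence[str],
    is_valid: Callable[[Sequence[str]], bool],
) -> Optional[List[List[str]]]:
    """Split tokens into the fewest consecutive segments that all pass is_valid.

    Returns None when no full segmentation exists.
    """
    n = len(tokens)
    # best[i] holds the best segmentation of tokens[i:], or None if impossible.
    best: List[Optional[List[List[str]]]] = [None] * (n + 1)
    best[n] = []
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            rest = best[j]
            if rest is None or not is_valid(tokens[i:j]):
                continue
            candidate = [list(tokens[i:j])] + rest
            current = best[i]
            if current is None or len(candidate) < len(current):
                best[i] = candidate
    return best[0]


if __name__ == "__main__":
    words = "the dog chases a cat the cat sleeps".split()
    for sentence in segment(words, toy_is_valid) or []:
        print(" ".join(sentence))
```

Because every boundary is justified by the grammar accepting the segment on each side, the reason for each split can be inspected directly, which is the sense of "explainable" this sketch tries to illustrate.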