Patent claim decomposition for improved information extraction

Current Challenges in Patent Information Retrieval
Pub Date: 2009-11-06
DOI: 10.1145/1651343.1651351

Peter Parapatics, M. Dittenbach

Citations: 28

Abstract

In several application domains, research in natural language processing and information extraction has produced valuable tools that help people structure, aggregate, and manage large amounts of information available as text. Patent claims are subject to a number of rigid constraints and are therefore forced into predictable structures, yet they are written in a language on which even good parsing algorithms tend to fail badly. This is primarily caused by long and complex sentences that concatenate a multitude of descriptive elements. We present an approach that splits patent claims into several parts in order to improve parsing performance for further automatic processing.
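The abstract does not describe the splitting procedure itself, so the following is only a minimal sketch of what claim decomposition can look like, not the authors' method. It assumes the common drafting convention of a preamble, a transitional phrase, and semicolon-separated elements. The function name `split_claim`, the `TRANSITIONS` list, and the example claim are all hypothetical.

```python
import re

# Assumed transitional phrases (illustrative; not taken from the paper).
TRANSITIONS = ("consisting essentially of", "consisting of", "comprising", "including")


def split_claim(claim: str) -> dict:
    """Split a claim into a preamble, a transitional phrase, and body elements."""
    text = " ".join(claim.split())              # collapse runs of whitespace
    text = re.sub(r"^\d+\s*\.\s*", "", text)    # drop a leading claim number such as "1. "

    # Find the first transitional phrase, optionally followed by a colon.
    pattern = r"\b(" + "|".join(re.escape(t) for t in TRANSITIONS) + r")\b:?"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        # No transition found: treat the whole claim as the preamble.
        return {"preamble": text.rstrip(". "), "transition": None, "elements": []}

    preamble = text[:match.start()].strip(" ,")
    body = text[match.end():]

    # Elements are assumed to be separated by semicolons, with an optional "and".
    parts = re.split(r";\s*(?:and\s+)?", body)
    elements = [p.strip().rstrip(". ").strip() for p in parts if p.strip(". ")]

    return {
        "preamble": preamble,
        "transition": match.group(1).lower(),
        "elements": elements,
    }


example = (
    "1. A widget, comprising: a base; a lever attached to the base; "
    "and a spring coupled to the lever."
)
parts = split_claim(example)
print(parts["preamble"])
print(parts["transition"])
for element in parts["elements"]:
    print("-", element)
```

Each resulting element is a much shorter text unit than the full claim, which is the property the abstract relies on: a downstream parser would process the short parts rather than one long concatenated sentence. Real claims deviate from this convention often enough that a rule set this small would need substantial extension.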
