Approaches to Cross-Language Retrieval of Similar Legal Documents Based on Machine Learning

Impact factor: 0.4 | JCR quartile: Q4 | Category: Information Science & Library Science
V. V. Zhebel, D. A. Devyatkin, D. V. Zubarev, I. V. Sochenkov
Journal: Scientific and Technical Information Processing
Publication date: 2024-03-05
Publication type: Journal Article
DOI: 10.3103/s0147688223050167 (https://doi.org/10.3103/s0147688223050167)
Citations: 0


Abstract

Studying global experience in legislative change and rule-making increasingly requires tools for retrieving regulatory documents written in different languages. One aspect of information identification is the retrieval of documents that are thematically similar to a given input document. In this context, cross-lingual search becomes an important task: the user of an information system specifies a reference document in one language, and the search results contain relevant documents in other languages. The article describes different approaches to this problem, from classic mediator-based methods to more modern solutions based on distributional semantics. The test collection used in the study was taken from the United Nations Digital Library, which provides legal documents in the original English together with their Russian translations.
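
To make the task concrete, the following is a minimal sketch of cross-lingual similar-document retrieval in a shared vector space, which is one common way to apply distributional semantics. It is not the authors' method. The vectors in `TOY_VECTORS`, the document ids, and the helper names `embed`, `cosine`, and `rank` are all hypothetical and chosen only for illustration; a real system would learn the embeddings from data.

```python
import math

# Hypothetical toy embeddings (assumption, not from the article): English and
# Russian words with the same meaning are hand-placed close to each other.
TOY_VECTORS = {
    "treaty": (0.90, 0.10, 0.00),
    "договор": (0.88, 0.12, 0.00),
    "court": (0.10, 0.90, 0.00),
    "суд": (0.12, 0.88, 0.00),
    "climate": (0.00, 0.10, 0.90),
    "климат": (0.00, 0.12, 0.88),
}


def embed(text):
    """Average the vectors of the known words in a text; skip unknown words."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return document ids ordered from most to least similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), doc_id) for doc_id, text in docs.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# An English reference document is matched against Russian candidates.
russian_docs = {
    "ru-1": "договор климат",
    "ru-2": "суд",
    "ru-3": "суд договор",
}
ranking = rank("climate treaty", russian_docs)
print(ranking)
```

Because the query and the candidates are compared only through their vectors, no translation step is needed at search time. A mediator-based approach would instead translate the reference document or the collection into a common language before running a monolingual search.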

Source journal
Scientific and Technical Information Processing (Information Science & Library Science)
CiteScore: 1.00
Self-citation rate: 42.90%
Articles published: 20
About the journal: Scientific and Technical Information Processing is a refereed journal that covers all aspects of the management and use of information technology in libraries and archives, information centres, and the information industry in general. Emphasis is on practical applications of new technologies and techniques for information analysis and processing.