Method to Predict Confidential Words in Japanese Judicial Precedents Using Neural Networks With Part-of-Speech Tags

Impact factor: 1.3 | JCR quartile: Q4 | Category: Telecommunications

Infocommunications Journal | Pub Date: 2020-01-01 | DOI: 10.36244/icj.2020.1.3

Masakazu Kanazawa, Atsushi Ito, Kazuyuki Yamasawa, Takehiko Kasahara, Yuya Kiryu, Fubito Toyama


Citations: 2

Abstract

Cognitive infocommunications combines informatics and telecommunications. In the future, infocommunications is expected to become more intelligent and more supportive of daily life. Privacy is one of the most critical concerns in infocommunications. Encryption is a well-recognized technology for ensuring privacy; however, completely hiding personal information is not easy. One technique for protecting privacy is to find confidential words in a file or on a website and replace them with meaningless words. In this paper, we investigate a technology for hiding confidential words in judicial precedents. In the Japanese judicial field, the details of most precedents are not made available to the public on the Japanese court web pages, in order to protect the persons involved. To ensure privacy, confidential words, such as personal names, are replaced with meaningless words. This operation takes time and effort because it is done manually, so it is desirable to predict confidential words automatically. We propose a method for predicting confidential words in Japanese judicial precedents that uses neural networks with part-of-speech (POS) tags. The method obtained an 88% accuracy improvement over a previous model. We describe the mechanism of the proposed model and report its prediction results in terms of perplexity. We then evaluate how useful the model is on actual precedents using recall and precision. The results show that the proposed model can detect confidential words in certain Japanese precedents.
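The abstract names three evaluation measures: perplexity, recall, and precision. As a reminder of how these are conventionally computed, here is a minimal sketch using their standard textbook definitions. It is not the authors' implementation; the function names, the toy probabilities, and the token positions are all hypothetical and chosen only for illustration.

```python
import math


def perplexity(token_probs):
    """Standard perplexity: exp of the mean negative log-probability
    that a model assigned to each observed token."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def precision_recall(predicted, gold):
    """Precision and recall over sets of token positions.

    predicted: positions the model flagged as confidential (hypothetical)
    gold:      positions actually masked by a human editor (hypothetical)
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy inputs, invented for this sketch (not data from the paper).
toy_probs = [0.25, 0.25, 0.25, 0.25]
toy_predicted = {1, 4, 7}
toy_gold = {1, 4, 9, 12}

ppl = perplexity(toy_probs)
prec, rec = precision_recall(toy_predicted, toy_gold)
print(f"perplexity={ppl:.3f} precision={prec:.3f} recall={rec:.3f}")
```

Lower perplexity means the model assigns higher probability to the words that actually occur. Precision penalizes words flagged unnecessarily, while recall penalizes confidential words that were missed, which is usually the costlier error when the goal is protecting privacy.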


Source journal

Infocommunications Journal (Telecommunications)

CiteScore

1.90

Self-citation rate

27.30%

Number of publications