{"title":"Juris2vec: Building Word Embeddings from Philippine Jurisprudence","authors":"Elmer C. Peramo, C. Cheng, M. Cordel","doi":"10.1109/ICAIIC51459.2021.9415251","DOIUrl":null,"url":null,"abstract":"In this research, we trained nine word embedding models on a large corpus containing Philippine Supreme Court decisions, resolutions, and opinions from 1901 through 2020. We evaluated their performance in terms of accuracy on a customized 4,510-question word analogy test set in seven syntactic and semantic categories. Word2vec models fared better on semantic evaluators while fastText models were more impressive on syntactic evaluators. We also compared our word vector models to another trained on a large legal corpus from other countries.","PeriodicalId":432977,"journal":{"name":"2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAIIC51459.2021.9415251","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
In this research, we trained nine word embedding models on a large corpus of Philippine Supreme Court decisions, resolutions, and opinions from 1901 through 2020. We evaluated their accuracy on a customized 4,510-question word analogy test set spanning seven syntactic and semantic categories. Word2vec models fared better on the semantic categories, while fastText models performed better on the syntactic ones. We also compared our word vector models with one trained on a large legal corpus from other countries.
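The training-and-evaluation pipeline summarized in the abstract can be approximated with gensim. The sketch below is an illustration only, not the authors' code: the corpus file "jurisprudence.txt", the analogy file "legal_analogies.txt", and all hyperparameters are assumptions; the analogy file is presumed to follow the standard questions-words format (": section" headers followed by four-token analogy lines).

```python
# Minimal sketch of training word2vec and fastText embeddings on a legal
# corpus and scoring them on a custom analogy test set with gensim.
# File names and hyperparameters are hypothetical placeholders.
from gensim.models import Word2Vec, FastText
from gensim.models.word2vec import LineSentence

# One pre-tokenized sentence per line in the corpus file (assumed).
corpus = LineSentence("jurisprudence.txt")

# Skip-gram word2vec model; settings are illustrative only.
w2v = Word2Vec(sentences=corpus, vector_size=300, window=5,
               min_count=5, sg=1, workers=4, epochs=5)

# fastText model with character n-grams, which tends to help on
# syntactic analogies (consistent with the abstract's observation).
ft = FastText(sentences=corpus, vector_size=300, window=5,
              min_count=5, sg=1, workers=4, epochs=5)

# Report overall and per-category analogy accuracy, analogous to the
# paper's 4,510-question evaluation across seven categories.
for name, model in [("word2vec", w2v), ("fastText", ft)]:
    score, sections = model.wv.evaluate_word_analogies("legal_analogies.txt")
    print(f"{name}: overall analogy accuracy = {score:.3f}")
    for section in sections:
        correct = len(section["correct"])
        total = correct + len(section["incorrect"])
        if total:
            print(f"  {section['section']}: {correct}/{total}")
```

Per-category reporting is what lets the semantic-versus-syntactic comparison in the abstract be made: each section of the analogy file yields its own accuracy, so word2vec and fastText can be contrasted category by category rather than only on the aggregate score.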