I Lopez-Gazpio, J Gaviria, P García, H Sanjurjo-González, B Sanz, A Zarranz, M Maritxalar, E Agirre
Logic Journal of the IGPL, published 2024-04-06. DOI: https://doi.org/10.1093/jigpal/jzae037
PhrasIS: Phrase Inference and Similarity benchmark
We present PhrasIS, a benchmark dataset composed of naturally occurring Phrase pairs with Inference and Similarity annotations for the evaluation of semantic representations. The dataset fills the gap between word-level and sentence-level datasets, allowing compositional models to be evaluated at a finer granularity than sentences. Unlike other datasets, the phrase pairs are extracted from naturally occurring text in image captions and news headlines. All the text fragments have been annotated by experts following a rigorous process, also described in the manuscript, achieving high inter-annotator agreement. In this work we analyse the dataset, showing the relation between inference labels and similarity scores. With 10K phrase pairs split into development and test sets, the dataset is an excellent benchmark for testing meaning representation systems.
About the journal:
Logic Journal of the IGPL publishes papers in all areas of pure and applied logic, including pure logical systems, proof theory, model theory, recursion theory, type theory, nonclassical logics, nonmonotonic logic, numerical and uncertainty reasoning, logic and AI, foundations of logic programming, logic and computation, logic and language, and logic engineering.
Logic Journal of the IGPL is published under licence from Professor Dov Gabbay as owner of the journal.