Yifan Guo, David Brock, Alicia Lin, Tam Doan, Ali Khan, Paul Tarau
Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval, published 2022-12-16. DOI: 10.1145/3582768.3582792 (https://doi.org/10.1145/3582768.3582792)
Dependency Graphs for Summarization and Keyphrase Extraction: We present a real-time long-document summarization and keyphrase extraction algorithm that uses a unified dependency graph.
We introduce a graph-based summarization and keyphrase extraction system that takes dependency trees as input and builds a document graph from them. The document graph connects nodes containing lemmas and sentence identifiers, after dependency links have been redirected to emphasize semantically important entities. We then apply a ranking algorithm to the document graph and extract the highest-ranked sentences as the summary. At the same time, the highest-ranked lemmas are aggregated into keyphrases using their context in the dependency graph. Our algorithm specializes in long documents, including scientific, technical, legal, and medical texts.
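
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the ranking algorithm or the link-redirection rules, so the sketch assumes PageRank as the ranker, assumes every link points from a dependent lemma to its head, and assumes each lemma also links to the identifier of its sentence. The function names (`build_document_graph`, `pagerank`, `top_ranked`) and the hand-written toy parse are hypothetical, and the step that aggregates lemmas into multi-word keyphrases is omitted.

```python
from collections import defaultdict


def build_document_graph(parsed_sentences):
    """Build one graph over lemma nodes (str) and sentence-id nodes (int).

    parsed_sentences: list of sentences, each a list of
    (dependent_lemma, head_lemma) pairs taken from a dependency parse.
    Assumption: links run dependent -> head and lemma -> sentence id.
    """
    graph = defaultdict(set)
    for sent_id, edges in enumerate(parsed_sentences):
        graph.setdefault(sent_id, set())      # sentence node, no out-links
        for dependent, head in edges:
            graph[dependent].add(head)        # dependent points to its head
            graph[dependent].add(sent_id)     # lemmas point to their sentence
            graph[head].add(sent_id)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; dangling mass is spread uniformly."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v in nodes:
            for target in graph[v]:
                new_rank[target] += damping * rank[v] / len(graph[v])
        rank = new_rank
    return rank


def top_ranked(parsed_sentences, k_sentences=1, k_lemmas=3):
    """Return (summary sentence ids in document order, top lemmas)."""
    rank = pagerank(build_document_graph(parsed_sentences))
    sentences = sorted((v for v in rank if isinstance(v, int)),
                       key=rank.get, reverse=True)
    lemmas = sorted((v for v in rank if isinstance(v, str)),
                    key=rank.get, reverse=True)
    return sorted(sentences[:k_sentences]), lemmas[:k_lemmas]


# Toy, hand-written (dependent, head) pairs standing in for real parser output.
parsed = [
    [("graph", "rank"), ("sentence", "rank"), ("document", "graph")],
    [("lemma", "form"), ("keyphrase", "form"), ("rank", "lemma")],
    [("summary", "contain"), ("sentence", "contain")],
]
summary_ids, key_lemmas = top_ranked(parsed, k_sentences=2, k_lemmas=3)
print("summary sentence ids:", summary_ids)
print("top lemmas:", key_lemmas)
```

Keeping sentence identifiers and lemmas in the same graph means a single ranking pass scores both, which is what lets summary and keyphrase candidates be extracted together. In practice the `(dependent, head)` pairs would come from a dependency parser rather than being written by hand.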