Automatic Summarization and Keyword Extraction from Web Page or Text File
Xiangdong You
2019 IEEE 2nd International Conference on Computer and Communication Engineering Technology (CCET), 2019-08-01
DOI: 10.1109/CCET48361.2019.8989315 (https://doi.org/10.1109/CCET48361.2019.8989315)
In this paper, we study automatic summarization and keyword extraction techniques for web pages and text files. We first use the Readability algorithm to extract the main text of a web page. We then study the PageRank and TextRank algorithms and use TextRank to extract keywords, key sentences, and abstracts. We also develop a web application that processes web pages and text files: it accepts a URL, a text file, or a text paragraph as input and extracts the main content, abstract, keywords, and key sentences.
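The abstract does not give implementation details, so the following is only a minimal sketch of how TextRank-style keyword extraction might look, not the paper's actual code. It builds an undirected co-occurrence graph over words and scores nodes with a PageRank-style iteration. The function name `textrank_keywords`, the co-occurrence `window`, the `damping` value of 0.85, the iteration count, and the simple lowercase tokenizer are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def textrank_keywords(text, window=2, damping=0.85, iterations=50,
                      top_k=5, stopwords=frozenset()):
    """Rank words by a PageRank-style score on a co-occurrence graph.

    Hypothetical helper: all parameters and defaults are assumptions.
    """
    # Naive tokenizer: lowercase alphabetic runs, minus stopwords.
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in stopwords]

    # Link each word to the words that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1:i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Iterate the score update: each node receives an equal share of the
    # score of every neighbor, scaled by the damping factor.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping) + damping * sum(
                scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


if __name__ == "__main__":
    sample = "graph ranking graph scoring graph voting graph"
    print(textrank_keywords(sample, window=1, top_k=3))
```

Key-sentence extraction could follow the same pattern with sentences as nodes and a sentence-similarity measure as edge weights; that variant is not shown here.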