Croatian text summarizer (CROSUM)
T. Lauc, N. Mikelić, D. Boras
27th International Conference on Information Technology Interfaces, 2005. Published 2005-06-20.
DOI: 10.1109/ITI.2005.1491199 (https://doi.org/10.1109/ITI.2005.1491199)
Citations: 2
Abstract
The paper describes automatic summarization of scientific papers written in Croatian. The goal of CROSUM is to generate extracts that have a high degree of extract-worthiness and are about the same size as the author's abstract. This preliminary research shows that extracts generated using a dictionary of lemmatized word forms do not differ much from extracts generated using a dictionary of non-lemmatized word forms. We conclude that improving text summarization for Croatian requires either a technique for identifying cue phrases from a training corpus or some other linguistic technique.
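
To make the comparison between lemmatized and non-lemmatized dictionaries concrete, the following is a minimal sketch of an extractive summarizer, not the CROSUM implementation. The abstract does not say how sentences are scored, so average word frequency is used here purely as an assumption. The names `summarize`, `target_words`, and `lemma_dict` are hypothetical, and the tiny lemma dictionary is illustrative only.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; \w is Unicode-aware, so Croatian diacritics are kept.
    return re.findall(r"\w+", text.lower())


def split_sentences(text):
    # Naive splitter (assumption): a sentence ends at '.', '!' or '?' plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, target_words, lemma_dict=None):
    """Pick high-scoring sentences until roughly target_words words are selected.

    lemma_dict maps a word form to its lemma; None means no lemmatization.
    Scoring by average term frequency is an assumption, not the paper's method.
    """
    lemma_dict = lemma_dict or {}

    def normalize(word):
        return lemma_dict.get(word, word)

    sentences = split_sentences(text)
    freqs = Counter(normalize(w) for w in tokenize(text))

    scored = []
    for idx, sentence in enumerate(sentences):
        words = tokenize(sentence)
        if not words:
            continue
        score = sum(freqs[normalize(w)] for w in words) / len(words)
        scored.append((score, idx, len(words)))

    # Highest score first; ties go to the earlier sentence.
    scored.sort(key=lambda item: (-item[0], item[1]))

    chosen, total = [], 0
    for _score, idx, length in scored:
        # Always take the best sentence; afterwards, skip any that overshoot the budget.
        if chosen and total + length > target_words:
            continue
        chosen.append(idx)
        total += length
        if total >= target_words:
            break

    # Return the extract in original document order.
    return [sentences[i] for i in sorted(chosen)]


text = "Pas brzo hoda parkom. Knjiga je dobra. Knjige su nove."
lemmas = {"knjige": "knjiga"}  # illustrative entry: plural form mapped to its lemma

print(summarize(text, target_words=4))
print(summarize(text, target_words=4, lemma_dict=lemmas))
```

In this toy input, lemmatization merges "knjiga" and "knjige" into one term, which raises the score of the sentences that contain them. Setting `target_words` to the length of the author's abstract mirrors the stated goal of producing an extract of about the same size.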