A Review of Arabic Document Analysis Methods
Hassina Bouressace
2022 4th International Conference on Pattern Analysis and Intelligent Systems (PAIS), published 2022-10-12
DOI: 10.1109/PAIS56586.2022.9946919 (https://doi.org/10.1109/PAIS56586.2022.9946919)
Arabic document analysis is essential for extracting geometrical information from the complex structures of Arabic documents, which can be either historical or modern. This information can be organized as a tree structure containing all the component levels, such as column, paragraph, word, table, figure, and article. In this paper, we analyze recent works on this topic from various perspectives: we describe the most commonly used models for physical layout detection and logical structure representation in printed documents, summarize the limitations of previous approaches, identify open challenges in this line of research, and propose new research directions for future algorithms.