Rachna Tewani, Achin Jain, Eshika Agarwal, Disha Mittal, A. Dubey
Journal: Fusion: Practice and Applications. DOI: https://doi.org/10.54216/fpa.040201
An efficient extraction of information from Indian Government issued documents Aadhar and Pan Card
As documents are increasingly digitized through scanning tools and photography, organizations accumulate large volumes of image data, and it becomes important to convert that data into a form they can use. Doing this manually is tedious and time-consuming. To simplify the job, we have developed a Flask API that takes a folder of images as input and returns an Excel sheet of the relevant data extracted from them. We use optical character recognition, via software such as pytesseract, to extract text from the images. We then apply natural language processing and, finally, locate the relevant data using the glob and regex modules. This model is helpful for collecting data from registration certificates, where it lets us easily store fields such as chassis number, owner name, and car number, and it can also be applied to Aadhaar cards and PAN cards.
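The final regex step described in the abstract might look roughly like the sketch below. It is not the authors' implementation: the function name `extract_fields` and both patterns are illustrative assumptions, based only on the commonly printed layouts of a PAN (five letters, four digits, one letter) and an Aadhaar number (twelve digits, usually in three groups of four). The input is assumed to be text already produced by an OCR step, for example by pytesseract.

```python
import re

# Assumed patterns, not taken from the paper.
# PAN: five uppercase letters, four digits, one uppercase letter.
PAN_PATTERN = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
# Aadhaar: twelve digits, optionally split into groups of four by a space or hyphen.
AADHAAR_PATTERN = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def extract_fields(ocr_text: str) -> dict:
    """Return the first PAN and Aadhaar number found in OCR text, or None for each."""
    pan = PAN_PATTERN.search(ocr_text)
    aadhaar = AADHAAR_PATTERN.search(ocr_text)
    return {
        "pan": pan.group(0) if pan else None,
        # Strip separators so the number is stored as twelve plain digits.
        "aadhaar": re.sub(r"[ -]", "", aadhaar.group(0)) if aadhaar else None,
    }


if __name__ == "__main__":
    # Made-up sample text standing in for OCR output.
    sample = "INCOME TAX DEPARTMENT\nPermanent Account Number\nABCDE1234F\n"
    print(extract_fields(sample))
```

Real OCR output is noisy (for example, `O` read as `0`), so a production pipeline would likely need image preprocessing or fuzzy matching before patterns this strict become reliable.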