Production of origin destination matrix by extracting information from unstructured textual data
Authors: Mohamed Mejri, S. Turki, S. Faiz
Venue: 2014 Information and Communication Technologies Innovation and Application (ICTIA)
Published: 2014-03-01
DOI: https://doi.org/10.1109/ICTIA.2014.7883763
In this paper, we present an approach for producing origin-destination matrices by extracting information from the unstructured text of websites. The approach, which we call the “Origin Destination Matrix Extractor”, is built on three information extraction modules: an event extraction module that identifies the travel events contained in a given text; a named entity recognition module that detects the named entities corresponding to the target information; and a syntactic dependency analysis module that checks whether the extracted entities are actually linked to the detected travel events. Experiments on a set of real data show that the proposed method gives satisfactory results, with a precision of over 90%.
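To make the idea of turning text into an origin-destination matrix concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the abstract does not describe the modules in detail, so the gazetteer `PLACES`, the trigger-verb list `TRAVEL_VERBS`, and the functions `extract_trips` and `od_matrix` are all hypothetical names. A single regular expression stands in for the three modules, with a "from X to Y" pattern replacing real dependency analysis.

```python
import re
from collections import Counter

# Hypothetical gazetteer standing in for a named entity recognition module.
PLACES = {"Tunis", "Sfax", "Sousse"}

# Hypothetical trigger verbs standing in for an event extraction module.
TRAVEL_VERBS = r"(?:travel(?:s|led|ed)?|went|goes|go|flew|drove)"

# Crude stand-in for a dependency check: the origin must follow "from" and
# the destination must follow "to", in the same sentence as the travel verb
# ([^.] keeps the match from crossing a full stop).
PATTERN = re.compile(
    rf"\b{TRAVEL_VERBS}\b[^.]*?\bfrom\s+(?P<origin>[A-Z]\w+)\s+to\s+(?P<dest>[A-Z]\w+)"
)


def extract_trips(text):
    """Return (origin, destination) pairs for travel events found in text."""
    trips = []
    for match in PATTERN.finditer(text):
        origin, dest = match.group("origin"), match.group("dest")
        # Keep the pair only if both ends are known places.
        if origin in PLACES and dest in PLACES:
            trips.append((origin, dest))
    return trips


def od_matrix(trips, zones):
    """Count trips per (origin, destination); rows are origins, columns destinations."""
    counts = Counter(trips)
    return [[counts[(o, d)] for d in zones] for o in zones]


text = (
    "Ali travelled from Tunis to Sfax. "
    "Later, Sara drove from Sfax to Sousse. "
    "Another group went from Tunis to Sfax by train."
)
zones = sorted(PLACES)  # ['Sfax', 'Sousse', 'Tunis']
matrix = od_matrix(extract_trips(text), zones)

for zone, row in zip(zones, matrix):
    print(f"{zone:>7}: {row}")
```

In this sketch the matrix entry at row i, column j counts the trips from `zones[i]` to `zones[j]`. A realistic system would presumably replace the regular expression with trained event extraction, named entity recognition, and dependency parsing components.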