Metadata Extraction Based on Mutual Information in Digital Libraries
Lizhen Liu, Guoqiang He, Xuling Shi, Hantao Song
2007 First IEEE International Symposium on Information Technologies and Applications in Education, 2007-12-26
DOI: 10.1109/ISITAE.2007.4409272 (https://doi.org/10.1109/ISITAE.2007.4409272)
As a main infrastructure of Internet-two, digital libraries have developed rapidly and achieved a great deal in recent years. A key problem, however, is how to help users find satisfactory resources more efficiently among the abundant contents of heterogeneous digital library repositories. Metadata, as structured data about data, can describe the content, semantics, and services of data. As the foundation for defining and organizing the resources in a digital library, metadata plays a pivotal role in constructing those resources. Metadata extraction, semantic retrieval, and semantic annotation are therefore challenging research tasks in automatic metadata management. Each kind of metadata can be regarded as a class, so metadata extraction amounts to classifying every document block. This paper focuses on automatic metadata extraction based on mutual information, a widely used information-theoretic measure of the stochastic dependency between discrete random variables. Metadata extraction is performed using maximum mutual information, including linear and non-linear feature transformations. Entropy is used and extended to select suitable features in digital library systems.
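As a rough illustration of the idea in the abstract, the mutual information between a discrete block feature X and the metadata class Y can be estimated from counts as I(X; Y) = sum over (x, y) of p(x, y) log2 [ p(x, y) / (p(x) p(y)) ], and candidate features can then be ranked by that score. The sketch below is not the authors' implementation: the function names, the feature names (`is_bold`, `is_long`), and the toy data are all hypothetical, and it covers only plain feature ranking, not the linear or non-linear feature transformations the paper mentions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from paired samples of two discrete variables."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(blocks, labels):
    """Rank feature names by mutual information with the label, highest first."""
    names = list(blocks[0].keys())
    scores = {
        name: mutual_information([b[name] for b in blocks], labels)
        for name in names
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: four document blocks, two binary features each.
blocks = [
    {"is_bold": 1, "is_long": 1},
    {"is_bold": 0, "is_long": 1},
    {"is_bold": 1, "is_long": 0},
    {"is_bold": 0, "is_long": 0},
]
labels = ["title", "author", "title", "author"]

ranking = rank_features(blocks, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

On this toy data `is_bold` determines the label exactly and scores 1 bit, while `is_long` is independent of the label and scores 0, so a selection rule that keeps the highest-scoring features would retain `is_bold`. Estimates from counts this small are not reliable in practice; the data is only meant to make the arithmetic easy to follow.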