{"title":"Markov random fields on graphs for natural languages","authors":"J. O’Sullivan, K. Mark, M. Miller","doi":"10.1109/WITS.1994.513880","DOIUrl":"https://doi.org/10.1109/WITS.1994.513880","url":null,"abstract":"The use of model-based methods for data compression for English dates back at least to Shannon's Markov chain (n-gram) models, where the probability of the next word given all previous words equals the probability of the next word given the previous n-1 words. A second approach seeks to model the hierarchical nature of language via tree graph structures arising from a context-free language (CFL). Neither the n-gram nor the CFL model approaches the data compression predicted by the entropy of English as estimated by Shannon and by Cover and King. This paper presents two models that incorporate the benefits of both the n-gram model and the tree-based models. In either case the neighborhood structure on the syntactic variables is determined by the tree, while the neighborhood structure of the words is determined by the n-gram and the parent syntactic variable (preterminal) in the tree. Having both types of neighbors for the words should yield decreased entropy of the model and hence fewer bits per word in data compression. To motivate estimation of model parameters, some results on estimating parameters for random branching processes are reviewed.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127157496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consistency and rates of convergence of k/sub n/ nearest neighbor estimation under arbitrary sampling","authors":"S. Posner, S. R. Kulkarni","doi":"10.1109/WITS.1994.513901","DOIUrl":"https://doi.org/10.1109/WITS.1994.513901","url":null,"abstract":"Consistency and rates of convergence of the k/sub n/-NN estimator are established in the general case in which samples are chosen arbitrarily from a compact metric space.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126887135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking long-range dependencies with wavelets","authors":"P. Flandrin, P. Abry","doi":"10.1109/WITS.1994.513885","DOIUrl":"https://doi.org/10.1109/WITS.1994.513885","url":null,"abstract":"Long-range dependent processes exhibit features, such as 1/f spectra, for which wavelets offer versatile tools and provide a unifying framework. This efficiency is demonstrated on continuous processes, point processes, and filtered point processes. The fractal shot noise model is also considered.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115860351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification via compressed data","authors":"R. Ahlswede, E. Yang, Zhen Zhang","doi":"10.1109/WITS.1994.513869","DOIUrl":"https://doi.org/10.1109/WITS.1994.513869","url":null,"abstract":"A combined problem of source coding and identification is considered. To put the problem in perspective, the authors first review the traditional problem in source coding theory.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"723 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114903243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The finite-sample risk of the k-nearest-neighbor classifier under the L/sub p/ metric","authors":"R. Snapp, S. S. Venkatesh","doi":"10.1109/WITS.1994.513925","DOIUrl":"https://doi.org/10.1109/WITS.1994.513925","url":null,"abstract":"The finite-sample risk of the k-nearest neighbor classifier that uses an L/sub 2/ distance function is examined. For a family of classification problems with smooth distributions in R/sup n/, the risk can be represented as an asymptotic expansion in inverse powers of the n-th root of the reference-sample size. The leading coefficients of this expansion suggest that the Euclidean or L/sub 2/ distance function minimizes the risk for sufficiently large reference samples.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124836170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New distortion measures for speech processing","authors":"T.-H. Li, J. Gibson","doi":"10.1109/WITS.1994.513919","DOIUrl":"https://doi.org/10.1109/WITS.1994.513919","url":null,"abstract":"New distortion measures are derived from a recently proposed characterization function of stationary time series and are shown to be more robust than some commonly used distortion measures, such as the Kullback-Leibler spectral divergence, in speech processing.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124853875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the theory and application of universal classification to signal detection","authors":"N. Warke, G.C. Orsak","doi":"10.1109/WITS.1994.513908","DOIUrl":"https://doi.org/10.1109/WITS.1994.513908","url":null,"abstract":"The authors apply methods of universal classification to the problem of classifying one of M deterministic signals in the presence of dependent non-Gaussian noise.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127717343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some estimation problems in infinite dimensional Gaussian white noise","authors":"I. Ibragimov, R. Khasminskii","doi":"10.1109/WITS.1994.513872","DOIUrl":"https://doi.org/10.1109/WITS.1994.513872","url":null,"abstract":"Methods of information theory and approximation theory are used to obtain conditions for the existence of consistent estimators from observations in Gaussian white noise in a Hilbert space.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121068666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimal randomness and information theory","authors":"S. Verdú","doi":"10.1109/WITS.1994.513867","DOIUrl":"https://doi.org/10.1109/WITS.1994.513867","url":null,"abstract":"This is a tutorial survey of recent information theoretic results dealing with the minimal randomness necessary for the generation of random processes with prescribed distributions.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121144548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-parametric discriminatory power","authors":"H.J. Holz, M. Loew","doi":"10.1109/WITS.1994.513894","DOIUrl":"https://doi.org/10.1109/WITS.1994.513894","url":null,"abstract":"Discriminatory power is the relative usefulness of a feature for classification. Traditionally, feature-selection techniques have defined discriminatory power in terms of a particular classifier. Non-parametric discriminatory power allows feature selection to be based on the structure of the data rather than on the requirements of any one classifier. In previous research, we have defined a metric for non-parametric discriminatory power called relative feature importance (RFI). In this work, we explore the construction of RFI through closed-form analysis and experimentation. The behavior of RFI is also compared to that of traditional techniques.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126077531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}