Data-derived weak universal consistency for lossless compression


Venkat Anantharam


University of California, Berkeley


Thursday, 2 June 2022, 16:00 to 17:00


  • A-201 and Zoom


Rich model classes for data may be too complex to admit uniformly consistent estimators. In such cases, it is conventional to settle for pointwise consistent estimators. But this viewpoint has the practical drawback that estimator performance is a function of the unknown model within the model class that is being estimated. Even if an estimator is consistent, how well it is doing at any given time may not be clear, regardless of the sample size.

We explore how to resolve this issue by studying model classes that may admit only pointwise consistency guarantees, yet for which enough information about the unknown model driving the observations can be inferred from the sample at hand to gauge estimator accuracy. We then say that such model classes admit data-derived weak universally consistent estimators.

In this work we flesh out this philosophy in the framework of lossless data compression over a countable alphabet. Our main contribution is to characterize the model classes that admit data-derived weak universally consistent lossless compression in terms of the presence or absence of what we term deceptive distributions (whether a distribution is deceptive is defined relative to the model class). We also show that the ability to estimate the redundancy of compressing memoryless sources is equivalent to learning the underlying single-letter marginal in a data-derived fashion.

This is joint work with Narayana Prasad Santhanam and Wojtek Szpankowski.

Bio: Venkat Anantharam is on the faculty of the EECS department at U. C. Berkeley. He received his B. Tech. (1980) in Electrical Engineering (Electronics) from IIT Madras, and his M.S. (1982) and Ph.D. (1986) in Electrical Engineering and his M.A. (1984) and C. Phil. (1985) in Mathematics from U. C. Berkeley. From 1986 to 1994 he was on the faculty of the School of EE at Cornell University. He has been with the EECS department at U. C. Berkeley since 1994. He is a Fellow of the IEEE and a Distinguished Alumnus of IIT Madras.