Questions arising in statistical decision theory, Bayesian methods, and other areas of probability theory lead to concepts of orthogonality for a family of probability measures. In this paper we therefore sketch a generalized information theory that is helpful in considering and answering such questions. In this adapted information theory, Shannon's classical transition channels, modelled by finite stochastic matrices, are replaced by compact families of probability measures that are uniformly integrable. These channels are characterized by concepts such as information rate and capacity, and by optimal priors and the optimal mixture distribution. For practical studies we introduce an algorithm for calculating the capacity of the whole probability family, which is applicable even for general output spaces. We then explain how the algorithm works and compare its numerical cost with that of the classical Arimoto-Blahut algorithm.
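For orientation, the classical baseline mentioned in this abstract can be sketched as follows. This is a minimal version of the standard Arimoto-Blahut iteration for a finite stochastic matrix, not the generalized algorithm introduced in the paper. The function name `blahut_arimoto`, its parameters, and the choice of nats as the unit are assumptions made for illustration.

```python
import numpy as np

def blahut_arimoto(W, tol=1e-10, max_iter=10_000):
    """Classical Arimoto-Blahut iteration (illustrative sketch).

    W[i, j] is the probability of output j given input i, so each row
    of W sums to 1. Returns (capacity in nats, approximately optimal prior).
    """
    W = np.asarray(W, dtype=float)
    n_inputs = W.shape[0]
    prior = np.full(n_inputs, 1.0 / n_inputs)  # start from the uniform prior
    lower = 0.0
    for _ in range(max_iter):
        mixture = prior @ W  # output distribution induced by the prior
        # d[i] = Kullback-Leibler divergence of row i from the mixture,
        # with the convention 0 * log(0 / q) = 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(W > 0, np.log(W / mixture), 0.0)
        d = np.sum(W * log_ratio, axis=1)
        # Standard bounds: log(sum_i prior_i * exp(d_i)) <= C <= max_i d_i.
        lower = np.log(np.sum(prior * np.exp(d)))
        upper = np.max(d)
        # Multiplicative update of the prior, then renormalize.
        prior = prior * np.exp(d)
        prior /= prior.sum()
        if upper - lower < tol:
            break
    return lower, prior

# Binary symmetric channel with crossover probability 0.1.
bsc = np.array([[0.9, 0.1],
                [0.1, 0.9]])
capacity, optimal_prior = blahut_arimoto(bsc)
```

Each iteration costs on the order of (number of inputs) × (number of outputs) operations, which is why this scheme presupposes a finite output alphabet; the abstract's algorithm is stated to work for general output spaces.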

It is of basic interest to assess the quality of a statistician's decisions, based on the data produced by a statistical experiment, in the context of a given model class P of probability distributions. The statistician picks a particular distribution P and suffers a loss for not picking the 'true' distribution P'. There are several relevant loss functions, one of which is based on the relative entropy, also known as the Kullback-Leibler information distance. In this paper we prove a general 'minimax risk equals maximin (Bayes) risk' theorem for the Kullback-Leibler loss under the hypothesis of a dominated and compact family of distributions over a Polish observation space with suitably integrable densities. We also find that there is always an optimal Bayes strategy (i.e. a suitable prior) achieving the minimax value. Further, we see that every such minimax-optimal strategy leads to the same distribution P in the convex closure of the model class. Finally, we give some examples to illustrate the results and to indicate how the minimax result is reflected in the structure of least favorable priors. This paper is mainly based on parts of the author's doctoral thesis.
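In symbols, the 'minimax equals maximin' statement can be read as the following identity. The notation is an assumption made here for illustration and is not taken from the paper: $D(P'\,\|\,Q)$ denotes the Kullback-Leibler loss for choosing $Q$ when $P'$ is true, $\pi$ ranges over priors on the model class $\mathcal{P}$, and $Q_\pi$ is the mixture induced by $\pi$.

```latex
\inf_{Q}\,\sup_{P' \in \mathcal{P}} D(P' \,\|\, Q)
  \;=\;
\sup_{\pi}\,\inf_{Q} \int_{\mathcal{P}} D(P' \,\|\, Q)\,\pi(dP'),
\qquad
Q_\pi = \int_{\mathcal{P}} P'\,\pi(dP').
```

For a fixed prior $\pi$, the inner infimum on the right-hand side is attained at the mixture $Q_\pi$, which is why an optimal prior determines a single distribution in the convex closure of the model class.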

Let (Epsilon_k) be a sequence of experiments with the same finite parameter set. Suppose only that identification of the parameter is possible asymptotically. For large classes of information functionals we show that their exponential rates of convergence towards complete information coincide. As a special case we obtain the rate of the Shannon capacity of product experiments.