The paper presents a generalized framework for defining entropy and information measures, going beyond Shannon's original formulation based on message length. The key ideas are:
Uncertainty is quantified as the reduction in optimal expected loss obtained by moving from no knowledge to full knowledge of a random variable X, for an arbitrary loss function l(x, a) defined over outcomes x of X and actions a.
This uncertainty reduction is formalized as U_{{∅, Ω} → σ(X)}(X), where the trivial σ-algebra {∅, Ω} represents no knowledge and σ(X) represents full knowledge about X.
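To make this concrete, here is a minimal numerical sketch for a discrete X with a finite grid of actions; the helper names, the example distribution, and the choice of squared-error loss are illustrative assumptions, not the paper's code.

```python
import numpy as np

def optimal_loss_no_knowledge(p_x, actions, loss):
    """min_a E[l(X, a)]: a single action must serve all outcomes."""
    return min(sum(p * loss(x, a) for x, p in p_x.items()) for a in actions)

def optimal_loss_full_knowledge(p_x, actions, loss):
    """E[min_a l(X, a)]: the action may depend on the realized X."""
    return sum(p * min(loss(x, a) for a in actions) for x, p in p_x.items())

def uncertainty_reduction(p_x, actions, loss):
    """U_{{∅, Ω} → σ(X)}(X): drop in optimal loss from no to full knowledge."""
    return (optimal_loss_no_knowledge(p_x, actions, loss)
            - optimal_loss_full_knowledge(p_x, actions, loss))

# Example with squared-error loss: actions are real-valued predictions.
p_x = {0.0: 0.5, 1.0: 0.3, 2.0: 0.2}          # made-up distribution
actions = np.linspace(0.0, 2.0, 201)           # coarse grid of candidate predictions
sq_loss = lambda x, a: (x - a) ** 2
print(uncertainty_reduction(p_x, actions, sq_loss))   # ≈ Var(X) = 0.61
```

With squared-error loss the best constant prediction is the mean, and full knowledge drives the loss to zero, so the gap is exactly the variance, previewing the examples below.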
Entropy H(X) is then defined as the uncertainty reduction from no knowledge to full knowledge. Conditional entropy H(X|σ) is the remaining reduction from partial knowledge σ to full knowledge, and information I(X; σ) is the reduction from no knowledge to the partial knowledge represented by a sub-σ-algebra σ.
This framework generalizes Shannon entropy and information, which correspond to the case where l is the log loss. Other examples include variance for squared-error loss and Bregman information for Bregman divergences.
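A hedged sketch of the log-loss instance, using the standard facts that the optimal no-knowledge action under log loss l(x, q) = −log q(x) is q = p_X and the optimal action on {Y = y} is p(x | y); the joint table below is made up for illustration and is not from the paper.

```python
import numpy as np

def shannon_entropy(p):
    p = np.asarray(p, dtype=float)
    return -np.sum(p[p > 0] * np.log(p[p > 0]))        # in nats

# Joint distribution p(x, y) on a 2x3 grid (illustrative values).
p_xy = np.array([[0.10, 0.25, 0.15],
                 [0.30, 0.05, 0.15]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

# H(X): loss-based entropy under log loss equals the Shannon entropy of p_X.
H_X = shannon_entropy(p_x)

# H(X | Y): given Y = y the optimal action is p(x | y), so the remaining
# uncertainty is the expected conditional Shannon entropy.
H_X_given_Y = sum(p_y[j] * shannon_entropy(p_xy[:, j] / p_y[j])
                  for j in range(p_xy.shape[1]))

# I(X; Y): uncertainty reduction from no knowledge to σ(Y).
print(H_X - H_X_given_Y)                                     # loss-based information
print(np.sum(p_xy * np.log(p_xy / np.outer(p_x, p_y))))      # standard KL form, same value
```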
In the continuous case, H(X) and H(X|Y) can be infinite, reflecting the ability to store arbitrary amounts of information. However, I(X; Y) and I(X; Y|Z) can still be finite, quantifying uncertainty reduction to partial knowledge.
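As a worked illustration (my own, not from the paper): for jointly standard Gaussian X and Y with correlation ρ, the log-loss entropy H(X) is infinite, since a density can be made arbitrarily concentrated once X is known, yet I(X; Y) = −½ log(1 − ρ²) stays finite. A closed-form and Monte Carlo check:

```python
import numpy as np

rho = 0.8
closed_form = -0.5 * np.log(1.0 - rho**2)

# Monte Carlo estimate of I(X; Y) = E[log p(x, y) / (p(x) p(y))] for standard Gaussians.
rng = np.random.default_rng(0)
n = 200_000
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)

# Log density ratio of the bivariate standard normal vs. the product of its marginals.
log_ratio = (-0.5 * np.log(1.0 - rho**2)
             - (x**2 - 2.0 * rho * x * y + y**2) / (2.0 * (1.0 - rho**2))
             + (x**2 + y**2) / 2.0)
print(closed_form, log_ratio.mean())   # both ≈ 0.51 nats
```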
The framework also allows incorporating uncertainty about the true distribution p_X, by considering a set Γ of candidate distributions and the uncertainty reduction from knowing only that p_X lies in Γ to knowing p_X itself.
Overall, the paper provides a unifying perspective on information-theoretic quantities, showing how they can be generalized beyond Shannon's original coding-theoretic motivation.