Information Theory and Perfect Secrecy

  1. Overview of Information Theory Concepts
    1. Entropy
    2. Conditional Entropy
    3. Chain Rule and other entropy relations
    4. Relative Entropy
    5. Mutual Information
  2. Perfect Secrecy in Information-Theoretic Terms
    1. Message and Key Entropies
    2. Perfect Secrecy in Information-Theoretic terms

Overview of Information Theory Concepts

Entropy

  • Definition. The Entropy H(X) of a discrete random variable X is defined as

    H(X)=-\sum_{x \in \mathscr{X}} \operatorname{Pr}[X=x] \log _{2} \operatorname{Pr}[X=x]

    The entropy is equivalently stated as H(X) = \mathbb{E}\left(-\log_2 \Pr[X]\right).

  • Entropy is a measure of the uncertainty of a random variable and is expressed in bits. We use the convention that 0 \log_2 0 = 0.

  • In general, we note that H(X) \le \log_2 |\mathscr{X}|.
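
  • As a quick numerical check (a minimal Python sketch; the helper entropy below is our own naming, not a library routine), a fair coin attains the bound \log_2 |\mathscr{X}| = 1 bit, while a biased coin falls strictly below it.

    import math

    def entropy(p):
        """Shannon entropy in bits of a probability vector p (with 0 log2 0 := 0)."""
        return -sum(q * math.log2(q) for q in p if q > 0)

    print(entropy([0.5, 0.5]))   # 1.0 bit: the maximum log2(2) for a binary variable
    print(entropy([0.9, 0.1]))   # about 0.469 bits: strictly below log2(2)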

Conditional Entropy

  • Definition. The Conditional Entropy H(X \mid Y) of a discrete random variable X conditioned on another random variable Y is defined as

    H(X \mid Y)=-\sum_{x \in \mathscr{X}, y \in \mathscr{Y}} \operatorname{Pr}[X=x, Y=y] \log _{2} \operatorname{Pr}[X=x \mid Y=y]

    The conditional entropy is equivalently stated as H(X \mid Y) = \mathbb{E}(-\log_2 \Pr[X \mid Y]).

  • We observe that

    H(X \mid Y)=\sum_{y \in \mathscr{Y}} \operatorname{Pr}[Y=y] H(X \mid Y=y)
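
  • Both forms can be checked numerically. The Python sketch below (with a made-up joint distribution chosen purely for illustration) computes H(X \mid Y) by the double-sum definition and as the weighted average over the outcomes of Y.

    import math
    from collections import defaultdict

    def entropy(p):
        return -sum(q * math.log2(q) for q in p if q > 0)

    # Illustrative joint distribution Pr[X=x, Y=y] on {0,1} x {0,1} (invented for this example)
    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

    p_y = defaultdict(float)
    for (x, y), p in joint.items():
        p_y[y] += p

    # Double-sum form: -sum_{x,y} Pr[x,y] log2 Pr[x|y]
    h_sum = -sum(p * math.log2(p / p_y[y]) for (x, y), p in joint.items())

    # Weighted-average form: sum_y Pr[y] H(X | Y=y)
    h_avg = sum(p_y[y] * entropy([joint[(x, y)] / p_y[y] for x in (0, 1)]) for y in (0, 1))

    print(h_sum, h_avg)   # both about 0.722 bits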

Chain Rule and other entropy relations

  • The following Chain Rule holds

    H(XY) = H(X) + H(Y \mid X) = H(Y) + H(X \mid Y)

    Here H(XY)=H(X, Y)=-\sum_{x \in \mathscr{X}, y \in \mathscr{Y}} \operatorname{Pr}[X=x, Y=y] \log _{2} \operatorname{Pr}[X=x, Y=y] denotes the entropy of the joint random variable (X, Y).

  • In general, we have H(X_{1} X_{2} \ldots X_{n})=H(X_{1})+H(X_{2} \mid X_{1})+H(X_{3} \mid X_{1} X_{2})+\ldots+H(X_{n} \mid X_{1} X_{2} \ldots X_{n-1}).

  • The uncertainty of a random variable X can never increase by knowledge of the outcome of another random variable Y, i.e. H(X \mid Y) \le H(X) with equality iff X, Y are independent.

    Combining this with the chain rule, we have H(XY) \le H(Y) + H(X) with equality iff X, Y are independent.
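
  • A small Python sketch (reusing the same illustrative joint distribution as in the previous subsection; none of the names come from a library) confirms the chain rule H(XY) = H(X) + H(Y \mid X) and the bound H(XY) \le H(X) + H(Y).

    import math

    def entropy(p):
        return -sum(q * math.log2(q) for q in p if q > 0)

    # Illustrative joint distribution Pr[X=x, Y=y] with dependent X and Y (invented for this example)
    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

    h_joint = entropy(list(joint.values()))   # H(XY)

    p_x = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1)}
    p_y = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

    # H(Y | X) as the weighted average of H(Y | X=x) over x
    h_y_given_x = sum(p_x[x] * entropy([joint[(x, y)] / p_x[x] for y in (0, 1)]) for x in (0, 1))

    print(h_joint, entropy(list(p_x.values())) + h_y_given_x)   # equal (up to rounding): the chain rule
    print(h_joint <= entropy(list(p_x.values())) + entropy(list(p_y.values())))   # True, strict here since X, Y are dependent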

Relative Entropy

  • Definition. The relative entropy D(p(X) \Vert q(X)) between two probability distributions p(X), q(X) of a random variable X is defined as

    D(p(X) \| q(X))=\sum_{x \in \mathscr{X}} p(X=x) \log _{2}\left(\frac{p(X=x)}{q(X=x)}\right)

  • The relative entropy is a measure of the distance between two probability distributions. It can be thought of as the inefficiency of assuming distribution q(X) when the correct distribution is p(X).

  • Note D(p(X) \| q(X))=\mathbb{E}_{P}\left(\log _{2} \frac{p(X)}{q(X)}\right).

  • In general D(p(X) \| q(X)) \neq D(q(X) \| p(X)).
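
  • A short Python sketch (the function kl below is our own naming, not a standard library call) evaluates both directions of the relative entropy for two made-up binary distributions and shows that they differ.

    import math

    def kl(p, q):
        """Relative entropy D(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
        return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    p = [0.5, 0.5]
    q = [0.9, 0.1]
    print(kl(p, q))   # about 0.737 bits
    print(kl(q, p))   # about 0.531 bits -- the two directions disagree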

Mutual Information

  • Definition. The Mutual Information I(X; Y) between random variables X and Y is defined as

    \begin{aligned} I(X ; Y) & =D(\operatorname{Pr}[X, Y] \| \operatorname{Pr}[X] \operatorname{Pr}[Y]) \\ & =\sum_{x \in \mathscr{X}, y \in \mathscr{Y}} \operatorname{Pr}[X=x, Y=y] \log _{2} \frac{\operatorname{Pr}[X=x, Y=y]}{\operatorname{Pr}[X=x] \operatorname{Pr}[Y=y]} \end{aligned}

  • The mutual information I(X; Y) measures the information (in bits) we receive about the random variable X when observing the outcome of the random variable Y.

    It also describes the information we receive about the random variable Y when observing the outcome of the random variable X.

  • Mutual Information is also a measure of the price for encoding (X, Y) as a pair of independent random variables, when in reality they are not.

  • The Mutual Information obeys the following properties

    \begin{array}{c} I(X ; Y)=H(X)-H(X \mid Y)=H(Y)-H(Y \mid X) \\ I(X ; Y)=H(X)+H(Y)-H(X, Y)=I(Y ; X) \\ I(X ; X)=H(X) \end{array}

    We have that I(X; Y) \ge 0 with equality if and only if X, Y are independent.

    The Conditional Mutual Information is defined as I(X ; Y \mid Z):=D(\operatorname{Pr}[X, Y \mid Z] \| \operatorname{Pr}[X \mid Z] \operatorname{Pr}[Y \mid Z]), where D(p[X \mid Z] \| q[X \mid Z]):=\sum_{z \in \mathscr{Z}} p(Z=z) \sum_{x \in \mathscr{X}} p(X=x \mid Z=z) \log _{2} \frac{p(X=x \mid Z=z)}{q(X=x \mid Z=z)} is the conditional relative entropy.
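
  • As a numerical companion (same invented joint distribution as before; nothing here comes from a library), the Python sketch below computes I(X ; Y) both from the relative-entropy definition and from the identity I(X ; Y) = H(X) + H(Y) - H(X, Y).

    import math

    def entropy(p):
        return -sum(q * math.log2(q) for q in p if q > 0)

    # Illustrative joint distribution Pr[X=x, Y=y] (invented for this example)
    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
    p_x = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1)}
    p_y = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

    # Definition: I(X;Y) = D(Pr[X,Y] || Pr[X]Pr[Y])
    i_kl = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in joint.items() if p > 0)

    # Identity: I(X;Y) = H(X) + H(Y) - H(X,Y)
    i_id = entropy(list(p_x.values())) + entropy(list(p_y.values())) - entropy(list(joint.values()))

    print(i_kl, i_id)   # both about 0.278 bits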

Perfect Secrecy in Information-Theoretic Terms

Message and Key Entropies

  • We have the key entropy

    H(K)=-\sum_{k \in \mathscr{K}} \operatorname{Pr}[K=k] \log _{2} \operatorname{Pr}[K=k]

    and the message entropy

    H(M)=-\sum_{m \in \mathscr{M}} \operatorname{Pr}[M=m] \log _{2} \operatorname{Pr}[M=m]

  • The key entropy describes the uncertainty Eve faces regarding the unknown key a priori and the message entropy describes the uncertainty regarding the transmitted message.

  • The key equivocation

    H(K \mid C)=-\sum_{k \in \mathscr{K}, c \in \mathscr{C}} \operatorname{Pr}[K=k, C=c] \log _{2} \operatorname{Pr}[K=k \mid C=c]

    and the message equivocation

    H(M \mid C)=-\sum_{m \in \mathscr{M}, c \in \mathscr{C}} \operatorname{Pr}[M=m, C=c] \log _{2} \operatorname{Pr}[M=m \mid C=c]

    describe the remaining uncertainty after Eve observes the transmitted ciphertext.
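
  • To make the equivocations concrete, here is a Python sketch of a deliberately toy (and insecure) cipher of our own invention: M uniform on {0, 1, 2}, K a uniform bit, and C = (M + K) mod 3. The message equivocation H(M \mid C) drops below H(M), while the key equivocation H(K \mid C) happens to equal H(K) here.

    import math
    from collections import defaultdict
    from itertools import product

    def entropy(p):
        return -sum(q * math.log2(q) for q in p if q > 0)

    # Toy cipher (invented for illustration): M uniform on {0,1,2}, K uniform on {0,1}, C = (M + K) mod 3
    p_mk = {(m, k): 1 / 6 for m, k in product(range(3), range(2))}

    p_c = defaultdict(float)
    p_mc = defaultdict(float)
    p_kc = defaultdict(float)
    for (m, k), p in p_mk.items():
        c = (m + k) % 3
        p_c[c] += p
        p_mc[(m, c)] += p
        p_kc[(k, c)] += p

    h_m = entropy([1 / 3] * 3)   # H(M), about 1.585 bits
    h_k = entropy([1 / 2] * 2)   # H(K) = 1 bit

    # Equivocations H(M | C) and H(K | C) via the double-sum definitions above
    h_m_given_c = -sum(p * math.log2(p / p_c[c]) for (m, c), p in p_mc.items())
    h_k_given_c = -sum(p * math.log2(p / p_c[c]) for (k, c), p in p_kc.items())

    print(h_m, h_m_given_c)   # about 1.585 vs 1.0: the ciphertext leaks information about M
    print(h_k, h_k_given_c)   # 1.0 vs 1.0: in this toy example the ciphertext alone reveals nothing about K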

Perfect Secrecy in Information-Theoretic terms

  • We have that H(K \mid C) \le H(K) and H(M \mid C) \le H(M).

    This reflects the fact that the uncertainties never increase by knowledge of the ciphertext.

  • An encryption scheme is said to have perfect secrecy if I(M; C) = 0.

  • In a system with perfect secrecy, the plaintext and the ciphertext are independent. When Eve observes the ciphertext, she obtains no information at all about the plaintext, i.e., H(M \mid C) = H(M).

  • Theorem. For an encryption scheme with perfect secrecy we have H(M) \le H(K).
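
  • As one standard example, the one-bit one-time pad (C = M XOR K with K an independent, uniform key bit) achieves perfect secrecy. The Python sketch below (with an arbitrary, made-up message distribution) checks that I(M; C) = 0, H(M \mid C) = H(M), and H(M) \le H(K), up to floating-point rounding.

    import math
    from collections import defaultdict

    def entropy(p):
        return -sum(q * math.log2(q) for q in p if q > 0)

    # One-bit one-time pad: K uniform on {0,1}, C = M XOR K; the message distribution is arbitrary
    p_m = {0: 0.7, 1: 0.3}
    p_k = {0: 0.5, 1: 0.5}

    p_c = defaultdict(float)
    p_mc = defaultdict(float)
    for m, pm in p_m.items():
        for k, pk in p_k.items():
            c = m ^ k
            p_c[c] += pm * pk
            p_mc[(m, c)] += pm * pk

    h_m = entropy(list(p_m.values()))
    h_c = entropy(list(p_c.values()))
    h_mc = entropy(list(p_mc.values()))

    i_mc = h_m + h_c - h_mc   # I(M;C) = H(M) + H(C) - H(M,C)
    h_m_given_c = h_mc - h_c  # H(M|C) = H(M,C) - H(C), by the chain rule

    print(i_mc)                                  # essentially 0: perfect secrecy
    print(h_m, h_m_given_c)                      # equal (up to rounding): the ciphertext tells Eve nothing about M
    print(h_m <= entropy(list(p_k.values())))    # True: H(M) <= H(K), as the theorem requires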

— Mar 1, 2023

Information Theory and Perfect Secrecy by Lu Meng is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at About.