Conditional MI#

Mutual Information (MI) between two processes \(X\) and \(Y\) can also be conditioned on a third process \(Z\); the result is known as conditional mutual information (CMI). CMI quantifies the information shared between \(X\) and \(Y\) given knowledge of the conditioning variable \(Z\), and is written as \(I(X;Y \mid Z)\).

\[\begin{split} \begin{align} I(X;Y \mid Z) &= \sum_{x, y, z} p(z)p(x,y\mid z) \log \frac{p(x, y \mid z)}{p(x \mid z)p(y \mid z)}\\ &= \sum_{x, y, z} p(x,y,z) \log \frac{p(x,y,z)p(z)}{p(x,z)p(y,z)}\\ &= H(X \mid Z) - H(X \mid Y,Z) \end{align} \end{split}\]
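The definition can be checked numerically on a small, fully specified distribution. The following is a minimal sketch in plain Python, not the package's implementation: `cmi_from_pmf` and both example distributions are hypothetical names and data introduced here for illustration. The log ratio is averaged with positive weights, so the result is non-negative.

```python
from math import log

def cmi_from_pmf(pxyz):
    """I(X;Y|Z) in nats from a joint pmf given as {(x, y, z): probability}."""
    pz, pxz, pyz = {}, {}, {}
    for (x, y, z), p in pxyz.items():
        # Marginalise the joint pmf to p(z), p(x, z) and p(y, z).
        pz[z] = pz.get(z, 0.0) + p
        pxz[x, z] = pxz.get((x, z), 0.0) + p
        pyz[y, z] = pyz.get((y, z), 0.0) + p
    return sum(
        p * log(p * pz[z] / (pxz[x, z] * pyz[y, z]))
        for (x, y, z), p in pxyz.items()
        if p > 0
    )

# Hypothetical example 1: Z is a fair bit; X and Y are noisy copies of Z
# with independent flip probabilities, so X and Y are conditionally
# independent given Z and the CMI vanishes (up to rounding).
flip_x, flip_y = 0.1, 0.2
chain = {
    (x, y, z): 0.5
    * (flip_x if x != z else 1 - flip_x)
    * (flip_y if y != z else 1 - flip_y)
    for x in (0, 1) for y in (0, 1) for z in (0, 1)
}
cmi_chain = cmi_from_pmf(chain)

# Hypothetical example 2: Y is an exact copy of a fair bit X, and Z is an
# independent fair bit, so I(X;Y|Z) = H(X|Z) = log 2.
copy = {(x, x, z): 0.25 for x in (0, 1) for z in (0, 1)}
cmi_copy = cmi_from_pmf(copy)
```

Working from a probability table rather than from samples keeps estimation error out of the picture, so any deviation from the expected values is pure floating-point rounding.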

This package can calculate CMI with every approach that is available for Mutual Information (MI). More than two variables are also supported. In this case, CMI is defined as

\[\begin{split} \begin{align} I(X_1; X_2; \ldots; X_n \mid Z)&= \sum_{x_1, x_2, \ldots, x_n, z} p(z)p(x_1,x_2,\ldots,x_n \mid z) \log \frac{p(x_1,x_2,\ldots,x_n \mid z)}{\prod_{i=1}^n p(x_i \mid z)}\\ &=\sum_{x_1, x_2, \ldots, x_n, z} p(x_1,x_2,\ldots,x_n,z) \log \frac{p(x_1,x_2,\ldots,x_n,z)\,p(z)^{n-1}}{\prod_{i=1}^n p(x_i, z)}\\ &= - H(X_1, X_2, \ldots, X_n, Z) - (n-1) H(Z) + \sum_{i=1}^n H(X_i, Z). \end{align} \end{split}\]
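As a sanity check of the multivariate definition, the sketch below evaluates the conditional log ratio directly from counts and compares it with a combination of joint entropies. It is an illustration under stated assumptions, not the package's code: `entropy`, `multi_cmi` and the toy data are hypothetical. When each conditional \(p(x_i \mid z)\) is expanded as \(p(x_i, z)/p(z)\), the \(n\) marginals contribute \(n\) factors of \(p(z)\) and the joint removes one, so \(H(Z)\) enters the entropy form with weight \(n-1\).

```python
from collections import Counter
from itertools import product
from math import log

def entropy(samples):
    """Plug-in Shannon entropy (nats) of a sequence of hashable symbols."""
    n = len(samples)
    return -sum(c / n * log(c / n) for c in Counter(samples).values())

def multi_cmi(variables, cond):
    """Plug-in I(X_1; ...; X_n | Z) in nats from the conditional log ratio."""
    n = len(cond)
    joint = Counter(zip(*variables, cond))
    n_z = Counter(cond)
    n_xz = [Counter(zip(v, cond)) for v in variables]
    total = 0.0
    for key, count in joint.items():
        *xs, z = key
        # p(x_1..x_n | z) = count / n_z[z]; p(x_i | z) = n_xz[i][x_i, z] / n_z[z]
        ratio = count / n_z[z]
        for i, x_i in enumerate(xs):
            ratio /= n_xz[i][x_i, z] / n_z[z]
        total += count / n * log(ratio)
    return total

# Hypothetical toy data: every combination of three bits (a, b, c), with
# X1 = X2 = a, X3 = b and Z = c. Only X1 and X2 share information, which
# amounts to one bit, i.e. log 2 nats.
bits = list(product((0, 1), repeat=3))
x1 = [a for a, b, c in bits]
x2 = [a for a, b, c in bits]
x3 = [b for a, b, c in bits]
z = [c for a, b, c in bits]

direct = multi_cmi([x1, x2, x3], z)
via_entropies = (
    sum(entropy(list(zip(v, z))) for v in (x1, x2, x3))
    - entropy(list(zip(x1, x2, x3, z)))
    - (3 - 1) * entropy(z)  # H(Z) is weighted by n - 1
)
```

For two variables the weight \(n-1\) equals one, which recovers the familiar four-term entropy expression for \(I(X;Y \mid Z)\).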

Local Conditional MI#

Similar to Local Conditional H, the local or point-wise conditional MI can be defined following Fano [Fan61]:

\[\begin{split} \begin{align} i(x; y \mid z) &= \log_b \frac{p(x \mid y, z)}{p(x \mid z)}\\ &= h(x \mid z) - h(x \mid y, z) \end{align} \end{split}\]

The conditional MI can be calculated as the expected value of its local counterparts [Liz14a]:

\[ I(X; Y \mid Z) = \langle i(x; y \mid z) \rangle. \]
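The relation between local values and their average can be illustrated with a plug-in (frequency-count) estimate in plain Python. This is a sketch and not the package's implementation: `local_cmi` is a hypothetical helper, the logarithm is natural (nats), and the three sequences are the same short integer series used in the estimation examples on this page.

```python
from collections import Counter
from math import log

def local_cmi(x, y, z):
    """Plug-in local values i(x; y | z) in nats, one per sample."""
    n_xyz, n_xz = Counter(zip(x, y, z)), Counter(zip(x, z))
    n_yz, n_z = Counter(zip(y, z)), Counter(z)
    # p(x|y,z) / p(x|z) = n(x,y,z) n(z) / (n(x,z) n(y,z)); the sample size cancels.
    return [
        log(n_xyz[a, b, c] * n_z[c] / (n_xz[a, c] * n_yz[b, c]))
        for a, b, c in zip(x, y, z)
    ]

x = [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0]
y = [1, 1, 0, 0, 2, 2, 1, 1, 0, 2, 0, 0, 2, 0, 0]
z = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

local = local_cmi(x, y, z)
cmi = sum(local) / len(local)  # expected value of the local terms
```

Individual local values can be negative (the first sample gives about -0.223 nats), while their mean, about 0.0246 nats, is non-negative. Both figures agree with the discrete-approach outputs shown in the estimation examples.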

Note

The conditional MI \(I(X;Y \mid Z)\) can be either larger or smaller than its unconditional counterpart \(I(X; Y)\). This observation leads to the notions of synergy and redundancy, which can be addressed with information decomposition approaches [WB11]. CMI is symmetric in its first two arguments for a fixed condition \(Z\): \(I(X;Y \mid Z) = I(Y;X \mid Z)\).
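Two textbook cases make the note concrete. The sketch below uses plug-in entropies in plain Python; the helpers `H`, `mi` and `cmi` are hypothetical and independent of the package. With \(Z = X \oplus Y\) for independent fair bits, conditioning creates information that is absent unconditionally (synergy); when \(X\), \(Y\) and \(Z\) are all copies of one bit, conditioning removes it (redundancy).

```python
from collections import Counter
from itertools import product
from math import log

def H(*columns):
    """Plug-in joint Shannon entropy (nats) of one or more aligned sequences."""
    rows = list(zip(*columns))
    return -sum(c / len(rows) * log(c / len(rows)) for c in Counter(rows).values())

def mi(x, y):
    return H(x) + H(y) - H(x, y)

def cmi(x, y, z):
    return H(x, z) + H(y, z) - H(x, y, z) - H(z)

# Synergy: X and Y are independent fair bits, Z = X xor Y.
pairs = list(product((0, 1), repeat=2))
x = [a for a, b in pairs]
y = [b for a, b in pairs]
z = [a ^ b for a, b in pairs]
mi_xor, cmi_xor = mi(x, y), cmi(x, y, z)        # 0 versus log 2

# Redundancy: X, Y and Z are all the same fair bit.
bit = [0, 1]
mi_copy, cmi_copy = mi(bit, bit), cmi(bit, bit, bit)  # log 2 versus 0

# Symmetry in the first two arguments for a fixed condition.
symmetric = abs(cmi(x, y, z) - cmi(y, x, z)) < 1e-12
```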

This package can also calculate the local values of CMI.

CMI Estimation#

CMI can be expressed in terms of entropies and joint entropies as follows:

\[ I(X;Y \mid Z) = - H(X,Z,Y) + H(X,Z) + H(Z,Y) - H(Z) \]

The package uses this formula internally for the Rényi and Tsallis CMI, whereas every other approach has its own dedicated, probability-based implementation.
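The entropy combination can be sketched with a pluggable entropy function. Everything below is an assumption made for illustration: `shannon`, `renyi` and `cmi_from_entropies` are hypothetical helpers based on plug-in frequency counts, and the package's own Rényi and Tsallis estimators may work quite differently (for instance, they need not be histogram-based at all).

```python
from collections import Counter
from math import log

def shannon(samples):
    """Plug-in Shannon entropy in nats."""
    n = len(samples)
    return -sum(c / n * log(c / n) for c in Counter(samples).values())

def renyi(samples, alpha=2.0):
    """Plug-in Rényi entropy in nats for an order alpha != 1."""
    n = len(samples)
    return log(sum((c / n) ** alpha for c in Counter(samples).values())) / (1 - alpha)

def cmi_from_entropies(x, y, z, entropy=shannon):
    """I(X;Y|Z) = -H(X,Z,Y) + H(X,Z) + H(Z,Y) - H(Z) for the chosen entropy."""
    return (
        -entropy(list(zip(x, z, y)))
        + entropy(list(zip(x, z)))
        + entropy(list(zip(z, y)))
        - entropy(list(z))
    )

x = [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0]
y = [1, 1, 0, 0, 2, 2, 1, 1, 0, 2, 0, 0, 2, 0, 0]
z = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

cmi_shannon = cmi_from_entropies(x, y, z)                # about 0.0246 nats
cmi_renyi = cmi_from_entropies(x, y, z, entropy=renyi)   # order-2 Rényi variant
```

With Shannon entropies the combination is algebraically identical to the probability-based definition. For generalised entropies such as Rényi it serves as the definition itself, and the result is not guaranteed to be non-negative.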

import infomeasure as im

x = [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0]
y = [1, 1, 0, 0, 2, 2, 1, 1, 0, 2, 0, 0, 2, 0, 0]
z = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
cmi = im.cmi(x, y, cond=z, approach='discrete')
cmi_ksg = im.cmi(x, y, cond=z, approach='ksg')
cmi_kernel = im.cmi(x, y, cond=z, approach='kernel', kernel='box', bandwidth=1.5)
cmi_symbolic = im.cmi(x, y, cond=z, approach='symbolic', embedding_dim=3)
cmi, cmi_ksg, cmi_kernel, cmi_symbolic
(np.float64(0.024586807355194827),
 np.float64(-0.19087246087246088),
 np.float64(0.02458680735519482),
 np.float64(0.9009098875771349))

With four variables, the CMI is calculated as follows:

from numpy.random import default_rng
rng = default_rng(917856)
im.cmi(
    rng.normal(size=1000),
    rng.normal(size=1000),
    rng.normal(size=1000),
    rng.normal(size=1000),
    cond=rng.normal(size=1000),
    approach='metric'
)
np.float64(-11.555648875334333)

The Local Conditional MI is calculated as follows:

est = im.estimator(
    x, y, cond=z,
    measure='cmi',  # or 'conditional_mutual_information'
    approach='discrete'
)
est.local_vals()
array([-0.22314355, -0.13353139, -0.40546511,  0.15415068, -0.22314355,
        0.15415068,  0.28768207,  0.15415068,  0.18232156, -0.13353139,
        0.18232156,  0.15415068,  0.28768207, -0.25131443,  0.18232156])