Ordinal / Symbolic / Permutation MI Estimation


Mutual Information (MI) quantifies the information shared between two random variables \(X\) and \(Y\). For our purposes, we write the MI between the two time series \(X_t\) and \(Y_t\) as:

\[ I(X_{t}; Y_t) = \sum_{x_{t}, y_t} p(x_{t}, y_t) \log \frac{p(x_{t}, y_t)}{p(x_{t}) p(y_t)} \]

where

\(p(x_t, y_t)\) is the joint probability mass function (pmf) of \(X_t\) and \(Y_t\),
\(p(x_t)\) and \(p(y_t)\) are the marginal pmfs of \(X_t\) and \(Y_t\).
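As a worked illustration of the sum above, the following minimal sketch evaluates the MI of a small joint probability table with NumPy. The helper name `mi_from_joint` and the two example tables are assumptions made for illustration; they are not part of infomeasure.

```python
import numpy as np


def mi_from_joint(p_xy):
    """Mutual information (in nats) of a discrete joint distribution given as a 2D table."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)  # marginal of X (sum over columns)
    p_y = p_xy.sum(axis=0, keepdims=True)  # marginal of Y (sum over rows)
    product = p_x @ p_y                    # p(x) p(y) for every cell
    mask = p_xy > 0                        # cells with p(x, y) = 0 contribute nothing
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / product[mask])))


# Two fair binary variables that are independent: MI is zero.
mi_independent = mi_from_joint(np.outer([0.5, 0.5], [0.5, 0.5]))

# Two fair binary variables that are always equal: MI equals log(2).
mi_identical = mi_from_joint([[0.5, 0.0], [0.0, 0.5]])
```

The two limiting cases bracket what an estimator can return for a pair of fair binary variables: no shared information, and full determination of one variable by the other.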

Ordinal MI estimation estimates the required probability mass function (pmf) from the ordinal structure of the data. The details of this estimation are provided in Ordinal / Symbolic / Permutation Entropy Estimation.
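The idea can be sketched in plain NumPy: each window of `embedding_dim` consecutive samples is replaced by the permutation that sorts it, and the MI is then computed from the relative frequencies of these symbols. This is a simplified illustration under stated assumptions, not the library's implementation. The function names are hypothetical, the step size is fixed to 1, and the identity \(I(X;Y) = H(X) + H(Y) - H(X,Y)\) with plug-in entropies is assumed.

```python
import numpy as np
from collections import Counter


def ordinal_symbols(series, embedding_dim):
    """Map each window of length embedding_dim to its argsort permutation."""
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(series), embedding_dim
    )
    return [tuple(int(i) for i in perm) for perm in np.argsort(windows, axis=1)]


def plugin_entropy(symbols):
    """Entropy (in nats) of the relative frequencies of the symbols."""
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def ordinal_mi_sketch(x, y, embedding_dim):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), evaluated on ordinal symbols."""
    sx = ordinal_symbols(x, embedding_dim)
    sy = ordinal_symbols(y, embedding_dim)
    joint = list(zip(sx, sy))  # one joint symbol per time step
    return plugin_entropy(sx) + plugin_entropy(sy) - plugin_entropy(joint)


x = [4, 7, 9, 10, 6, 11, 3]
symbols_x = ordinal_symbols(x, 3)
# symbols_x == [(0, 1, 2), (0, 1, 2), (2, 0, 1), (1, 0, 2), (2, 0, 1)]
self_mi = ordinal_mi_sketch(x, x, 3)  # equals the ordinal entropy of x
```

A series of length \(n\) yields \(n - (\text{embedding\_dim} - 1)\) symbols, which is why short series combined with large embedding dimensions leave few samples per pattern.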

To demonstrate this MI estimator, we draw samples from a two-dimensional multivariate Gaussian distribution. The data are centred at the origin and have a correlation coefficient of \(\rho = 0.7\). The analytical expression that holds for the other approaches does not hold here because, as for ordinal entropy, the estimator analyses the pmf of the ordinal patterns rather than that of the values themselves.

import infomeasure as im
import numpy as np
rng = np.random.default_rng(692475)

rho = 0.7
data = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=1000)
x, y = data[:, 0], data[:, 1]

im.mutual_information(x, y, approach="ordinal", embedding_dim=3)

np.float64(0.34089748112229157)

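Introducing the offset (this call also uses embedding_dim=4 rather than 3, so the result is not directly comparable with the first example):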

im.mutual_information(x, y, approach="ordinal", embedding_dim=4, offset=1)

np.float64(0.651454917262977)
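Conceptually, an offset pairs each sample of one series with a later (or earlier) sample of the other before the patterns are compared. The sketch below shows one possible alignment. The helper `align_with_offset` and its sign convention (pairing `x[t]` with `y[t + offset]`) are assumptions for illustration; the convention used by the library is not stated on this page.

```python
import numpy as np


def align_with_offset(x, y, offset):
    """Pair x[t] with y[t + offset]; the sign convention is assumed, not the library's."""
    x, y = np.asarray(x), np.asarray(y)
    if offset == 0:
        return x, y
    if offset > 0:
        return x[:-offset], y[offset:]
    return x[-offset:], y[:offset]


x = np.arange(10)
y = np.roll(x, 1)  # y[t] = x[t - 1]: y lags x by one step
x_aligned, y_aligned = align_with_offset(x, y, offset=1)
# After alignment both arrays hold the values 0..8 in the same order.
```

Each unit of offset shortens the usable series by one sample, which is why a large offset together with a large embedding dimension can leave too little data.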

For three or more variables, pass them as additional positional arguments.

data = rng.multivariate_normal([0, 0, 0], [[1, rho, 0], [rho, 1, 0], [0, 0, 1]], size=1000)
data_x, data_y, data_z = data[:, 0], data[:, 1], data[:, 2]
im.mutual_information(data_x, data_y, data_z, approach="ordinal", embedding_dim=2)

0.10337476635891688

Local mutual information and hypothesis testing require an estimator instance.

est = im.estimator(data_x, data_y, measure="mi", approach="ordinal", embedding_dim=2)
stat_test = est.statistical_test(n_tests=50, method="permutation_test")
est.local_vals(), stat_test.p_value, stat_test.t_score, stat_test.confidence_interval(90), stat_test.percentile(50)

(array([ 0.35918,  0.37518, -0.56744, ..., -0.60577, -0.56744,  0.37518],
       shape=(999,)),
 np.float64(0.0),
 np.float64(-0.9899494936611665),
 array([0.10189, 0.10189]),
 np.float64(0.10189031942663873))
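The general logic of a permutation test can be sketched independently of the library: shuffle one variable to destroy any dependence, recompute the MI, and compare the observed value with the resulting surrogate distribution. The functions `discrete_mi` and `permutation_p_value` below are hypothetical helpers that operate on already symbolized (integer) data. How infomeasure builds its surrogates and computes the p-value and t-score may differ.

```python
import numpy as np
from collections import Counter


def discrete_mi(a, b):
    """Plug-in MI (in nats) between two equally long symbol sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    # p(i, j) / (p(i) p(j)) simplifies to c * n / (count_a[i] * count_b[j])
    return sum(
        c / n * np.log(c * n / (count_a[i] * count_b[j]))
        for (i, j), c in joint.items()
    )


def permutation_p_value(a, b, n_tests, rng):
    """Share of shuffled surrogates whose MI reaches the observed MI."""
    observed = discrete_mi(a, b)
    b = np.asarray(b)
    surrogates = np.array(
        [discrete_mi(a, rng.permutation(b)) for _ in range(n_tests)]
    )
    return float(observed), float(np.mean(surrogates >= observed))


rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=500)
observed, p_value = permutation_p_value(a, a.copy(), n_tests=50, rng=rng)
# Identical sequences: the observed MI is far above every shuffled surrogate.
```

With only `n_tests` surrogates, the smallest non-zero p-value that can be resolved is `1 / n_tests`, so a reported value of 0 means "below that resolution" rather than exactly zero.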

The estimator is implemented in the OrdinalMIEstimator class, which is part of the infomeasure.estimators.mutual_information.ordinal module.

class infomeasure.estimators.mutual_information.ordinal.OrdinalMIEstimator(*data, cond=None, embedding_dim: int = None, step_size: int = 1, stable: bool = False, offset: int = 0, base: int | float | str = 'e', **kwargs)

Bases: BaseOrdinalMIEstimator, MutualInformationEstimator

Estimator for the Ordinal mutual information.

Attributes:

*data (array_like, shape (n_samples,)): The data used to estimate the mutual information. You can pass an arbitrary number of data arrays as positional arguments.
embedding_dim (int): The size of the permutation patterns.
offset (int, optional): Number of positions to shift the data arrays relative to each other (the delay/lag/shift between the variables). Default is no shift.
*symbols (array_like, shape (n_samples,)): The symbolized data used to estimate the mutual information.

Raises:

ValueError: If the embedding_dim is negative or not an integer.
ValueError: If offset and embedding_dim are such that the data is too small.

Notes

The ordinality will be determined via numpy.argsort(). There is no normalize option, as this would not influence the order of the data.
If embedding_dim is set to 1, the mutual information is always 0.
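Both notes can be checked with a short NumPy sketch. The helper `ordinal_patterns` is a hypothetical stand-in for the symbolization step, assuming a step size of 1; it is not the library's code.

```python
import numpy as np


def ordinal_patterns(series, embedding_dim):
    """Argsort permutation of every window of length embedding_dim."""
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(series), embedding_dim
    )
    return np.argsort(windows, axis=1)


rng = np.random.default_rng(0)
x = rng.normal(size=200)

# z-score normalisation is a strictly increasing map, so the order is unchanged.
z = (x - x.mean()) / x.std()
same_patterns = np.array_equal(ordinal_patterns(x, 3), ordinal_patterns(z, 3))

# With embedding_dim=1 every window sorts to the single pattern (0,).
n_unique_dim1 = len(np.unique(ordinal_patterns(x, 1), axis=0))
```

With a single possible symbol, every marginal and joint entropy is zero, so the MI is zero regardless of the data.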