Estimator Usage#

This page provides a brief overview of the intended use of the infomeasure package. There are three ways to use the package:

Using the utility functions provided in the package: im.entropy, im.mutual_information, im.transfer_entropy, and the conditional counterparts. For a full list, find the exposed Functions in the API Reference.
Using the Estimator classes through the quick access: im.estimator().
Directly importing the Estimator classes and using them.

Each estimator is described in detail in the following sections, e.g. Entropy, Mutual Information, and Transfer Entropy.

Before we start, let’s import the necessary packages.

import infomeasure as im
import numpy as np
rng = np.random.default_rng()

1. Utility functions#

The utility functions are the most straightforward way to calculate the information measures: a single call takes the data and the approach, and returns the global value.

Entropy#

For example, to calculate the entropy() \(H(X)\) of a dataset, you can use the following code:

x = rng.integers(0, 2, size=1000)  # binary, uniform data
im.entropy(x, approach="discrete")

np.float64(0.6926350931428009)

The available approaches can either be found in the documentation of entropy(), or on the approach pages as chapters of the Entropy (H) section.

Joint Entropy#

Calculating joint entropy \(H(X_1, X_2, \ldots, X_n)\) is as simple as calling the same entropy function, but passing a tuple of random variables as the first argument, which marks them as one joint variable. If they are passed as separate parameters, the cross-entropy is calculated instead.

y = rng.choice(["a", "b", "c"], size=1000)  # e.g., using strings as symbols
z = rng.choice([True, False], size=1000)  # e.g., using boolean values as symbols
im.entropy((x, y, z), approach="discrete")

np.float64(2.4811959536827417)

Combining these two calls through the chain rule \(H(X|Y) = H(X, Y) - H(Y)\) yields the conditional entropy \(H(X|Y)\).
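The chain rule can be checked on a small example. The following is a minimal sketch that uses only the Python standard library and a plug-in (relative-frequency) estimator; `plugin_entropy` is a hypothetical helper written for this illustration, not part of infomeasure, and natural logarithms are assumed.

```python
import math
import random
from collections import Counter


def plugin_entropy(samples):
    """Plug-in (relative-frequency) Shannon entropy in nats."""
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in Counter(samples).values())


random.seed(0)
x = [random.randint(0, 1) for _ in range(1000)]
# y copies x about 80% of the time, so knowing y reduces the uncertainty about x
y = [xi if random.random() < 0.8 else 1 - xi for xi in x]

h_xy = plugin_entropy(list(zip(x, y)))  # joint entropy H(X, Y)
h_y = plugin_entropy(y)                 # marginal entropy H(Y)
h_x_given_y = h_xy - h_y                # chain rule: H(X|Y) = H(X, Y) - H(Y)
print(f"H(X) = {plugin_entropy(x):.3f}, H(X|Y) = {h_x_given_y:.3f}")
```

With the package itself, the same combination would read `im.entropy((x, y), approach="discrete") - im.entropy(y, approach="discrete")`, using the two calls shown above.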

Important

If a random variable (RV) is passed as one parameter into a function, im.entropy(x, **kw), it is always considered as one RV. This is why two separate RVs need to be passed as a tuple if they should be considered as a joint variable: im.entropy((x, y), **kw). For discrete data with multiple features per sample, i.e. of shape (n_samples, n_features), the joint probability is always considered. For continuous data, this is also the case. When two RVs are considered separately, they are passed as separate parameters, i.e. im.entropy(p, q, **kw), which calculates the cross-entropy.

Cross-Entropy#

For two RVs \(P\) and \(Q\), you can calculate the cross-entropy \(H_Q(P)\) as follows:

import infomeasure as im

data_P = [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0]
data_Q = [1, 1, 0, 0, 2, 2, 1, 1, 0, 2, 0, 0, 2, 0, 0]

# Cross-entropy between P and Q: H_Q(P) = H_x(P, Q)
h_q_p = im.cross_entropy(data_P, data_Q, approach="discrete")
h_q_p

np.float64(1.0232940864167606)

This formulation is generalized for other approaches (e.g., continuous).

from numpy.random import default_rng
rng = default_rng(921521569)
data_P = rng.normal(0.0, 15, size=200)
data_Q = rng.normal(1.0, 14, size=500)
im.cross_entropy(data_P, data_Q, approach="metric")

np.float64(4.612528459235515)

im.cross_entropy() and im.hx() are convenience functions around im.entropy(), so the initial entropy function can also always be used.

im.entropy(data_P, data_Q, approach="metric")

np.float64(4.612528459200269)
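For the discrete case, the quantity behind the first example can be written out by hand as \(H_Q(P) = -\sum_x p(x) \log q(x)\). The sketch below is an independent illustration using relative frequencies and natural logarithms; `plugin_cross_entropy` is a hypothetical helper, and returning infinity for symbols missing from \(Q\) is a convention assumed here, not necessarily what the package does.

```python
import math
from collections import Counter

data_P = [0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0]
data_Q = [1, 1, 0, 0, 2, 2, 1, 1, 0, 2, 0, 0, 2, 0, 0]


def plugin_cross_entropy(p_samples, q_samples):
    """-sum_x p(x) log q(x) with relative frequencies, in nats."""
    p_counts, q_counts = Counter(p_samples), Counter(q_samples)
    n_p, n_q = len(p_samples), len(q_samples)
    total = 0.0
    for symbol, count in p_counts.items():
        if symbol not in q_counts:
            return math.inf  # assumed convention: q(x) = 0 for an observed symbol
        total -= count / n_p * math.log(q_counts[symbol] / n_q)
    return total


h_q_p = plugin_cross_entropy(data_P, data_Q)
h_p_q = plugin_cross_entropy(data_Q, data_P)  # swapped roles: not symmetric
print(h_q_p, h_p_q)
```

Under these assumptions, `h_q_p` should come out at about 1.0233 nats, in line with the discrete result above. Swapping the arguments gives a different value, which is why the order of the two positional parameters matters.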

Mutual Information#

For mutual_information() \(I(X; Y)\) between two variables \(X\) and \(Y\), you can use the following code:

x = rng.normal(0, 1, 1000)  # e.g., continuous data, gaussian distribution
y = rng.normal(0, 1, 1000)
im.mutual_information(x, y, approach="kernel", bandwidth=0.2, kernel="box")

np.float64(0.33949509967635505)

To shift the two random variables relative to each other, pass the keyword offset. Both input variables are then shifted by the given number of positions, in opposite directions. This is useful to investigate temporal relationships between two variables.
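The idea of scanning a relative shift can be sketched without the package. The example below pairs \(x_t\) with \(y_{t+\text{lag}}\) and evaluates a plug-in mutual information for several lags; `plugin_mi` and `lagged_mi` are hypothetical helpers, and the slicing convention used here is an assumption that need not match the sign or direction convention of the offset keyword.

```python
import math
import random
from collections import Counter


def plugin_mi(xs, ys):
    """Plug-in mutual information I(X; Y) in nats from paired samples."""
    n = len(xs)
    c_xy, c_x, c_y = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        c / n * math.log(c * n / (c_x[a] * c_y[b])) for (a, b), c in c_xy.items()
    )


def lagged_mi(xs, ys, lag):
    """MI between x_t and y_{t+lag} for lag >= 0."""
    if lag == 0:
        return plugin_mi(xs, ys)
    return plugin_mi(xs[:-lag], ys[lag:])


random.seed(1)
x = [random.randint(0, 3) for _ in range(2000)]
y = [0, 0] + x[:-2]  # y repeats x two steps later

mi_by_lag = {lag: lagged_mi(x, y, lag) for lag in range(5)}
best_lag = max(mi_by_lag, key=mi_by_lag.get)
print(best_lag, mi_by_lag)
```

The lag with the largest value recovers the delay that was built into the toy data.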

An arbitrary number of variables can be passed to calculate the mutual information \(I(X_1; \ldots; X_n)\) between them. This has been called interaction information, among other names. Each variable needs to be passed as a var-positional parameter, and all other arguments need to be passed as keyword-only parameters, like so:

z = rng.normal(0, 1, 1000)
w = rng.normal(0, 1, (1000, 2))  # "kernel" also supports multi-dimensional data
im.mutual_information(x, y, z, w, approach="kernel", bandwidth=0.2, kernel="gaussian")

np.float64(3.570411356702638)

The available options for the approach are listed in the docstring of mutual_information(). Examples covering the functionality of each approach can be found in the subsections of Mutual Information (MI).

Conditional Mutual Information#

Conditional mutual information \(I(X; Y | Z)\) can be calculated using:

im.conditional_mutual_information(
    x, y, cond=z, approach="kernel", bandwidth=0.2, kernel="box"
)

np.float64(1.5498894107110004)

Here, the condition is a keyword-only parameter, as it is also possible to pass multiple variables for \(I(X_1; \ldots; X_n | Z)\).

im.conditional_mutual_information(
    x, y, z, cond=w, approach="kernel", bandwidth=0.2, kernel="box"
)

np.float64(7.129248963048488)

You can also use the im.mutual_information() function directly to calculate the conditional mutual information by passing the cond parameter.
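To see what conditioning does, the conditional mutual information can be decomposed into joint entropies, \(I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)\). The following is a minimal plug-in sketch with standard-library Python; the helper `plugin_entropy` and the toy data are assumptions made for this illustration and do not reflect how the package estimates the measure.

```python
import math
import random
from collections import Counter


def plugin_entropy(samples):
    """Plug-in (relative-frequency) Shannon entropy in nats."""
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in Counter(samples).values())


random.seed(2)
z = [random.randint(0, 1) for _ in range(5000)]
# x and y are independent noisy copies of z: related only through z
x = [zi if random.random() < 0.9 else 1 - zi for zi in z]
y = [zi if random.random() < 0.9 else 1 - zi for zi in z]

mi = plugin_entropy(x) + plugin_entropy(y) - plugin_entropy(list(zip(x, y)))
cmi = (
    plugin_entropy(list(zip(x, z)))
    + plugin_entropy(list(zip(y, z)))
    - plugin_entropy(list(zip(x, y, z)))
    - plugin_entropy(z)
)
print(f"I(X;Y) = {mi:.3f}, I(X;Y|Z) = {cmi:.3f}")
```

Here the unconditional mutual information is clearly positive, while conditioning on the common driver \(Z\) brings the estimate close to zero.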

Transfer Entropy#

For transfer_entropy() \(T_{X\to Y}\), you can use the following code:

im.transfer_entropy(
    x, y, approach="metric", k=4, step_size=1, prop_time=0,
    src_hist_len=1, dest_hist_len=1, noise_level=1e-8
)

np.float64(-0.028998636828084078)

The first variable passed is treated as the source \(X\), the second as the destination \(Y\). Calling im.te(y, x, ...) therefore calculates the transfer entropy from y to x: only the argument order matters, as the package has no knowledge of the variable names you assign.

Analogously to the offset in mutual information calculation, prop_time allows you to specify the time lag between the source and destination variables. Furthermore, src_hist_len and dest_hist_len specify the length of the history window for source and destination variables respectively. step_size, often denoted as \(\tau\) in the context of transfer entropy, specifies the time step between consecutive observations in the history window.
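The role of source and destination histories can be illustrated with a plug-in estimate for the simplest setting, one past value of each variable and consecutive samples. This sketch assumes it loosely corresponds to src_hist_len=1, dest_hist_len=1 and step_size=1; the helper `plugin_te` and its index convention are hypothetical and are not taken from the package.

```python
import math
import random
from collections import Counter


def plugin_te(source, dest):
    """Plug-in transfer entropy source -> dest in nats, history length 1."""
    # Each triple is (y_{t+1}, y_t, x_t)
    triples = list(zip(dest[1:], dest[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_past = Counter((yp, xp) for _, yp, xp in triples)  # (y_t, x_t)
    c_dest = Counter((yn, yp) for yn, yp, _ in triples)  # (y_{t+1}, y_t)
    c_prev = Counter(yp for _, yp, _ in triples)         # y_t
    # T = sum p(yn, yp, xp) * log[ p(yn | yp, xp) / p(yn | yp) ]
    return sum(
        c / n * math.log(c * c_prev[yp] / (c_past[(yp, xp)] * c_dest[(yn, yp)]))
        for (yn, yp, xp), c in c_full.items()
    )


random.seed(5)
x = [random.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]  # y follows x with a delay of one step

te_x_to_y = plugin_te(x, y)
te_y_to_x = plugin_te(y, x)
print(f"T(X->Y) = {te_x_to_y:.3f}, T(Y->X) = {te_y_to_x:.3f}")
```

The estimate is large in the direction in which information actually flows and close to zero in the reverse direction, which is the asymmetry the argument order selects.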

As for H and MI, the approaches are documented in transfer_entropy(), and also approach by approach in the subsections of Transfer Entropy (TE).

Conditional Transfer Entropy#

When calculating conditional transfer entropy \(T_{X\to Y|Z}\), the same parameters as in the normal transfer entropy are used, but with an additional random variable cond, which specifies the conditioning variable \(Z\), and cond_hist_len specifies the length of the history window for \(Z\).

im.conditional_transfer_entropy(
    x, y, cond=z, approach="ordinal", embedding_dim=3,
    src_hist_len=2, dest_hist_len=2, cond_hist_len=1
)

0.6841724588196392

Again, you can also use the im.transfer_entropy() function directly to calculate the conditional transfer entropy by passing the cond parameter.

Composite Measures#

Jensen-Shannon Divergence and Kullback-Leibler Divergence are also available as composite measures. They can be accessed through im.jensen_shannon_divergence() and im.kullback_leiber_divergence() (note the spelling of the function name), respectively, and can be called like so:

jsd = im.jensen_shannon_divergence(x, y, approach='ordinal', embedding_dim=3)
kl = im.kullback_leiber_divergence(x, y, approach='renyi', alpha=1.1)
jsd, kl

(np.float64(0.0016903286872214096), np.float64(0.0714969732971047))

For the approach, the aforementioned types of estimation techniques are available. All parameters the approach needs, here embedding_dim and alpha, are passed as keyword arguments.
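As a reminder of what the two divergences measure, here is a minimal sketch on relative frequencies of discrete samples, with \(D_{KL}(P \| Q) = \sum_x p(x) \log \frac{p(x)}{q(x)}\) and the Jensen-Shannon divergence built from the mixture \(M = \frac{1}{2}(P + Q)\). The helper names are hypothetical, and the package may combine its entropy estimators differently; this only illustrates the definitions.

```python
import math
from collections import Counter


def empirical(samples):
    """Relative frequencies of the observed symbols."""
    n = len(samples)
    return {s: c / n for s, c in Counter(samples).items()}


def kl_divergence(p, q):
    """D_KL(P || Q) in nats; infinite if Q lacks a symbol that P contains."""
    total = 0.0
    for s, ps in p.items():
        if q.get(s, 0.0) == 0.0:
            return math.inf
        total += ps * math.log(ps / q[s])
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence via the mixture M = (P + Q) / 2."""
    m = {s: 0.5 * (p.get(s, 0.0) + q.get(s, 0.0)) for s in set(p) | set(q)}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


p = empirical([0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0])
q = empirical([1, 1, 0, 0, 2, 2, 1, 1, 0, 2, 0, 0, 2, 0, 0])
print(kl_divergence(p, q), kl_divergence(q, p), js_divergence(p, q))
```

The Kullback-Leibler divergence is asymmetric and can be infinite, whereas the Jensen-Shannon divergence is symmetric and bounded by \(\log 2\).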

Shorthands#

For convenience, there are shorthand functions: im.h(), im.hx(), im.mi(), im.te(), im.cmi(), im.cte(), im.jsd(), and im.kld(). They are aliases and are used in the same way as the functions described above.

Caution

In all utility functions, data always needs to be passed as var-positional parameters, except the conditional data.

im.mi(x=a, y=b, ...)                  # wrong
im.mi(a, b, ...)                      # correct
im.te(source=a, dest=b, cond=c, ...)  # wrong
im.te(a, b, cond=c, ...)              # correct

2. Estimator classes#

Estimator classes need to be used to obtain more specific results, like local values, p-values, t-scores and confidence intervals. infomeasure provides a set of classes that are used under the hood for the utility functions we just discussed. These classes can be used directly to calculate the information measures, or to access specific results and methods. With the im.estimator() function, you can create an estimator instance:

a = rng.integers(0, 10, size=1000)
b = rng.integers(0, 10, size=1000)
est = im.estimator(
    a.astype(int),       # data: x | x, y, ... | source, dest
    measure="entropy",   # "mutual_information", "transfer_entropy", "h", "mi", "te",
                         # "conditional_mutual_information", "cmi",
                         # "conditional_transfer_entropy", "cte"
    approach="discrete"  # "kernel", "metric", "kl", "ksg", "ordinal", "symbolic",
                         # "permutation", "renyi", "tsallis"
    # additional parameters for each approach, e.g. `cond = ...` to conditionalize
)
est.result(), est.local_vals()

(np.float64(2.295710363177784),
 array([2.07147, 2.26336, 2.40795, ..., 2.07147, 2.27303, 2.33304],
       shape=(1000,)))

The im.estimator() function uses the same parameters as the utility functions, plus an additional measure parameter that specifies the type of information measure to estimate.

Global value#

To access the global value, as returned by the utility functions, we can use the global_val() method. result() is an alias to return the same global value. Once calculated, as above, asking for the same value again will not recalculate it.

est.global_val(), est.result()

(np.float64(2.295710363177784), np.float64(2.295710363177784))

Local values#

To return local values—Local Entropy, Local Mutual Information, Local Conditional MI, Local Transfer Entropy, or Local Conditional TE—use the local_vals() method.

est.local_vals()

array([2.07147, 2.26336, 2.40795, ..., 2.07147, 2.27303, 2.33304],
      shape=(1000,))
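For discrete entropy, the relation between local and global values is easy to verify by hand: each sample contributes \(-\log p(x_i)\), and the average of these contributions is the global entropy. The sketch below uses relative frequencies and natural logarithms as assumptions; it is an illustration, not the package's implementation.

```python
import math
import random
from collections import Counter

random.seed(3)
data = [random.randint(0, 9) for _ in range(1000)]

n = len(data)
prob = {symbol: count / n for symbol, count in Counter(data).items()}

# One local value per sample: the information content of that observation
local_vals = [-math.log(prob[symbol]) for symbol in data]
# Global value: expectation of the local values under the same distribution
global_val = -sum(p * math.log(p) for p in prob.values())

print(local_vals[:3], global_val, sum(local_vals) / n)
```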

Hypothesis testing#

To perform hypothesis testing on the global value of an estimator, use the statistical_test() method. Both mutual information and transfer entropy estimators support comprehensive statistical testing that provides p-values, t-scores, and confidence intervals in a single method call.

est = im.estimator(a, b, measure="mutual_information",
                   approach="kernel", bandwidth=0.2, kernel="box")
stat_test = est.statistical_test(n_tests=50, method="permutation_test")
(est.result(), stat_test.p_value, stat_test.t_score,
 stat_test.confidence_interval(90), stat_test.percentile(50))

(np.float64(0.03590709370884237),
 np.float64(0.76),
 np.float64(-0.8928332945481133),
 array([0.0338 , 0.05455]),
 np.float64(0.04147249549338565))

The StatisticalTestResult object contains comprehensive statistical information including p-value, t-score, and metadata about the test performed.

Two methods for resampling are available for hypothesis testing:

Permutation test: This method shuffles the first random variable.
Bootstrap: This method resamples the first random variable with replacement.

Resampling one of the two random variables removes the relationship between them, so the values estimated from the resampled data represent the null hypothesis.

stat_test = est.statistical_test(method="bootstrap", n_tests=100)
(stat_test.p_value, stat_test.t_score,
 stat_test.confidence_interval(90), stat_test.percentile(50))

(np.float64(0.8),
 np.float64(-0.8631311474647916),
 array([0.03095, 0.05434]),
 np.float64(0.04156123112321809))
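The logic of a permutation test can be sketched in a few lines. In this illustration the p-value is taken as the fraction of null values at least as large as the observed one, and the t-score as the distance of the observed value from the null mean in units of the null standard deviation; both definitions, as well as the helpers `plugin_mi` and `permutation_test`, are assumptions for this sketch and may differ from what statistical_test() computes.

```python
import math
import random
import statistics
from collections import Counter


def plugin_mi(xs, ys):
    """Plug-in mutual information I(X; Y) in nats from paired samples."""
    n = len(xs)
    c_xy, c_x, c_y = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        c / n * math.log(c * n / (c_x[a] * c_y[b])) for (a, b), c in c_xy.items()
    )


def permutation_test(xs, ys, n_tests=50):
    """Shuffle the first variable to build a null distribution of MI values."""
    observed = plugin_mi(xs, ys)
    shuffled = list(xs)
    null = []
    for _ in range(n_tests):
        random.shuffle(shuffled)
        null.append(plugin_mi(shuffled, ys))
    p_value = sum(v >= observed for v in null) / n_tests
    t_score = (observed - statistics.mean(null)) / statistics.stdev(null)
    return observed, p_value, t_score, null


random.seed(4)
a = [random.randint(0, 9) for _ in range(500)]
b_dep = [ai if random.random() < 0.5 else random.randint(0, 9) for ai in a]
b_ind = [random.randint(0, 9) for _ in a]

obs_dep, p_dep, t_dep, _ = permutation_test(a, b_dep)
obs_ind, p_ind, t_ind, _ = permutation_test(a, b_ind)
print(p_dep, t_dep, p_ind, t_ind)
```

For the dependent pair, the observed value lies far outside the null distribution; for the independent pair, it is typically indistinguishable from it.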

Confidence intervals and percentiles#

The statistical test result provides flexible access to confidence intervals and percentiles of the null distribution:

# Get confidence intervals
ci_95 = stat_test.confidence_interval(95)  # 95% confidence interval
ci_90 = stat_test.confidence_interval(90)  # 90% confidence interval

# Get specific percentiles
median = stat_test.percentile(50)  # Median of null distribution
quartiles = stat_test.percentile([25, 75])  # First and third quartiles

print(f"95% CI: [{ci_95[0]:.4f}, {ci_95[1]:.4f}]")
print(f"90% CI: [{ci_90[0]:.4f}, {ci_90[1]:.4f}]")
print(f"Median: {median:.4f}")
print(f"Quartiles: [{quartiles[0]:.4f}, {quartiles[1]:.4f}]")

95% CI: [0.0297, 0.0566]
90% CI: [0.0309, 0.0543]
Median: 0.0416
Quartiles: [0.0375, 0.0465]

The confidence intervals and percentiles are calculated on demand from the test values, providing maximum flexibility for statistical analysis.
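How such intervals can be derived from a list of test values is sketched below. It assumes a central percentile interval (for a 90% level, the 5th to the 95th percentile) with linear interpolation between sorted values; the functions `percentile` and `confidence_interval` are hypothetical stand-ins, and the package's own methods may use a different interpolation rule.

```python
import math


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between sorted values."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100
    lower, upper = math.floor(pos), math.ceil(pos)
    return data[lower] + (data[upper] - data[lower]) * (pos - lower)


def confidence_interval(values, level):
    """Central interval containing `level` percent of the values."""
    tail = (100 - level) / 2
    return percentile(values, tail), percentile(values, 100 - tail)


test_values = list(range(101))  # stand-in for the values of a null distribution
median = percentile(test_values, 50)
ci_90 = confidence_interval(test_values, 90)
print(median, ci_90)
```

Because everything is computed from the stored test values, any level or percentile can be requested afterwards without rerunning the resampling.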

Effective value#

With infomeasure.estimators.mixins.EffectiveValueMixin.effective_val() the Effective Transfer Entropy \(\operatorname{eTE}\) can be calculated:

est = im.estimator(a, b, measure="transfer_entropy", approach="metric",
                   k=4, step_size=1, offset=0,
                   src_hist_len=1, dest_hist_len=1, noise_level=1e-8)
est.effective_val()

np.float64(0.0371119831212395)
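The idea behind an effective value is to subtract the transfer entropy that remains after the source has been shuffled, which estimates the finite-sample bias. The sketch below averages over several shuffles of a plug-in estimate with history length 1; the helpers and the number of shuffles are assumptions for this illustration, and effective_val() may use a different resampling scheme.

```python
import math
import random
from collections import Counter


def plugin_te(source, dest):
    """Plug-in transfer entropy source -> dest in nats, history length 1."""
    triples = list(zip(dest[1:], dest[:-1], source[:-1]))  # (y_{t+1}, y_t, x_t)
    n = len(triples)
    c_full = Counter(triples)
    c_past = Counter((yp, xp) for _, yp, xp in triples)
    c_dest = Counter((yn, yp) for yn, yp, _ in triples)
    c_prev = Counter(yp for _, yp, _ in triples)
    return sum(
        c / n * math.log(c * c_prev[yp] / (c_past[(yp, xp)] * c_dest[(yn, yp)]))
        for (yn, yp, xp), c in c_full.items()
    )


def effective_te(source, dest, n_shuffles=20):
    """TE minus the mean TE obtained with a shuffled source."""
    shuffled = list(source)
    surrogate = []
    for _ in range(n_shuffles):
        random.shuffle(shuffled)  # destroys the temporal link to dest
        surrogate.append(plugin_te(shuffled, dest))
    return plugin_te(source, dest) - sum(surrogate) / n_shuffles


random.seed(6)
x = [random.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]                                  # driven by x
u = [random.randint(0, 1) for _ in range(2000)]   # unrelated to v
v = [random.randint(0, 1) for _ in range(2000)]

ete_coupled = effective_te(x, y)
ete_independent = effective_te(u, v)
print(ete_coupled, ete_independent)
```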

Available approaches#

The following table shows the available information measures and estimators, and which methods are available for each estimator.

Estimator functions#
Estimator	`result()` / `global_val()`	`local_vals()`	`statistical_test()`	`effective_val()`
Entropy & Joint Entropy
`Discrete`	X	X
`Kernel`	X	X
`KL`	X	X
`Ordinal`	X	X
`Rényi`	X
`Tsallis`	X
`ANSB`	X
`Bayes`	X
`Bonachela`	X
`Chao-Shen`	X
`Chao Wang Jost`	X
`Grassberger`	X	X
`Miller-Madow`	X
`NSB`	X
`Shrinkage`	X	X[1]
`Zhang`	X
Mutual Information & CMI
`Discrete`	X	X	X
`Kernel`	X	X	X
`KSG`	X	X	X
`Ordinal`	X	X	X
`Rényi`	X		X
`Tsallis`	X		X
`ANSB`	X		X
`Bayes`	X		X
`Bonachela`	X		X
`Chao-Shen`	X		X
`Chao Wang Jost`	X		X
`Grassberger`	X	X	X
`Miller-Madow`	X		X
`NSB`	X		X
`Shrinkage`	X	X[1]	X
`Zhang`	X		X
Transfer Entropy & CTE
`Discrete`	X	X	X	X
`Kernel`	X	X	X	X
`KSG`	X	X	X	X
`Ordinal`	X	X	X	X
`Rényi`	X		X	X
`Tsallis`	X		X	X
`ANSB`	X		X	X
`Bayes`	X		X	X
`Bonachela`	X		X	X
`Chao-Shen`	X		X	X
`Chao Wang Jost`	X		X	X
`Grassberger`	X	X	X	X
`Miller-Madow`	X		X	X
`NSB`	X		X	X
`Shrinkage`	X	X[1]	X	X
`Zhang`	X		X	X

The methods from the table do the following:

result() & global_val(): Returns the global value of the information measure.
local_vals(): Returns the local values of the information measure.
statistical_test(): Returns comprehensive statistical test results including p-value, t-score, and confidence intervals.
effective_val(): Returns the effective transfer entropy.