infomeasure.estimators package#

Subpackages#

Submodules#

infomeasure.estimators.base module#

Module containing the base classes for the measure estimators.

class infomeasure.estimators.base.ConditionalMutualInformationEstimator(*data, cond=None, normalize: bool = False, offset=None, base: int | float | str = 'e', **kwargs)[source]#

Bases: StatisticalTestingMixin, Estimator[ConditionalMutualInformationEstimator], ABC

Abstract base class for conditional mutual information estimators.

Conditional Mutual Information (CMI) between two (or more) random variables \(X\) and \(Y\) given a third variable \(Z\) quantifies the amount of information obtained about one variable through the other, conditioned on the third. In terms of entropy (H), CMI is expressed as:

\[I(X, Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)\]

where \(H(X, Z)\) is the joint entropy of \(X\) and \(Z\), \(H(Y, Z)\) is the joint entropy of \(Y\) and \(Z\), \(H(X, Y, Z)\) is the joint entropy of \(X\), \(Y\), and \(Z\), and \(H(Z)\) is the entropy of \(Z\).
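The entropy decomposition above can be sketched with a plug-in (maximum-likelihood) estimate on discrete data. This is only an illustration of the formula, not the implementation of any estimator in this package; the helper names plugin_entropy and plugin_cmi are hypothetical.

```python
from collections import Counter
from math import log


def plugin_entropy(*columns, base=None):
    """Plug-in joint entropy of one or more discrete columns (hypothetical helper)."""
    joint = list(zip(*columns))  # one tuple per sample
    n = len(joint)
    h = -sum(c / n * log(c / n) for c in Counter(joint).values())
    return h if base is None else h / log(base)


def plugin_cmi(x, y, z, base=None):
    """I(X, Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return (
        plugin_entropy(x, z, base=base)
        + plugin_entropy(y, z, base=base)
        - plugin_entropy(x, y, z, base=base)
        - plugin_entropy(z, base=base)
    )


# X and Y are identical fair bits; Z is independent of both.
x = [0, 0, 1, 1]
y = [0, 0, 1, 1]
z = [0, 1, 0, 1]
cmi_independent_z = plugin_cmi(x, y, z, base=2)  # 2 + 2 - 2 - 1 bits

# Conditioning on X itself leaves nothing for Y to explain.
cmi_given_x = plugin_cmi(x, y, x, base=2)
```

In the first case the four entropies are 2, 2, 2 and 1 bit, so the CMI is 1 bit; in the second case all four are 1 bit and the CMI vanishes.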

Attributes:
*dataarray_like, shape (n_samples,)

The data used to estimate the conditional mutual information. You can pass an arbitrary number of data arrays as positional arguments.

condarray_like

The conditional data used to estimate the conditional mutual information.

normalizebool, optional

If True, normalize the data before analysis. Default is False.

baseint | float | “e”, optional

The logarithm base for the entropy calculation. The default can be set with set_logarithmic_unit().

Raises:
ValueError

If the data arrays have different lengths.

ValueError

If normalization is requested for non-1D data.

class infomeasure.estimators.base.ConditionalTransferEntropyEstimator(source, dest, *, cond=None, src_hist_len: int = 1, dest_hist_len: int = 1, cond_hist_len: int = 1, step_size: int = 1, prop_time=None, offset=None, base: int | float | str = 'e', **kwargs)[source]#

Bases: EffectiveValueMixin, StatisticalTestingMixin, Estimator[ConditionalTransferEntropyEstimator], ABC

Abstract base class for conditional transfer entropy estimators.

Conditional Transfer Entropy (CTE) from source \(X\) to destination \(Y\) given a condition \(Z\) quantifies the amount of information obtained about the destination variable through the source, conditioned on the condition.

Attributes:
sourcearray_like, shape (n_samples,)

The source data used to estimate the transfer entropy (X).

destarray_like, shape (n_samples,)

The destination data used to estimate the transfer entropy (Y).

condarray_like, shape (n_samples,)

The conditional data used to estimate the transfer entropy (Z).

step_sizeint

Step size between elements for the state space reconstruction.

src_hist_len, dest_hist_len, cond_hist_lenint

Number of past observations to consider for the source, destination, and conditional data.

prop_timeint, optional

Not compatible with the conditional transfer entropy.

baseint | float | “e”, optional

The logarithm base for the entropy calculation. The default can be set with set_logarithmic_unit().

class infomeasure.estimators.base.DiscreteHEstimator(*args, **kwargs)[source]#

Bases: EntropyEstimator, ABC

Abstract base class for discrete entropy estimators.

The DiscreteHEstimator class is an abstract base class extending EntropyEstimator. This class is specifically designed to handle entropy estimation for discrete variables. It ensures that input data is transformed into a format suitable for discrete entropy calculations, verifies input data validity, and reduces joint spaces where needed.

It works exclusively with symbolized or discretized data, allowing entropy computations to remain accurate and efficient for discrete variables. The class also manages situations where multiple random variables’ joint data can be reduced to simplified forms for further statistical analysis. The data, after processing, is represented using unique values and counts instead of directly storing the original data.

Attributes:
datatuple[DiscreteData]

A tuple containing DiscreteData objects. Each of them contains uniq, counts, N, K, and the original data array. For normal and joint entropy, len(data) = 1; for cross-entropy, len(data) = 2.

Methods

from_counts(uniq, counts[, base])

Construct a DiscreteHEstimator from the provided counts.

from_probabilities(uniq, probabilities[, base])

Construct a DiscreteHEstimator from the provided probabilities.

classmethod from_counts(uniq, counts, base: int | float | str = 'e', **kwargs)[source]#

Construct a DiscreteHEstimator from the provided counts.

DiscreteData validates the data integrity; other validations are skipped. This is used for the Jensen-Shannon divergence (JSD) with DiscreteHEstimator subclasses.

classmethod from_probabilities(uniq, probabilities, base: int | float | str = 'e', **kwargs)[source]#

Construct a DiscreteHEstimator from the provided probabilities.

DiscreteData validates the data integrity; other validations are skipped. This is used for the Jensen-Shannon divergence (JSD) with DiscreteHEstimator subclasses.
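As a rough illustration of the unique-values-and-counts representation described for this class, the sketch below reduces raw samples to uniq and counts and computes a plug-in entropy from either counts or probabilities. It uses only the standard library and does not call DiscreteData or the classmethods above; the function names are hypothetical.

```python
from collections import Counter
from math import log

data = ["a", "b", "a", "c", "a", "b"]

# Reduce the raw samples to unique values and their counts.
tally = Counter(data)
uniq = sorted(tally)
counts = [tally[u] for u in uniq]
n_samples = sum(counts)  # N: number of samples
n_unique = len(uniq)     # K: number of distinct values


def entropy_from_counts(counts, base=None):
    """Plug-in entropy from a list of counts (hypothetical helper)."""
    total = sum(counts)
    h = -sum(c / total * log(c / total) for c in counts if c > 0)
    return h if base is None else h / log(base)


def entropy_from_probabilities(probabilities, base=None):
    """Plug-in entropy from a list of probabilities (hypothetical helper)."""
    h = -sum(p * log(p) for p in probabilities if p > 0)
    return h if base is None else h / log(base)


h_counts = entropy_from_counts(counts, base=2)
h_probs = entropy_from_probabilities([c / n_samples for c in counts], base=2)
```

Both routes give the same value because the probabilities are just the counts divided by N.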

class infomeasure.estimators.base.EntropyEstimator(*data, base: int | float | str = 'e')[source]#

Bases: Estimator[EntropyEstimator], ABC

Abstract base class for entropy estimators.

Estimates simple entropy of a data array or joint entropy of two data arrays.

Attributes:
*dataarray_like, shape (n_samples,) or tuple of array_like

The data used to estimate the entropy. When passing a tuple of arrays, the joint entropy is considered. When passing two arrays, the cross-entropy is considered, with the second RV taken relative to the first RV.

baseint | float | “e”, optional

The logarithm base for the entropy calculation. The default can be set with set_logarithmic_unit().

Methods

local_vals()

Return the local values of the measure, if available.

Raises:
ValueError

If the data is not an array or arrays tuple/list.

Notes

  • Entropy: When passing one array-like object.

  • Joint Entropy: When passing one tuple of array-likes.

  • Cross-Entropy: When passing two array-like objects. The second distribution \(q\) is then considered relative to the first \(p\):

    \(-\sum_{i=1}^{n} p_i \log_b q_i\)
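The cross-entropy formula in the notes above can be sketched for discrete samples as follows. This is a plain plug-in illustration under the assumption that every value seen in the first sample also occurs in the second; the helper names pmf and cross_entropy_sketch are hypothetical and not part of this module.

```python
from collections import Counter
from math import log


def pmf(samples):
    """Relative frequencies of the values in a sample."""
    n = len(samples)
    return {value: count / n for value, count in Counter(samples).items()}


def cross_entropy_sketch(p_samples, q_samples, base=None):
    """-sum_i p_i log_b q_i, with q taken relative to p."""
    p, q = pmf(p_samples), pmf(q_samples)
    missing = [value for value in p if value not in q]
    if missing:
        # log q_i is undefined when q assigns zero mass to a value seen in p.
        raise ValueError(f"values without support in q: {missing}")
    h = -sum(p_i * log(q[value]) for value, p_i in p.items())
    return h if base is None else h / log(base)


p_samples = [0, 0, 1, 1]  # p = (1/2, 1/2)
q_samples = [0, 1, 1, 1]  # q = (1/4, 3/4)
h_pq = cross_entropy_sketch(p_samples, q_samples, base=2)
h_pp = cross_entropy_sketch(p_samples, p_samples, base=2)  # equals H(p)
```

Passing the same sample twice recovers the ordinary entropy, and the cross-entropy against a different distribution is never smaller than that.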

local_vals()[source]#

Return the local values of the measure, if available.

For cross-entropy, local values cannot be calculated.

Returns:
localarray_like

The local values of the measure.

Raises:
io.UnsupportedOperation

If the local values are not available.

class infomeasure.estimators.base.Estimator(base: int | float | str = 'e')[source]#

Bases: Generic[EstimatorType], ABC

Abstract base class for all measure estimators.

See Estimator Usage for how to use the estimators, and Available approaches for an overview of the available measures.

Attributes:
res_globalfloat | None

The global value of the measure. None if the measure is not calculated.

res_localarray_like | None

The local values of the measure. None if the measure is not calculated or if not defined.

baseint | float | “e”, optional

The logarithm base for the entropy calculation. The default can be set with set_logarithmic_unit().

Methods

calculate()

Calculate the measure.

global_val()

Return the global value of the measure.

local_vals()

Return the local values of the measure, if available.

result()

Return the global value of the measure.

Notes

The _calculate() method needs to be implemented in the derived classes, returning either the local values or the global value. When local values are returned, the global value is taken as their mean. If it is more efficient to calculate the global value directly, it is suggested to have _calculate() return only the global value and to provide a separate _extract_local_values() method for the local values, which is called lazily by local_vals() if needed. If the measure has a p-value, the p_value() method should be implemented (use StatisticalTestingMixin for standard implementations).
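The pattern described in these notes can be sketched with a toy class hierarchy. ToyEstimator and ToyDiscreteEntropy are hypothetical stand-ins written for this illustration; they mirror the method names mentioned above but are not the package's Estimator implementation.

```python
from abc import ABC, abstractmethod
from collections import Counter
from io import UnsupportedOperation
from math import log


class ToyEstimator(ABC):
    """Toy base class: global value eagerly, local values lazily."""

    def __init__(self):
        self.res_global = None
        self.res_local = None

    def calculate(self):
        result = self._calculate()
        if isinstance(result, (int, float)):
            self.res_global = float(result)  # only the global value was returned
        else:
            self.res_local = list(result)    # local values were returned
            self.res_global = sum(self.res_local) / len(self.res_local)

    def global_val(self):
        if self.res_global is None:
            self.calculate()
        return self.res_global

    def local_vals(self):
        if self.res_global is None:
            self.calculate()
        if self.res_local is None:
            self.res_local = list(self._extract_local_values())
        return self.res_local

    @abstractmethod
    def _calculate(self):
        """Return the global value or the local values."""

    def _extract_local_values(self):
        raise UnsupportedOperation("local values are not available")


class ToyDiscreteEntropy(ToyEstimator):
    def __init__(self, data):
        super().__init__()
        self.data = list(data)
        self.local_calls = 0  # counts how often local values are computed

    def _calculate(self):
        n = len(self.data)
        return -sum(c / n * log(c / n) for c in Counter(self.data).values())

    def _extract_local_values(self):
        self.local_calls += 1
        n = len(self.data)
        counts = Counter(self.data)
        return [-log(counts[value] / n) for value in self.data]


est = ToyDiscreteEntropy([0, 0, 1, 1])
h_global = est.global_val()
calls_after_global = est.local_calls  # local values not computed yet
h_local = est.local_vals()
est.local_vals()                      # second call reuses the stored values
```

The global value is available without ever touching the local values, and the local values are computed once on first request.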

final calculate() None[source]#

Calculate the measure.

Estimate the measure and store the results in the attributes.

final global_val() float[source]#

Return the global value of the measure.

Calculate the measure if not already calculated.

Returns:
globalfloat

The global value of the measure.

local_vals()[source]#

Return the local values of the measure, if available.

Returns:
localarray_like

The local values of the measure.

Raises:
io.UnsupportedOperation

If the local values are not available.

final result() float[source]#

Return the global value of the measure.

Calculate the measure if not already calculated.

Returns:
resultsfloat

The global value of the measure.

class infomeasure.estimators.base.MutualInformationEstimator(*data, offset: int = 0, normalize: bool = False, base: int | float | str = 'e', **kwargs)[source]#

Bases: StatisticalTestingMixin, Estimator[MutualInformationEstimator], ABC

Abstract base class for mutual information estimators.

Attributes:
*dataarray_like, shape (n_samples,)

The data used to estimate the mutual information. You can pass an arbitrary number of data arrays as positional arguments.

offsetint, optional

If two data arrays are provided: number of positions to shift the data arrays relative to each other, i.e. the delay/lag/shift between the variables. This is the assumed time taken for information to transfer from X to Y. Default is no shift.

normalizebool, optional

If True, normalize the data before analysis. Default is False.

baseint | float | “e”, optional

The logarithm base for the entropy calculation. The default can be set with set_logarithmic_unit().

Raises:
ValueError

If the data arrays have different lengths.

ValueError

If the offset is not an integer.

ValueError

If offset is used with more than two data arrays.
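To illustrate what an offset between two data arrays means, the sketch below pairs x[t] with y[t + offset] by trimming both lists. The sign convention and the helper name apply_offset are assumptions made for this example and may differ from how the estimators apply the shift internally.

```python
def apply_offset(x, y, offset=0):
    """Align x[t] with y[t + offset] by trimming both sequences (hypothetical helper)."""
    if not isinstance(offset, int):
        raise ValueError("offset must be an integer")
    if len(x) != len(y):
        raise ValueError("data arrays must have the same length")
    if offset > 0:
        return x[:-offset], y[offset:]
    if offset < 0:
        return x[-offset:], y[:offset]
    return x, y


x = [1, 2, 3, 4, 5]
y = [0, 1, 2, 3, 4]  # y repeats x one step later

x_shifted, y_shifted = apply_offset(x, y, offset=1)
x_back, y_back = apply_offset(x, y, offset=-1)
```

With offset=1 the two trimmed sequences line up exactly, which is the situation in which a lagged dependence becomes visible to a mutual information estimate.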

class infomeasure.estimators.base.TransferEntropyEstimator(source, dest, *, prop_time: int = 0, src_hist_len: int = 1, dest_hist_len: int = 1, step_size: int = 1, offset: int = None, base: int | float | str = 'e', **kwargs)[source]#

Bases: EffectiveValueMixin, StatisticalTestingMixin, Estimator[TransferEntropyEstimator], ABC

Abstract base class for transfer entropy estimators.

Attributes:
sourcearray_like, shape (n_samples,)

The source data used to estimate the transfer entropy (X).

destarray_like, shape (n_samples,)

The destination data used to estimate the transfer entropy (Y).

step_sizeint

Step size between elements for the state space reconstruction.

src_hist_len, dest_hist_lenint

Number of past observations to consider for the source and destination data.

prop_timeint, optional

Number of positions to shift the data arrays relative to each other (a multiple of step_size). This is the delay/lag/shift between the variables, representing the propagation time, i.e. the assumed time taken for information to transfer from source to destination. Default is no shift.

baseint | float | “e”, optional

The logarithm base for the entropy calculation. The default can be set with set_logarithmic_unit().

Raises:
ValueError

If the data arrays have different lengths.

ValueError

If the propagation time is not an integer.
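For intuition, transfer entropy with history lengths of 1, a step size of 1 and no propagation delay can be written as the conditional mutual information between the next destination value and the current source value, given the current destination value. The sketch below evaluates this with plug-in entropies on discrete data; it is an illustration under those assumptions, with hypothetical helper names, and not one of the estimators derived from this base class.

```python
from collections import Counter
from math import log


def plugin_entropy(*columns):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum(c / n * log(c / n) for c in Counter(joint).values())


def plugin_transfer_entropy(source, dest):
    """TE(source -> dest) with src_hist_len = dest_hist_len = step_size = 1."""
    if len(source) != len(dest):
        raise ValueError("source and dest must have the same length")
    y_next = dest[1:]     # destination at time t + 1
    y_past = dest[:-1]    # destination history at time t
    x_past = source[:-1]  # source history at time t
    return (
        plugin_entropy(y_next, y_past)
        + plugin_entropy(x_past, y_past)
        - plugin_entropy(y_next, x_past, y_past)
        - plugin_entropy(y_past)
    )


x = [0, 0, 1, 1, 0, 0, 1, 1, 0]
y = [0] + x[:-1]  # y copies x with a delay of one step

te_xy = plugin_transfer_entropy(x, y)
te_const = plugin_transfer_entropy([0] * len(y), y)  # a constant source carries no information
te_expected = plugin_entropy(y[1:], y[:-1]) - plugin_entropy(y[:-1])
```

Because y is fully determined by the previous value of x, the transfer entropy here reduces to the remaining uncertainty of the next y given only its own past.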

infomeasure.estimators.functional module#

Functional wrappers for information estimators.

This module provides functional interfaces to calculate entropy, mutual information, and transfer entropy. The estimators are dynamically imported based on the estimator name provided, saving time and memory by only importing the necessary classes.
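The idea of importing an implementation only when its name is requested can be sketched with importlib. The registry below is hypothetical: standard-library functions stand in for estimator classes, and the names _REGISTRY, _ALIASES and get_callable do not exist in this module.

```python
from importlib import import_module

# Hypothetical registry: approach name -> "module:attribute".
_REGISTRY = {
    "mean": "statistics:mean",
    "median": "statistics:median",
}
_ALIASES = {"avg": "mean", "med": "median"}


def get_callable(approach):
    """Resolve a name (or alias) and import its target on demand."""
    if approach is None:
        raise ValueError("approach must be given")
    name = _ALIASES.get(approach.lower(), approach.lower())
    try:
        target = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown approach: {approach!r}") from None
    module_name, attribute = target.split(":")
    module = import_module(module_name)  # imported only when requested
    return getattr(module, attribute)


mean_value = get_callable("avg")([1, 2, 3])
median_value = get_callable("MEDIAN")([1, 5, 9])

try:
    get_callable("unknown")
except ValueError as err:
    error_message = str(err)
```

Storing targets as strings keeps the registry cheap to build, since no estimator module is loaded until a caller asks for it.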

infomeasure.estimators.functional.conditional_mutual_information(*data, **kwargs: any)[source]#

Conditional mutual information between two variables given a third variable.

See mutual_information for more information.

infomeasure.estimators.functional.conditional_transfer_entropy(*data, **kwargs: any)[source]#

Conditional transfer entropy between two variables given a third variable.

See transfer_entropy for more information.

infomeasure.estimators.functional.cross_entropy(*data, **kwargs: any)[source]#

Calculate the cross-entropy using a functional interface of different estimators.

See entropy() for more details on the parameters and returns.

infomeasure.estimators.functional.entropy(*data, approach: str, **kwargs: any)[source]#

Calculate the (joint) entropy using a functional interface of different estimators.

Supports the following approaches:

  1. ansb: Asymptotic NSB entropy estimator.

  2. bayes: Bayesian entropy estimator.

  3. bonachela: Bonachela entropy estimator.

  4. [chao_shen, cs]: Chao-Shen entropy estimator.

  5. [chao_wang_jost, cwj]: Chao Wang Jost entropy estimator.

  6. discrete: Discrete entropy estimator.

  7. grassberger: Grassberger entropy estimator.

  8. kernel: Kernel entropy estimator.

  9. [metric, kl]: Kozachenko-Leonenko entropy estimator.

  10. [miller_madow, mm]: Miller-Madow entropy estimator.

  11. nsb: NSB (Nemenman-Shafee-Bialek) entropy estimator.

  12. [ordinal, symbolic, permutation]: Ordinal / Permutation entropy estimator.

  13. renyi: Renyi entropy estimator.

  14. [shrink, js]: Shrinkage (James-Stein) entropy estimator.

  15. tsallis: Tsallis entropy estimator.

  16. zhang: Zhang entropy estimator.

For the discrete Shannon entropy this is

\[\texttt{im.entropy(data_X, approach="discrete")} = H(X) = -\sum_{x \in X} p(x) \log p(x).\]

Here, the estimated pmf \(p(x)\) belongs to the RV \(X\). When two data arrays are passed, the cross-entropy is calculated instead:

\[\texttt{im.entropy(data_P, data_Q, ...)} = H_Q(P) = H_\times(P, Q) = -\sum_{x \in X} p(x) \log q(x)\]

For the cross-entropy \(H_Q(P)\), the estimated pmf \(p(x)\) belongs to the RV \(P\) and \(q(x)\) to the RV \(Q\). For other approaches, this formula is generalized in different forms.

Parameters:
*dataarray_like

The data used to estimate the entropy. For entropy, this can be an array-like. For joint entropy, pass the joint values inside a tuple. For cross-entropy, pass two separate parameters.

approachstr

The name of the estimator to use.

**kwargs: dict

Additional keyword arguments to pass to the estimator.

Returns:
float

The calculated entropy.

Raises:
ValueError

If the estimator is not recognised.
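The miller_madow approach listed for entropy() is named after the Miller-Madow bias correction. In its textbook form it adds (K - 1) / (2N) nats to the plug-in entropy, where K is the number of distinct observed values and N the number of samples. The sketch below implements that textbook formula with the standard library; the function name is hypothetical, and the package's estimator may differ in details.

```python
from collections import Counter
from math import log


def miller_madow_sketch(data, base=None):
    """Plug-in entropy plus the Miller-Madow correction (K - 1) / (2N)."""
    counts = Counter(data)
    n = sum(counts.values())  # N: number of samples
    k = len(counts)           # K: number of distinct observed values
    h_plugin = -sum(c / n * log(c / n) for c in counts.values())
    h_mm = h_plugin + (k - 1) / (2 * n)  # correction in nats
    return h_mm if base is None else h_mm / log(base)


h_mm_nats = miller_madow_sketch([0, 0, 1, 1])          # log(2) + 1/8
h_mm_bits = miller_madow_sketch([0, 0, 1, 1], base=2)  # same value in bits
```

The correction shrinks as N grows, so for large samples the result approaches the plain plug-in estimate.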

infomeasure.estimators.functional.estimator(*data, cond=None, measure: str = None, approach: str = None, step_size: int = 1, prop_time: int = 0, src_hist_len: int = 1, dest_hist_len: int = 1, cond_hist_len: int = 1, **kwargs: any) EstimatorType[source]#

Get an estimator for a specific measure.

This function provides a simple interface to get an Estimator for a specific measure.

If you are only interested in the global result, use the functional interfaces, such as entropy(), mutual_information(), and transfer_entropy().

Estimators available: see the approaches listed for entropy(), mutual_information(), and transfer_entropy().

Parameters:
*data

The data used to estimate the measure. For entropy: a single array-like; for joint entropy: a tuple of array-likes. For cross-entropy: two array-likes, with the second input RV taken relative to the first. For mutual information: an arbitrary number of array-likes. For transfer entropy: two array-likes, source and destination.

condarray_like, optional

The conditional data. Only used if the measure is conditional (conditional mutual information or conditional transfer entropy).

measurestr

The measure to estimate. Options: entropy, cross_entropy, mutual_information, transfer_entropy, conditional_mutual_information, conditional_transfer_entropy; aliases: h, hx, mi, te, cmi, cte.

approachstr

The name of the estimator to use. Find the available estimators in the docstring of this function.

*args: tuple

Additional arguments to pass to the estimator.

**kwargs: dict

Additional keyword arguments to pass to the estimator.

Returns:
Estimator

The estimator instance.

Raises:
ValueError

If the measure is not recognised.

infomeasure.estimators.functional.get_estimator_class(measure=None, approach=None) EstimatorType[source]#

Get estimator class based on the estimator name and approach.

This function returns the estimator class based on the measure and approach provided. If you want an instance of an estimator, initialized with data and parameters, use the functional interface estimator().

Parameters:
measurestr

The measure to estimate. Options: entropy, mutual_information, transfer_entropy, conditional_mutual_information, conditional_transfer_entropy. Aliases: h, mi, te, cmi, cte.

approachstr

The name of the estimator to use.

Returns:
class

The estimator class.

Raises:
ValueError

If the measure is not recognized.

ValueError

If the approach is not recognized.

infomeasure.estimators.functional.mutual_information(*data, approach: str, **kwargs: any)[source]#

Calculate the mutual information using a functional interface of different estimators.

Supports the following approaches:

  1. ansb: Asymptotic NSB mutual information estimator.

  2. bayes: Bayesian mutual information estimator.

  3. bonachela: Bonachela mutual information estimator.

  4. chao_shen: Chao-Shen mutual information estimator.

  5. chao_wang_jost: Chao Wang Jost mutual information estimator.

  6. discrete: Discrete mutual information estimator.

  7. grassberger: Grassberger mutual information estimator.

  8. kernel: Kernel mutual information estimator.

  9. [metric, ksg]: Kraskov-Stoegbauer-Grassberger mutual information estimator.

  10. [miller_madow, mm]: Miller-Madow mutual information estimator.

  11. nsb: NSB (Nemenman-Shafee-Bialek) mutual information estimator.

  12. [ordinal, symbolic, permutation]: Ordinal mutual information estimator.

  13. renyi: Renyi mutual information estimator.

  14. shrink: Shrinkage (James-Stein) mutual information estimator.

  15. tsallis: Tsallis mutual information estimator.

  16. zhang: Zhang mutual information estimator.

Parameters:
*dataarray_like

The data used to estimate the (conditional) mutual information.

condarray_like, optional

The conditional data used to estimate the conditional mutual information.

approachstr

The name of the estimator to use.

normalizebool, optional

If True, normalize the data before analysis. Default is False. Not available for the discrete estimator.

**kwargs: dict

Additional keyword arguments to pass to the estimator.

Returns:
float

The calculated mutual information.

Raises:
ValueError

If the estimator is not recognised.

infomeasure.estimators.functional.transfer_entropy(*data, approach: str, **kwargs: any)[source]#

Calculate the transfer entropy using a functional interface of different estimators.

Supports the following approaches:

  1. ansb: Asymptotic NSB transfer entropy estimator.

  2. bayes: Bayesian transfer entropy estimator.

  3. bonachela: Bonachela transfer entropy estimator.

  4. chao_shen: Chao-Shen transfer entropy estimator.

  5. chao_wang_jost: Chao Wang Jost transfer entropy estimator.

  6. discrete: Discrete transfer entropy estimator.

  7. grassberger: Grassberger transfer entropy estimator.

  8. kernel: Kernel transfer entropy estimator.

  9. [metric, ksg]: Kraskov-Stoegbauer-Grassberger transfer entropy estimator.

  10. [miller_madow, mm]: Miller-Madow transfer entropy estimator.

  11. nsb: NSB (Nemenman-Shafee-Bialek) transfer entropy estimator.

  12. [ordinal, symbolic, permutation]: Ordinal transfer entropy estimator.

  13. renyi: Renyi transfer entropy estimator.

  14. shrink: Shrinkage (James-Stein) transfer entropy estimator.

  15. tsallis: Tsallis transfer entropy estimator.

  16. zhang: Zhang transfer entropy estimator.

Parameters:
source, destarray_like

The source (X) and destination (Y) data used to estimate the transfer entropy.

condarray_like, optional

The conditional data used to estimate the conditional transfer entropy.

approachstr

The name of the estimator to use.

step_sizeint

Step size between elements for the state space reconstruction.

src_hist_len, dest_hist_lenint

Number of past observations to consider for the source and destination data.

prop_timeint, optional

Number of positions to shift the data arrays relative to each other, i.e. the delay/lag/shift between the variables. This is the assumed time taken for information to transfer from source to destination. Default is no shift. Not compatible with the cond parameter (conditional TE). Alternatively called offset.

*args: tuple

Additional arguments to pass to the estimator.

**kwargs: dict

Additional keyword arguments to pass to the estimator.

Returns:
float

The calculated transfer entropy.

Raises:
ValueError

If the estimator is not recognised.

infomeasure.estimators.mixins module#

Mixin classes for estimators from .base.py.

class infomeasure.estimators.mixins.DiscreteMIMixin(*args, **kwargs)[source]#

Bases: object

Mixin for handling discrete mutual information computations.

Provides utilities and checks necessary for estimating discrete mutual information and conditional mutual information. Ensures that input data is suitable for these calculations and provides warnings when pre-processing steps, such as symbolizing or discretizing, are required.

Attributes:
dataAny

The primary data to be used in mutual information estimation. It should be symbolized or discretized if it contains floating-point types.

condAny, optional

The conditional data for conditional mutual information estimation. If provided, it should also be symbolized or discretized if it contains floating-point types.

class infomeasure.estimators.mixins.DiscreteTEMixin[source]#

Bases: object

Mixin class for discrete transfer entropy calculations.

Provides functionality to validate input data types for transfer entropy estimation processes. Ensures that source, destination, and conditional datasets are properly symbolized or discretized to prevent invalid results from using continuous floating-point data.

Attributes:
sourcearray_like

The source data array utilized in transfer entropy calculations.

destarray_like

The destination data array utilized in transfer entropy calculations.

condarray_like, optional

The conditional data array utilized in transfer entropy calculations when applicable.

class infomeasure.estimators.mixins.EffectiveValueMixin(*args, **kwargs)[source]#

Bases: StatisticalTestingMixin

Mixin for effective value calculation.

To be used as a mixin class with TransferEntropyEstimator derived classes. Inherit before the main class.

Attributes:
res_effectivefloat | None

The effective transfer entropy.

Methods

effective_val([method])

Return the effective value.

Notes

The effective value is the difference between the original value and the value calculated for the permuted data.
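The definition in this note can be sketched as follows. For brevity the example uses a plug-in mutual information as the measure and a single shuffle of the first variable as the permuted data; both choices, the helper names, and the use of random.Random are assumptions of this illustration rather than the mixin's actual procedure.

```python
import random
from collections import Counter
from math import log


def plugin_entropy(*columns):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum(c / n * log(c / n) for c in Counter(joint).values())


def plugin_mi(x, y):
    """I(X, Y) = H(X) + H(Y) - H(X, Y)."""
    return plugin_entropy(x) + plugin_entropy(y) - plugin_entropy(x, y)


def effective_value(x, y, seed=None):
    """Original value minus the value obtained after shuffling x."""
    rng = random.Random(seed)
    original = plugin_mi(x, y)
    shuffled = list(x)
    rng.shuffle(shuffled)  # destroys the pairing between x and y
    surrogate = plugin_mi(shuffled, y)
    return original - surrogate, original, surrogate


x = [0, 1] * 10
y = list(x)  # perfectly dependent on x
effective, original, surrogate = effective_value(x, y, seed=0)
```

Subtracting the surrogate value removes the part of the estimate that would also appear between unrelated sequences of the same length and value distribution.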

effective_val(method: str = None)[source]#

Return the effective value.

Calculates the effective value if not already done, otherwise returns the stored value.

Returns:
effectivefloat

The effective value.

class infomeasure.estimators.mixins.RandomGeneratorMixin(*args, seed=None, **kwargs)[source]#

Bases: object

Mixin for random state generation.

Attributes:
rngGenerator

The random state generator.
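A seed-accepting mixin of this shape can be sketched with cooperative __init__ calls. The classes below are hypothetical, and random.Random stands in for the generator type used by the package; the point is only to show why the mixin is listed before the main class.

```python
import random


class SeededMixin:
    """Consumes the seed keyword and passes everything else along."""

    def __init__(self, *args, seed=None, **kwargs):
        self.rng = random.Random(seed)      # stand-in for the documented Generator
        super().__init__(*args, **kwargs)   # continue along the MRO


class ToyBase:
    def __init__(self, data):
        self.data = list(data)


class ToyShuffler(SeededMixin, ToyBase):  # mixin listed before the main class
    def shuffled(self):
        out = list(self.data)
        self.rng.shuffle(out)
        return out


first = ToyShuffler([1, 2, 3, 4, 5], seed=42).shuffled()
second = ToyShuffler([1, 2, 3, 4, 5], seed=42).shuffled()
```

Because SeededMixin comes first in the method resolution order, it removes seed from the keyword arguments before ToyBase.__init__ sees them, and equal seeds give equal shuffles.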

class infomeasure.estimators.mixins.StatisticalTestingMixin(*args, **kwargs)[source]#

Bases: RandomGeneratorMixin

Mixin for comprehensive statistical testing including p-values, t-scores, and confidence intervals.

There are two methods to perform statistical tests:

  • Permutation test: shuffle the data and calculate the measure.

  • Bootstrap: resample the data and calculate the measure.

The statistical_test() method provides comprehensive statistical analysis including p-value, t-score, and confidence intervals in a single call.

To be used as a mixin class with other Estimator classes. Inherit before the main class.

Methods

statistical_test([n_tests, method])

Perform comprehensive statistical test including p-value, t-score, and confidence intervals.

Raises:
NotImplementedError

If the statistical test is not implemented for the estimator.

Notes

The permutation test is a non-parametric statistical test to determine if the observed effect is significant. The null hypothesis is that the measure is not different from random, and the p-value is the proportion of permuted measures greater than the observed measure.

Confidence intervals are calculated using percentiles of the null distribution from the resampling procedure.
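The permutation test described in these notes can be sketched with a plug-in mutual information as the measure. The helper names are hypothetical, random.Random is used in place of the package's generator, and the p-value follows the wording above (proportion of permuted values greater than the observed one); this is an illustration, not the statistical_test() implementation.

```python
import random
from collections import Counter
from math import log


def plugin_entropy(*columns):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum(c / n * log(c / n) for c in Counter(joint).values())


def plugin_mi(x, y):
    return plugin_entropy(x) + plugin_entropy(y) - plugin_entropy(x, y)


def permutation_p_value(x, y, n_tests=200, seed=None):
    """Shuffle x repeatedly and compare the permuted values with the observed one."""
    if not isinstance(n_tests, int) or n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    rng = random.Random(seed)
    observed = plugin_mi(x, y)
    shuffled = list(x)
    null = []
    for _ in range(n_tests):
        rng.shuffle(shuffled)
        null.append(plugin_mi(shuffled, y))
    p_value = sum(value > observed for value in null) / n_tests
    return observed, p_value, null


x = [0, 1] * 10
y = list(x)  # perfectly dependent on x
observed, p_value, null = permutation_p_value(x, y, n_tests=100, seed=1)
```

For this perfectly dependent pair the observed value already equals its maximum, so no permuted value can exceed it and the p-value is zero.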

statistical_test(n_tests: int = None, method: str = None) StatisticalTestResult[source]#

Perform comprehensive statistical test including p-value, t-score, and confidence intervals.

Method can be “permutation_test” or “bootstrap”.

  • Permutation test: shuffle the data and calculate the measure.

  • Bootstrap: resample the data and calculate the measure.

Parameters:
n_testsint, optional

Number of permutations or bootstrap samples. Needs to be a positive integer. Default is the value set in the configuration.

methodstr, optional

The method to calculate the statistical test. Options are “permutation_test” or “bootstrap”. Default is the value set in the configuration.

Returns:
StatisticalTestResult

Comprehensive statistical test result containing p-value, t-score, and metadata. Percentiles can be calculated on demand using the percentile() method.

Raises:
ValueError

If the chosen method is unknown.

io.UnsupportedOperation

If the statistical test is not supported for the estimator type.

class infomeasure.estimators.mixins.WorkersMixin(*args, workers=1, **kwargs)[source]#

Bases: object

Mixin that adds an attribute for the number of workers to use.

Attributes:
n_workersint, optional

The number of workers to use. Default is 1. Use -1 to use as many workers as there are CPU cores available.

Module contents#

Measures module for infomeasure package.