infomeasure.estimators.entropy package

Submodules#

infomeasure.estimators.entropy.ansb module#

Module for the Asymptotic NSB entropy estimator.

class infomeasure.estimators.entropy.ansb.AnsbEntropyEstimator(*data, K: int = None, undersampled: float = 0.1, base: int | float | str = 'e')[source]#

Bases: DiscreteHEstimator

Asymptotic NSB entropy estimator.

The Asymptotic NSB (ANSB) estimator provides entropy estimation for extremely undersampled discrete data where the number of unique values K is comparable to the sample size N.

\[\hat{H}_{\text{ANSB}} = (C_\gamma - \log(2)) + 2 \log(N) - \psi(\Delta)\]

where \(C_\gamma \approx 0.5772156649\dots\) is Euler’s constant, \(\psi\) is the digamma function, and \(\Delta = N - K\) is the number of coincidences (repeated observations) in the data.

This estimator is specifically designed for the extremely undersampled regime where \(K \sim N\); its estimate diverges with \(N\) when the data are well-sampled. The ANSB estimator requires that \(N/K \to 0\), which is checked by default using the undersampled parameter [NBvS04].

If there are no coincidences in the data (\(\Delta = 0\)), ANSB returns NaN as the estimator is undefined in this case.
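
The formula above can be evaluated by hand as a sanity check. The following is a minimal sketch using only the standard library, not the library's implementation: the helper names `digamma_int` and `ansb_entropy` are hypothetical, the support size is assumed to be the observed one, and the digamma function is evaluated only at positive integers through the identity \(\psi(n) = -C_\gamma + \sum_{k=1}^{n-1} 1/k\).

```python
import math
from collections import Counter

EULER_GAMMA = 0.5772156649015329  # Euler's constant C_gamma


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def ansb_entropy(data) -> float:
    """ANSB estimate in nats (hypothetical helper, observed support size)."""
    n_obs = len(data)
    k_obs = len(Counter(data))  # number of unique values K
    delta = n_obs - k_obs       # number of coincidences
    if delta == 0:
        return math.nan         # undefined without coincidences
    return (EULER_GAMMA - math.log(2)) + 2 * math.log(n_obs) - digamma_int(delta)


h = ansb_entropy([1, 2, 3, 4, 5, 1, 2])
print(round(h, 4))  # 3.3531
```

With seven observations and five unique values there are two coincidences, so the estimate is defined; a sample without repeated values yields NaN, as described above.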

Parameters:

*data (array_like): The data used to estimate the entropy.
K (int, optional): The support size. If not provided, uses the observed support size.
undersampled (float, default=0.1): Maximum allowed ratio N/K to consider data sufficiently undersampled. A warning is issued if this threshold is exceeded.
base (LogBaseType, default=Config.get("base")): The logarithm base for entropy calculation.

Attributes:

*data (array_like): The data used to estimate the entropy.

Notes

The ANSB estimator is based on the asymptotic expansion of the NSB estimator for the case of extreme undersampling. It provides a computationally efficient alternative to the full NSB estimator when \(K \sim N\).

Examples

>>> import infomeasure as im
>>> data = [1, 2, 3, 4, 5, 1, 2]  # Some repeated values
>>> im.entropy(data, approach='ansb')
np.float64(3.353104447353747)

infomeasure.estimators.entropy.bayes module#

Module for the Bayesian entropy estimator.

class infomeasure.estimators.entropy.bayes.BayesEntropyEstimator(*data, alpha: float | str, K: int = None, base: int | float | str = 'e')[source]#

Bases: DiscreteHEstimator

Bayesian entropy estimator.

Computes an estimate of Shannon entropy using Bayesian probability estimates with a Dirichlet prior characterized by concentration parameter α. This approach provides a principled way to handle sparse data and incorporate prior knowledge about the probability distribution.

The Bayesian probabilities are calculated as:

\[p_k^{\text{Bayes}} = \frac{n_k + \alpha}{N + K \alpha}\]

where \(n_k\) is the count of symbol \(k\), \(N\) is the total number of observations, \(K\) is the support size (number of unique symbols), and \(\alpha\) is the concentration parameter of the Dirichlet prior.

The entropy is then \(-\sum p_k^{\text{Bayes}} \log p_k^{\text{Bayes}}\), computed in the same way as for the maximum likelihood entropy estimator. Local entropy values are also supported.

Concentration Parameter Choices

The concentration parameter α controls the strength of the prior belief in a uniform distribution. Several well-established choices are available:

Jeffreys Prior (α = 0.5 = "jeffrey"): Non-informative prior that is invariant under reparameterization. Provides good performance for most applications [KT81].
Laplace Prior (α = 1.0 = "laplace"): Uniform prior that adds one pseudocount to each symbol [BP63]. Simple and widely used, equivalent to add-one smoothing.
Schürmann-Grassberger Prior (α = 1/K = "sch-grass"): Adaptive prior that scales with the alphabet size. Particularly effective for large alphabets.
Minimax Prior (α = √N/K = "min-max"): Minimises the maximum expected loss. Balances between sample size and alphabet size.
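
The probability formula and the named priors above can be sketched in a few lines. This is an illustrative sketch, not the library's code: `resolve_alpha` and `bayes_entropy` are hypothetical helpers, and the support size \(K\) is assumed to be the observed number of unique symbols.

```python
import math
from collections import Counter


def resolve_alpha(alpha, n_obs, k):
    """Map the named priors listed above to a numeric concentration parameter."""
    named = {
        "jeffrey": 0.5,
        "laplace": 1.0,
        "sch-grass": 1.0 / k,
        "min-max": math.sqrt(n_obs) / k,
    }
    return named[alpha] if isinstance(alpha, str) else float(alpha)


def bayes_entropy(data, alpha):
    """Entropy in nats of the Dirichlet-smoothed probabilities."""
    counts = Counter(data)
    n_obs, k = len(data), len(counts)  # assumption: K is the observed support size
    a = resolve_alpha(alpha, n_obs, k)
    probs = [(c + a) / (n_obs + k * a) for c in counts.values()]
    return -sum(p * math.log(p) for p in probs)


# Laplace prior on counts (3, 1): probabilities become 4/6 and 2/6
h_laplace = bayes_entropy([0, 0, 0, 1], "laplace")
```

Larger values of α pull the probabilities towards the uniform distribution, so the estimate moves towards \(\log K\).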

Attributes:

*data (array_like): The data used to estimate the entropy.
alpha (float): The concentration parameter α of the Dirichlet prior.
K (int, optional): The support size. If not provided, uses the observed support size.

property bayes_probs#

property dist_dict#: Return the Bayesian distribution dictionary for JSD.

infomeasure.estimators.entropy.bonachela module#

Module for the Bonachela entropy estimator.

class infomeasure.estimators.entropy.bonachela.BonachelaEntropyEstimator(*args, **kwargs)[source]#

Bases: DiscreteHEstimator

Bonachela (Bonachela-Hinrichsen-Muñoz) entropy estimator for discrete data.

The Bonachela estimator computes the Shannon entropy using the formula from [BHM08]:

\[\hat{H}_{B} = \frac{1}{N+2} \sum_{i=1}^{K} \left( (n_i + 1) \sum_{j=n_i + 2}^{N+2} \frac{1}{j} \right)\]

where \(n_i\) are the counts for each unique value, \(K\) is the number of unique values, and \(N\) is the total number of observations.

This estimator is specially designed to provide a compromise between low bias and small statistical errors for short data series, particularly when the data sets are small and the probabilities are not close to zero.
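
As a rough illustration of the formula above, the double sum can be evaluated directly. The function name `bonachela_entropy` is hypothetical, and the sketch returns the value in nats without any of the library's input handling.

```python
from collections import Counter


def bonachela_entropy(data) -> float:
    """Direct evaluation of the Bonachela formula (result in nats)."""
    counts = Counter(data).values()
    n_obs = len(data)
    total = 0.0
    for n_i in counts:
        # inner harmonic-type sum over j = n_i + 2, ..., N + 2
        inner = sum(1.0 / j for j in range(n_i + 2, n_obs + 3))
        total += (n_i + 1) * inner
    return total / (n_obs + 2)


h_b = bonachela_entropy([1, 1, 2, 2])
```

For two symbols observed twice each, every symbol contributes \(3 \cdot (1/4 + 1/5 + 1/6)\), and the result is divided by \(N + 2 = 6\).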

Attributes:

*data (array_like): The data used to estimate the entropy.

infomeasure.estimators.entropy.chao_shen module#

Module for the Chao-Shen entropy estimator.

class infomeasure.estimators.entropy.chao_shen.ChaoShenEntropyEstimator(*args, **kwargs)[source]#

Bases: DiscreteHEstimator

Chao-Shen entropy estimator.

\[\hat{H}_{CS} = - \sum_{i=1}^{K} \frac{\hat{p}_i^{CS} \log \hat{p}_i^{CS}}{1 - (1 - \hat{p}_i^{ML} C)^N}\]

where

\[\hat{p}_i^{CS} = C \cdot \hat{p}_i^{ML}\]

and \(C = 1 - \frac{f_1}{N}\) is the estimated coverage, \(f_1\) is the number of singletons (species observed exactly once), \(\hat{p}_i^{ML}\) is the maximum likelihood probability estimate, \(N\) is the sample size, and \(K\) is the number of observed species [CS03]. The Chao-Shen estimator provides a bias-corrected estimate of Shannon entropy that accounts for unobserved species through coverage estimation.
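
The coverage correction can be sketched as follows. This is a minimal sketch under stated assumptions rather than the library's implementation: `chao_shen_entropy` is a hypothetical name, and replacing \(f_1 = N\) by \(N - 1\) to avoid zero coverage is an assumption of this sketch that the text above does not specify.

```python
import math
from collections import Counter


def chao_shen_entropy(data) -> float:
    """Chao-Shen estimate in nats, following the formula above."""
    counts = list(Counter(data).values())
    n_obs = len(data)
    f1 = sum(1 for c in counts if c == 1)  # number of singletons
    if f1 == n_obs:
        f1 = n_obs - 1  # assumption: keep the coverage strictly positive
    coverage = 1.0 - f1 / n_obs
    h = 0.0
    for c in counts:
        p_cs = coverage * c / n_obs  # coverage-adjusted probability
        # weight by the inverse probability of observing the symbol at all
        h -= p_cs * math.log(p_cs) / (1.0 - (1.0 - p_cs) ** n_obs)
    return h


h_cs = chao_shen_entropy([1, 1, 2, 2])
```

Without singletons the coverage is 1, so only the denominator differs from the maximum likelihood estimate.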

Attributes:

*data (array_like): The data used to estimate the entropy.

infomeasure.estimators.entropy.chao_wang_jost module#

Module for the Chao-Wang-Jost entropy estimator.

class infomeasure.estimators.entropy.chao_wang_jost.ChaoWangJostEntropyEstimator(*args, **kwargs)[source]#

Bases: DiscreteHEstimator

Advanced bias-corrected Shannon entropy estimator using coverage estimation.

The Chao-Wang-Jost estimator provides improved entropy estimates for incomplete sampling scenarios by accounting for unobserved species. It is particularly valuable for ecological data, text analysis, or any discrete distribution where the sample may not capture all possible outcomes, such as diversity studies with incomplete species sampling or corpora whose vocabulary is only partly observed.

Standard entropy estimators systematically underestimate entropy in finite samples, especially when sampling is incomplete. This estimator uses the counts of rare species (singletons and doubletons) to estimate sample coverage and correct for unobserved species, which gives reliable estimates even for small or incomplete samples. Its theoretical foundation in species accumulation curves and Good-Turing frequency estimation provides a statistical framework for addressing sampling bias.

Mathematical Foundation:

The estimator combines observed entropy with a correction term based on coverage estimation:

\[\hat{H}_{\text{CWJ}} = \sum_{1 \leq n_i \leq N-1} \frac{n_i}{N} \left(\sum_{k=n_i}^{N-1} \frac{1}{k} \right) + \frac{f_1}{N} (1 - A)^{-N + 1} \left\{ - \log(A) - \sum_{r=1}^{N-1} \frac{1}{r} (1 - A)^r \right\}\]

where the coverage parameter \(A\) is estimated as:

\[\begin{split}A = \begin{cases} \frac{2 f_2}{(N-1) f_1 + 2 f_2} \, & \text{if} \, f_2 > 0 \\ \frac{2}{(N-1)(f_1 - 1) + 2} \, & \text{if} \, f_2 = 0, \; f_1 \neq 0 \\ 1, & \text{if} \, f_1 = f_2 = 0 \end{cases}\end{split}\]

Here, \(f_1\) represents the number of singletons (species observed exactly once) and \(f_2\) the number of doubletons (species observed exactly twice) in the sample [CWJ13].

Attributes:

*data (array_like): The data used to estimate the entropy.

See also

infomeasure.estimators.functional.entropy: Functional interface for entropy calculation
infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimator: Standard maximum likelihood entropy estimator

Notes

The algorithm is adapted from the entropart R library [MH15]
The correction becomes negligible when samples are complete (\(f_1 = f_2 = 0\))

Examples

>>> import infomeasure as im
>>>
>>> # Basic usage with incomplete sampling scenario
>>> data = [1, 1, 2, 3, 4, 5]  # Many singletons suggest incomplete sampling
>>> h_cwj = im.entropy(data, approach="chao_wang_jost", base=2)
>>> h_standard = im.entropy(data, approach="discrete", base=2)
>>> print(f"Chao-Wang-Jost: {h_cwj:.3f} bits")
Chao-Wang-Jost: 3.635 bits
>>> print(f"Standard: {h_standard:.3f} bits")
Standard: 2.252 bits
>>>
>>> # Ecological diversity example
>>> species_counts = [1, 1, 1, 2, 2, 3, 5, 8]  # Species abundance data
>>> diversity = im.entropy(species_counts, approach="cwj", base="e")
>>> print(f"Species diversity: {diversity:.3f} nats")
Species diversity: 2.054 nats

infomeasure.estimators.entropy.discrete module#

Module for the discrete entropy estimator.

class infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimator(*args, **kwargs)[source]#

Bases: DiscreteHEstimator

Standard Shannon entropy estimator for discrete data using maximum likelihood.

The discrete entropy estimator computes the Shannon entropy using the classical maximum likelihood approach:

\[\hat{H} = -\sum_{i=1}^{K} \hat{p}_i \log \hat{p}_i\]

where \(\hat{p}_i = \frac{n_i}{N}\) are the empirical probabilities, \(n_i\) are the counts for each unique value \(i\), \(K\) is the number of unique values, and \(N\) is the total number of observations.

This is the most fundamental entropy estimator and serves as the baseline for comparison with other bias-corrected estimators. While it provides an asymptotically unbiased estimate of the true entropy, it can exhibit significant bias for small sample sizes, particularly when the number of unique values is large relative to the sample size.

The estimator is suitable for:

Large datasets where bias is minimal
Baseline comparisons with bias-corrected estimators
Applications where computational simplicity is preferred
Well-sampled distributions with sufficient observations per unique value

For small sample sizes or distributions with many rare events, consider using bias-corrected estimators such as ChaoShenEntropyEstimator, BonachelaEntropyEstimator, or ZhangEntropyEstimator.

Attributes:

*data (array_like): The data used to estimate the entropy. For joint entropy, multiple arrays can be provided.
base (float or str, default=Config.get("base")): The logarithm base for entropy calculation. Common values are 2 (bits), 10 (dits), or 'e' (nats).

Examples

>>> import infomeasure as im
>>> # Simple entropy calculation
>>> data = [1, 1, 2, 3, 3, 4, 5]
>>> entropy_value = im.entropy(data, approach="discrete")
>>> print(f"Entropy: {entropy_value:.3f} nats")
Entropy: 1.550 nats
>>> # Local values
>>> estimator = im.estimator(data, measure="h", approach="discrete")
>>> estimator.local_vals()
array([1.25276297, 1.25276297, 1.94591015, 1.25276297, 1.25276297,
   1.94591015, 1.94591015])

property dist_dict#: Return the distribution dictionary for JSD.

infomeasure.estimators.entropy.grassberger module#

Module for the discrete Grassberger entropy estimator.

class infomeasure.estimators.entropy.grassberger.GrassbergerEntropyEstimator(*args, **kwargs)[source]#

Bases: DiscreteHEstimator

Discrete Grassberger entropy estimator.

\[\hat{H}_{\text{Gr88}} = \sum_i \frac{n_i}{N} \left(\log(N) - \psi(n_i) - \frac{(-1)^{n_i}}{n_i + 1} \right)\]

\(\hat{H}_{\text{Gr88}}\) is the Grassberger entropy, where \(n_i\) are the counts, \(N\) is the total number of observations, and \(\psi\) is the digamma function [Gra08, Gra88].
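
A direct evaluation of this sum might look like the sketch below. The counts are weighted by \(n_i / N\), with \(N\) the total number of observations; `grassberger_entropy` and `digamma_int` are hypothetical helpers, and the digamma function is evaluated only at positive integers via harmonic numbers.

```python
import math
from collections import Counter

EULER_GAMMA = 0.5772156649015329


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def grassberger_entropy(data) -> float:
    """Grassberger (1988) estimate in nats, evaluated term by term."""
    counts = Counter(data).values()
    n_obs = len(data)
    return sum(
        c / n_obs * (math.log(n_obs) - digamma_int(c) - (-1) ** c / (c + 1))
        for c in counts
    )


h_gr = grassberger_entropy([1, 1, 2])
```

The alternating term \((-1)^{n_i} / (n_i + 1)\) raises the contribution of symbols with odd counts and lowers it for even counts.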

Attributes:

*data (array_like): The data used to estimate the entropy.

infomeasure.estimators.entropy.kernel module#

Module for the kernel entropy estimator.

class infomeasure.estimators.entropy.kernel.KernelEntropyEstimator(*data, bandwidth: float | int, kernel: str, workers: int = 1, base: int | float | str = 'e')[source]#

Bases: WorkersMixin, EntropyEstimator

Kernel entropy estimator for continuous data using Kernel Density Estimation (KDE).

The kernel entropy estimator computes the differential Shannon entropy by estimating the probability density function using kernel density estimation:

\[\hat{H}(X) = -\int \hat{f}(x) \log \hat{f}(x) \, dx \approx -\frac{1}{N} \sum_{i=1}^{N} \log \hat{f}(x_i)\]

where \(\hat{f}(x)\) is the kernel density estimate:

\[\hat{f}(x) = \frac{1}{N h^d} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right)\]

with \(K(\cdot)\) being the kernel function, \(h\) the bandwidth parameter, \(d\) the dimensionality, and \(N\) the number of data points.

For joint entropy of multiple variables, the estimator concatenates the variables into a single multivariate space and applies the same KDE approach.

The estimator supports both Gaussian and box (uniform) kernels. The choice of bandwidth is critical: small values can lead to under-smoothing and overfitting, while large values may over-smooth the data and obscure important features [GP25, Sil86].
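
To make the resubstitution formula concrete, here is a minimal one-dimensional sketch with a Gaussian kernel. It is not the library's implementation: `gaussian_kde_entropy` is a hypothetical name, the density at each point includes that point's own kernel contribution (an assumption of this sketch), and the plain double loop is only suitable for small samples.

```python
import math


def gaussian_kde_entropy(xs, bandwidth):
    """Return (entropy estimate in nats, local values) for 1D data."""
    n = len(xs)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))  # 1 / (N h^d) with d = 1
    local = []
    for x in xs:
        # kernel density estimate f(x) evaluated at the sample point itself
        density = norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs
        )
        local.append(-math.log(density))
    return sum(local) / n, local


h_kde, local_vals = gaussian_kde_entropy([0.1, -0.4, 0.8, 1.5, -1.2], bandwidth=0.5)
```

The overall estimate is the mean of the local values, which mirrors the sample-average form on the right-hand side of the first equation.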

Parameters:

*data (array_like)

The continuous data used to estimate the entropy. For univariate entropy, pass a single array. For joint entropy, pass multiple arrays.

bandwidth (float | int)

The bandwidth parameter for the kernel. Controls the smoothness of the density estimate.

kernel (str)

Type of kernel to use. Supported options are:

'gaussian': Gaussian (normal) kernel
'box': Box (uniform) kernel

Compatible with the KDE implementation kde_probability_density_function().

workers (int, optional)

Number of workers to use for parallel processing. Default is 1 (no parallelization). If set to -1, all available CPU cores will be used.

base (float | str, optional)

Logarithm base for entropy calculation. Default is from global configuration.

Attributes:

*data (array_like): The data used to estimate the entropy.
bandwidth (float | int): The bandwidth for the kernel.
kernel (str): Type of kernel to use.
workers (int): Number of workers to use for parallel processing.

Returns:

array_like: Local entropy values for each data point when calling entropy calculation methods. The mean of these values gives the overall entropy estimate.

See also

infomeasure.estimators.utils.kde.kde_probability_density_function: Underlying KDE implementation
infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimator: For discrete data entropy estimation

Notes

Bandwidth Selection: The bandwidth parameter critically affects the quality of the entropy estimate. A small bandwidth can lead to under-smoothing and high variance, while a large bandwidth may over-smooth the data, obscuring important details and introducing bias.

Kernel Choice:

Gaussian kernels provide smooth density estimates and are theoretically well-founded
Box kernels are computationally efficient and provide non-parametric estimates

Computational Complexity: The algorithm has O(N²) complexity for box kernels using KDTree queries, and varies for Gaussian kernels depending on the implementation.

Cross-entropy: Supported between two distributions by evaluating the density of the second distribution at points from the first distribution.

Examples

>>> import infomeasure as im
>>> from numpy.random import default_rng
>>> rng = default_rng(281769)
>>> # Generate sample data
>>> data = rng.normal(0, 1, 1000)
>>>
>>> # Create estimator
>>> estimator = im.estimator(data, measure="h", approach="kernel", bandwidth=0.5, kernel='gaussian')
>>>
>>> # Calculate entropy
>>> estimator.result()
np.float64(1.366015332652949)
>>> # Local values
>>> estimator.local_vals()
array([1.54017083, 1.35855839, 0.97949819, 0.97333173, 2.62084886,
   ...
   1.08174049, 0.97418054, 1.88055967, 0.99614516, 0.98548583])

infomeasure.estimators.entropy.kozachenko_leonenko module#

Module for the Kozachenko-Leonenko entropy estimator.

class infomeasure.estimators.entropy.kozachenko_leonenko.KozachenkoLeonenkoEntropyEstimator(*data, k: int = 4, ksg_id: int = 1, noise_level=1e-10, minkowski_p=inf, base: int | float | str = 'e')[source]#

Bases: RandomGeneratorMixin, EntropyEstimator

Kozachenko-Leonenko entropy estimator for continuous data.

The Kozachenko-Leonenko estimator computes the Shannon entropy of continuous data using nearest neighbor distances. The estimator is based on the method from [KL87] and follows the implementation approach described in [KSG11].

\[\hat{H}_{KL} = -\psi(k) + \psi(N) + \log(c_d) + \frac{d}{N} \sum_{i=1}^{N} \log(2\rho_{k,i})\]

where \(\psi\) is the digamma function, \(k\) is the number of nearest neighbors, \(N\) is the number of data points, \(d\) is the dimensionality, \(c_d\) is the volume of the \(d\)-dimensional unit ball for the chosen Minkowski norm, and \(\rho_{k,i}\) is the distance to the \(k\)-th nearest neighbor of point \(i\).

This estimator is particularly suitable for continuous multivariate data and provides asymptotically unbiased estimates of differential entropy. The method works by exploiting the relationship between nearest neighbor distances and local density, making it effective for high-dimensional data where traditional histogram-based methods fail.

Parameters:

*data (array_like): The continuous data used to estimate the entropy. For multivariate data, each variable should be a column.
k (int, default=4): The number of nearest neighbors to consider. Higher values provide more stable estimates but may introduce bias. The default value of 4 is recommended by [KSG11].
noise_level (float, default=1e-10): The standard deviation of Gaussian noise added to the data to avoid issues with zero distances between identical points. Set to 0 to disable noise addition.
minkowski_p (float, default=inf): The power parameter for the Minkowski metric used in distance calculations. Common values are 2 (Euclidean distance) and inf (maximum norm/Chebyshev distance). Must satisfy \(1 \leq p \leq \infty\).
ksg_id (int, default=1): The KSG estimator variant to use (1 or 2). Type I uses the standard formula. Type II uses a modified formula with \(\psi(k) - 1/k\).
base (LogBaseType, default=Config.get("base")): The logarithm base for entropy calculation. Can be 2, 10, "e", or any positive number.

Attributes:

*data (tuple[array_like]): The processed data used to estimate the entropy, converted to 2D arrays.
k (int): The number of nearest neighbors to consider.
noise_level (float): The standard deviation of the Gaussian noise added to the data.
minkowski_p (float): The power parameter for the Minkowski metric.
ksg_id (int): The KSG estimator variant to use.

Raises:

ValueError: If the number of nearest neighbors is not a positive integer.
ValueError: If the noise level is negative.
ValueError: If the Minkowski power parameter is invalid (not in range [1, ∞]).

Notes

The choice of the number of nearest neighbors \(k\) affects the bias-variance tradeoff of the estimator. Smaller values of \(k\) reduce bias but increase variance, while larger values have the opposite effect. The default value of \(k=4\) provides a good balance for most applications.

The noise addition helps handle datasets with repeated values or points that are exactly identical, which would otherwise result in zero distances and numerical issues. The noise level should be small enough not to significantly alter the underlying distribution.

For high-dimensional data, the curse of dimensionality may affect the estimator’s performance, as nearest neighbor distances become less informative. In such cases, dimensionality reduction or alternative entropy estimation methods may be preferable.

Examples

>>> import numpy as np
>>> import infomeasure as im
>>>
>>> # Generate 2D Gaussian data
>>> np.random.seed(176250)
>>> data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 1000)
>>>
>>> # Estimate entropy
>>> estimator = im.estimator(data, measure="h", approach="kl", k=4)
>>> entropy_value = estimator.result()
>>> print(f"Estimated entropy: {entropy_value:.3f}")
Estimated entropy: 2.678
>>> print(f"Local values: {estimator.local_vals()}")
Local values: [ 3.15330798  2.02688591  2.52250064  2.95236651  3.58801879  1.42033673
    ...
    2.91254223  1.92823136  3.63647704  2.05589055]
>>> # Use different distance metric
>>> estimator_euclidean = im.estimator(data, measure="h", approach="kl", k=4, minkowski_p=2)
>>> estimator_euclidean.result()
np.float64(2.6772465397252208)

infomeasure.estimators.entropy.miller_madow module#

Module for the discrete Miller-Madow entropy estimator.

class infomeasure.estimators.entropy.miller_madow.MillerMadowEntropyEstimator(*args, **kwargs)[source]#

Bases: DiscreteHEstimator

Discrete Miller-Madow entropy estimator.

\[\hat{H}_{\tiny{MM}} = \hat{H}_{\tiny{MLE}} + \frac{K - 1}{2N}\]

\(\hat{H}_{\tiny{MM}}\) is the Miller-Madow entropy, where \(\hat{H}_{\tiny{MLE}}\) is the maximum likelihood entropy (DiscreteEntropyEstimator). \(K\) is the number of unique values in the data, and \(N\) is the number of observations.
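
The correction is a single additive term on top of the maximum likelihood estimate, as the following sketch shows. The helper names `plugin_entropy` and `miller_madow_entropy` are hypothetical, and results are in nats.

```python
import math
from collections import Counter


def plugin_entropy(data) -> float:
    """Maximum likelihood (plug-in) entropy in nats."""
    n_obs = len(data)
    return -sum(c / n_obs * math.log(c / n_obs) for c in Counter(data).values())


def miller_madow_entropy(data) -> float:
    """Plug-in entropy plus the (K - 1) / (2N) bias correction."""
    k = len(set(data))  # number of unique values K
    return plugin_entropy(data) + (k - 1) / (2 * len(data))


data = [1, 1, 2, 3, 3, 4, 5]
h_mle = plugin_entropy(data)
h_mm = miller_madow_entropy(data)
print(round(h_mle, 3), round(h_mm, 3))  # 1.55 1.836
```

With \(K = 5\) and \(N = 7\) the correction is \(4/14 \approx 0.286\) nats, which is large relative to the estimate because the sample is small.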

Attributes:

*data (array_like): The data used to estimate the entropy.

infomeasure.estimators.entropy.nsb module#

Module for the NSB (Nemenman-Shafee-Bialek) entropy estimator.

class infomeasure.estimators.entropy.nsb.NsbEntropyEstimator(*data, K: int = None, base: int | float | str = 'e')[source]#

Bases: DiscreteHEstimator

NSB (Nemenman-Shafee-Bialek) entropy estimator.

The NSB estimator provides a Bayesian estimate of Shannon entropy for discrete data using the Nemenman, Shafee, Bialek algorithm. This estimator is particularly effective for undersampled data where traditional estimators may be biased.

The NSB estimate is computed as:

\[\hat{H}^{\text{NSB}} = \frac{ \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n}) \langle H^m \rangle_{\beta (\xi)} } { \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n})}\]

where

\[\rho(\xi \mid \textbf{n}) = \mathcal{P}(\beta (\xi)) \frac{ \Gamma(\kappa(\xi))}{\Gamma(N + \kappa(\xi))} \prod_{i=1}^K \frac{\Gamma(n_i + \beta(\xi))}{\Gamma(\beta(\xi))}\]

The algorithm uses numerical integration to compute the Bayesian posterior over possible entropy values, providing a principled approach to entropy estimation that accounts for sampling uncertainty [NSB02].

If there are no coincidences in the data (all observations are unique), NSB returns NaN as the estimator requires repeated observations to function properly.

Parameters:

*data (array_like): The data used to estimate the entropy.
K (int, optional): The support size. If not provided, uses the observed support size.
base (LogBaseType, default=Config.get("base")): The logarithm base for entropy calculation.

Attributes:

*data (array_like): The data used to estimate the entropy.

Notes

The NSB estimator is computationally intensive as it requires numerical integration and optimisation. For large datasets or when computational efficiency is critical, consider using the asymptotic NSB (ANSB) estimator AnsbEntropyEstimator instead.

The estimator uses a mixture of Dirichlet priors chosen so that the induced prior over the entropy is approximately uniform, and applies Bayesian inference to estimate the entropy.

Examples

>>> import infomeasure as im
>>> data = [1, 2, 3, 4, 5, 1, 2]  # Some repeated values
>>> im.entropy(data, approach='nsb')
np.float64(1.4526460202102247)

infomeasure.estimators.entropy.ordinal module#

Module for the Ordinal / Permutation entropy estimator.

class infomeasure.estimators.entropy.ordinal.OrdinalEntropyEstimator(*data, embedding_dim: int, step_size: int = 1, stable: bool = False, base: int | float | str = 'e')[source]#

Bases: EntropyEstimator

Estimator for the Ordinal / Permutation entropy.

The ordinal entropy is a measure of the complexity of a time series. The input data must be comparable (ordinal), because the estimator only uses the relative order of the values. For a given embedding_dim \(n\) (the length of the considered subsequences), each window is mapped to one of the \(n!\) possible permutations, and the entropy is computed from the relative frequencies of these permutations [BP02].

Embedding delay is not supported natively.

Attributes:

*data (array_like): The data used to estimate the entropy.
embedding_dim (int): The size of the permutation patterns.
step_size (int, optional): The step size for the sliding windows (delay). Default is 1.
stable (bool, optional): If True, the relative order of equal elements is preserved when sorting the data. This can be useful for reproducibility and testing, but might be slower.

Raises:

ValueError: If the embedding_dim is negative or not an integer.
ValueError: If the embedding_dim is too large for the given data.
TypeError: If the data are not 1d array-like(s).

Notes

The ordinality will be determined via numpy.argsort().
If embedding_dim is set to 1, the entropy is always 0.
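
The pattern counting can be illustrated with a short sketch. It is not the library's implementation: `ordinal_entropy` is a hypothetical helper that uses consecutive samples only (no step size) and a stable sort of the indices, which plays the role that numpy.argsort() has in the note above.

```python
import math
from collections import Counter


def ordinal_entropy(series, embedding_dim: int) -> float:
    """Permutation entropy in nats over consecutive windows."""
    patterns = Counter()
    for start in range(len(series) - embedding_dim + 1):
        window = series[start:start + embedding_dim]
        # argsort of the window; Python's sort is stable, so ties keep their order
        pattern = tuple(sorted(range(embedding_dim), key=window.__getitem__))
        patterns[pattern] += 1
    total = sum(patterns.values())
    return -sum(c / total * math.log(c / total) for c in patterns.values())


h_alternating = ordinal_entropy([1, 2, 1, 2, 1], embedding_dim=2)
h_monotonic = ordinal_entropy([1, 2, 3, 4, 5], embedding_dim=3)
```

The alternating series visits both patterns of length 2 equally often and gives \(\log 2\), while the monotonic series produces a single pattern and therefore zero entropy.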

infomeasure.estimators.entropy.renyi module#

Module for the Rényi entropy estimator.

class infomeasure.estimators.entropy.renyi.RenyiEntropyEstimator(*data, k: int = 4, alpha: float | int = None, base: int | float | str = 'e')[source]#

Bases: EntropyEstimator

Estimator for the Rényi entropy.

Attributes:

*data (array_like): The data used to estimate the entropy.
k (int): The number of nearest neighbors used in the estimation.
alpha (float | int): The Rényi parameter, order or exponent. Sometimes denoted as \(\alpha\) or \(q\).

Raises:

ValueError: If the Rényi parameter is not a positive number.
ValueError: If the number of nearest neighbors is not a positive integer.

Notes

The Rényi entropy is a generalization of Shannon entropy: small probabilities are emphasized for \(\alpha < 1\), and large probabilities are emphasized for \(\alpha > 1\). In the limit \(\alpha \to 1\), it reduces to Shannon entropy. Rényi entropy can be particularly interesting for systems where additivity (in the Shannon sense) is not always preserved, especially nonlinear complex systems such as those with long-range forces.

infomeasure.estimators.entropy.shrink module#

Module for the shrink (James-Stein) entropy estimator.

class infomeasure.estimators.entropy.shrink.ShrinkEntropyEstimator(*args, **kwargs)[source]#

Bases: DiscreteHEstimator

Shrinkage (James-Stein) entropy estimator.

This estimator applies James-Stein shrinkage to the probability estimates before computing entropy, which can reduce bias in small sample scenarios. The shrinkage probabilities are calculated as:

\[\hat{p}_x^{\text{SHR}} = \lambda t_x + (1 - \lambda) \hat{p}_x^{\text{ML}}\]

where \(\hat{p}_x^{\text{ML}}\) are the maximum likelihood probability estimates, \(t_x = 1/K\) is the uniform target distribution, \(N\) is the total number of observations, and the shrinkage parameter \(\lambda\) is given by:

\[\lambda = \frac{ 1 - \sum_{x=1}^{K} (\hat{p}_x^{\text{ML}})^2}{(N-1) \sum_{x=1}^K (t_x - \hat{p}_x^{\text{ML}})^2}\]

The entropy is then computed using these shrinkage-corrected probabilities.

Based on the implementation in the R package entropy [HS09].
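
The shrinkage step can be sketched as below. Several points are assumptions of this sketch rather than statements about the library: the numerator of \(\lambda\) uses the maximum likelihood probabilities (as in the shrinkage intensity of Hausser and Strimmer [HS09]), \(\lambda\) is clipped to \([0, 1]\), and \(\lambda = 1\) is used when the denominator vanishes. The helper name `shrink_probs` is hypothetical.

```python
import math
from collections import Counter


def shrink_probs(data):
    """Return (lambda, shrinkage probabilities) for a uniform target."""
    counts = Counter(data)
    n_obs, k = len(data), len(counts)
    target = 1.0 / k
    p_ml = {s: c / n_obs for s, c in counts.items()}
    denom = (n_obs - 1) * sum((target - p) ** 2 for p in p_ml.values())
    if denom == 0:
        lam = 1.0  # assumption: ML estimate already equals the target
    else:
        lam = (1.0 - sum(p * p for p in p_ml.values())) / denom
        lam = min(1.0, max(0.0, lam))  # assumption: clip to [0, 1]
    probs = {s: lam * target + (1.0 - lam) * p for s, p in p_ml.items()}
    return lam, probs


lam, probs = shrink_probs([0] * 8 + [1] * 2)
h_shrink = -sum(p * math.log(p) for p in probs.values())
```

For counts (8, 2) the shrinkage intensity is about 0.198, so the probabilities move from (0.8, 0.2) to roughly (0.741, 0.259) before the entropy is computed.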

Attributes:

*data (array_like): The data used to estimate the entropy.

property dist_dict#: Dictionary of shrinkage probabilities for each unique value. Used by JSD.

infomeasure.estimators.entropy.tsallis module#

Module for Tsallis entropy estimator.

class infomeasure.estimators.entropy.tsallis.TsallisEntropyEstimator(*data, k: int = 4, q: float | int = None, base: int | float | str = 'e')[source]#

Bases: EntropyEstimator

Estimator for the Tsallis entropy.

Attributes:

*data (array_like): The data used to estimate the entropy.
k (int): The number of nearest neighbors used in the estimation.
q (float): The Tsallis parameter, order or exponent. Sometimes denoted as \(q\), analogous to the Rényi parameter \(\alpha\).

Raises:

ValueError: If the Tsallis parameter is not a positive number.
ValueError: If the number of nearest neighbors is not a positive integer.

Notes

In the \(q \to 1\) limit, the Jackson sum (q-additivity) reduces to ordinary summation, and the Tsallis entropy reduces to Shannon entropy. This class of entropy measures is particularly useful in the study of long-range correlated systems and non-equilibrium phenomena.

infomeasure.estimators.entropy.zhang module#

Module for the Zhang entropy estimator.

class infomeasure.estimators.entropy.zhang.ZhangEntropyEstimator(*args, **kwargs)[source]#

Bases: DiscreteHEstimator

Zhang entropy estimator for discrete data.

The Zhang estimator computes the Shannon entropy using the recommended definition from [GZZ13]:

\[\hat{H}_Z = \sum_{i=1}^K \hat{p}_i \sum_{v=1}^{N - n_i} \frac{1}{v} \prod_{j=0}^{v-1} \left( 1 + \frac{1 - n_i}{N - 1 - j} \right)\]

where \(\hat{p}_i\) are the empirical probabilities, \(n_i\) are the counts for each unique value, \(K\) is the number of unique values, and \(N\) is the total number of observations.

The actual algorithm implementation follows the fast calculation approach from [LCBFiC17].
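
For comparison with the fast algorithm, the defining formula can also be evaluated naively. The sketch below is a direct, quadratic-time transcription of the double sum and product above, not the fast approach from [LCBFiC17]; `zhang_entropy` is a hypothetical name and the result is in nats.

```python
from collections import Counter


def zhang_entropy(data) -> float:
    """Naive evaluation of the Zhang estimator formula."""
    counts = Counter(data).values()
    n_obs = len(data)
    h = 0.0
    for n_i in counts:
        prod = 1.0   # running product over j = 0, ..., v - 1
        inner = 0.0
        for v in range(1, n_obs - n_i + 1):
            j = v - 1
            prod *= 1.0 + (1.0 - n_i) / (n_obs - 1 - j)
            inner += prod / v
        h += n_i / n_obs * inner
    return h


h_z = zhang_entropy([1, 1, 2])
```

The running product is updated once per value of \(v\), so each symbol costs \(O(N - n_i)\) operations instead of recomputing the product from scratch.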

Attributes:

*data (array_like): The data used to estimate the entropy.

Module contents#

Entropy estimators.

class infomeasure.estimators.entropy.AnsbEntropyEstimator(*data, K: int = None, undersampled: float = 0.1, base: int | float | str = 'e')[source]#

Bases: DiscreteHEstimator

Asymptotic NSB entropy estimator.

The Asymptotic NSB (ANSB) estimator provides entropy estimation for extremely undersampled discrete data where the number of unique values K is comparable to the sample size N.

\[\hat{H}_{\text{ANSB}} = (C_\gamma - \log(2)) + 2 \log(N) - \psi(\Delta)\]

where \(C_\gamma \approx 0.5772156649\dots\) is Euler’s constant, \(\psi\) is the digamma function, and \(\Delta = N - K\) is the number of coincidences (repeated observations) in the data.

This estimator is specifically designed for the extremely undersampled regime where \(K \sim N\); its estimate diverges with \(N\) when the data are well-sampled. The ANSB estimator requires that \(N/K \to 0\), which is checked by default using the undersampled parameter [NBvS04].

If there are no coincidences in the data (\(\Delta = 0\)), ANSB returns NaN as the estimator is undefined in this case.

Parameters:

*data (array_like): The data used to estimate the entropy.
K (int, optional): The support size. If not provided, uses the observed support size.
undersampled (float, default=0.1): Maximum allowed ratio N/K to consider data sufficiently undersampled. A warning is issued if this threshold is exceeded.
base (LogBaseType, default=Config.get("base")): The logarithm base for entropy calculation.

Attributes:

*data (array_like): The data used to estimate the entropy.

Notes

The ANSB estimator is based on the asymptotic expansion of the NSB estimator for the case of extreme undersampling. It provides a computationally efficient alternative to the full NSB estimator when \(K \sim N\).

Examples

>>> import infomeasure as im
>>> data = [1, 2, 3, 4, 5, 1, 2]  # Some repeated values
>>> im.entropy(data, approach='ansb')
np.float64(3.353104447353747)

class infomeasure.estimators.entropy.BayesEntropyEstimator(*data, alpha: float | str, K: int = None, base: int | float | str = 'e')[source]#

Bases: DiscreteHEstimator

Bayesian entropy estimator.

Computes an estimate of Shannon entropy using Bayesian probability estimates with a Dirichlet prior characterized by concentration parameter α. This approach provides a principled way to handle sparse data and incorporate prior knowledge about the probability distribution.

The Bayesian probabilities are calculated as:

\[p_k^{\text{Bayes}} = \frac{n_k + \alpha}{N + K \alpha}\]

where \(n_k\) is the count of symbol \(k\), \(N\) is the total number of observations, \(K\) is the support size (number of unique symbols), and \(\alpha\) is the concentration parameter of the Dirichlet prior.

The entropy is then \(-\sum_k p_k^{\text{Bayes}} \log p_k^{\text{Bayes}}\), the same plug-in form used by the maximum likelihood entropy estimator. Local entropy values are supported as well.

Concentration Parameter Choices

The concentration parameter α controls the strength of the prior belief in uniform distribution. Several well-established choices are available:

Jeffreys Prior (α = 0.5 = "jeffrey"): Non-informative prior that is invariant under reparameterization. Provides good performance for most applications [KT81].
Laplace Prior (α = 1.0 = "laplace"): Uniform prior that adds one pseudocount to each symbol [BP63]. Simple and widely used, equivalent to add-one smoothing.
Schürmann-Grassberger Prior (α = 1/K = "sch-grass"): Adaptive prior that scales with the alphabet size. Particularly effective for large alphabets.
Minimax Prior (α = √N/K = "min-max"): Minimises the maximum expected loss. Balances between sample size and alphabet size.
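
The probability formula and the four named priors can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helpers `resolve_alpha`, `bayes_probs` and `plug_in_entropy` are hypothetical, and \(K\) is taken to be the observed support size.

```python
import math
from collections import Counter


def resolve_alpha(name, n_obs, k):
    """Map the prior names listed above to a concentration parameter (hypothetical helper)."""
    return {
        "jeffrey": 0.5,
        "laplace": 1.0,
        "sch-grass": 1.0 / k,
        "min-max": math.sqrt(n_obs) / k,
    }[name]


def bayes_probs(data, alpha):
    """p_k = (n_k + alpha) / (N + K * alpha), with K the observed support size."""
    counts = Counter(data)
    n_obs, k = len(data), len(counts)
    if isinstance(alpha, str):
        alpha = resolve_alpha(alpha, n_obs, k)
    return {s: (c + alpha) / (n_obs + k * alpha) for s, c in counts.items()}


def plug_in_entropy(probs):
    """Shannon entropy in nats of a probability dictionary."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)


data = [1, 2, 3, 4, 5, 1, 2]
probs = bayes_probs(data, "laplace")
h_laplace = plug_in_entropy(probs)
```

With the Laplace prior, the symbol `1` (observed twice in seven samples over five symbols) gets probability (2 + 1) / (7 + 5) = 0.25, pulled towards the uniform value 0.2 from the empirical 2/7.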

Attributes:

*data (array_like): The data used to estimate the entropy.
alpha (float): The concentration parameter α of the Dirichlet prior.
K (int, optional): The support size. If not provided, uses the observed support size.

property bayes_probs

property dist_dict: Return the Bayesian distribution dictionary for JSD.

class infomeasure.estimators.entropy.BonachelaEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Bonachela (Bonachela-Hinrichsen-Muñoz) entropy estimator for discrete data.

The Bonachela estimator computes the Shannon entropy using the formula from [BHM08]:

\[\hat{H}_{B} = \frac{1}{N+2} \sum_{i=1}^{K} \left( (n_i + 1) \sum_{j=n_i + 2}^{N+2} \frac{1}{j} \right)\]

where \(n_i\) are the counts for each unique value, \(K\) is the number of unique values, and \(N\) is the total number of observations.

This estimator is specially designed to provide a compromise between low bias and small statistical errors for short data series, particularly when the data sets are small and the probabilities are not close to zero.
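
The double sum above translates directly into two nested loops. The sketch below evaluates the formula as written, in nats; `bonachela_entropy` is a hypothetical helper name and the code is not taken from the package.

```python
from collections import Counter


def bonachela_entropy(data):
    """Direct evaluation of the Bonachela formula in nats (illustrative sketch)."""
    n_obs = len(data)
    total = 0.0
    for n_i in Counter(data).values():
        # Inner sum over j = n_i + 2, ..., N + 2
        inner = sum(1.0 / j for j in range(n_i + 2, n_obs + 3))
        total += (n_i + 1) * inner
    return total / (n_obs + 2)


h_bonachela = bonachela_entropy([1, 2, 3, 4, 5, 1, 2])
```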

Attributes:

*data (array_like): The data used to estimate the entropy.

class infomeasure.estimators.entropy.ChaoShenEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Chao-Shen entropy estimator.

\[\hat{H}_{CS} = - \sum_{i=1}^{K} \frac{\hat{p}_i^{CS} \log \hat{p}_i^{CS}}{1 - (1 - \hat{p}_i^{ML} C)^N}\]

where

\[\hat{p}_i^{CS} = C \cdot \hat{p}_i^{ML}\]

and \(C = 1 - \frac{f_1}{N}\) is the estimated coverage, \(f_1\) is the number of singletons (species observed exactly once), \(\hat{p}_i^{ML}\) is the maximum likelihood probability estimate, \(N\) is the sample size, and \(K\) is the number of observed species [CS03]. The Chao-Shen estimator provides a bias-corrected estimate of Shannon entropy that accounts for unobserved species through coverage estimation.
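
A minimal sketch of the coverage correction, assuming natural logarithms. The helper `chao_shen_entropy` is hypothetical, and returning NaN when every observation is a singleton (coverage zero) is an assumption of this sketch; the package may handle that case differently.

```python
import math
from collections import Counter


def chao_shen_entropy(data):
    """Evaluate the Chao-Shen formula in nats (illustrative sketch)."""
    counts = list(Counter(data).values())
    n_obs = len(data)
    f1 = counts.count(1)                 # number of singletons
    coverage = 1.0 - f1 / n_obs          # estimated coverage C
    if coverage == 0.0:
        return math.nan                  # assumption: undefined when all are singletons
    h = 0.0
    for n_i in counts:
        p_cs = coverage * n_i / n_obs    # coverage-adjusted probability
        # Divide by the probability that the symbol is seen at least once in N draws
        h -= p_cs * math.log(p_cs) / (1.0 - (1.0 - p_cs) ** n_obs)
    return h


def ml_entropy(data):
    """Maximum likelihood (plug-in) entropy in nats, for comparison."""
    n_obs = len(data)
    return -sum(c / n_obs * math.log(c / n_obs) for c in Counter(data).values())


sample = [1, 1, 2, 3, 4, 5]
h_cs, h_ml = chao_shen_entropy(sample), ml_entropy(sample)
```

For this sample, four of the six observations are singletons, so the estimated coverage is 1/3 and the corrected estimate lies well above the plug-in value.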

Attributes:

*data (array_like): The data used to estimate the entropy.

class infomeasure.estimators.entropy.ChaoWangJostEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Advanced bias-corrected Shannon entropy estimator using coverage estimation.

The Chao-Wang-Jost estimator improves entropy estimates under incomplete sampling by correcting for unobserved species. It is particularly valuable for ecological data, text analysis, or any discrete distribution where the sample may not capture all possible outcomes.

Standard entropy estimators systematically underestimate entropy in finite samples, especially when sampling is incomplete. This estimator uses the counts of rare species (singletons and doubletons) to estimate sample coverage and to correct for unobserved species, so it remains usable for small or incomplete samples, such as ecological diversity studies with incomplete species sampling or text corpora whose vocabulary is only partly observed. Its theoretical foundation in species accumulation curves and Good-Turing frequency estimation provides a statistical framework for addressing sampling bias.

Mathematical Foundation:

The estimator combines observed entropy with a correction term based on coverage estimation:

\[\hat{H}_{\text{CWJ}} = \sum_{1 \leq n_i \leq N-1} \frac{n_i}{N} \left(\sum_{k=n_i}^{N-1} \frac{1}{k} \right) + \frac{f_1}{N} (1 - A)^{-N + 1} \left\{ - \log(A) - \sum_{r=1}^{N-1} \frac{1}{r} (1 - A)^r \right\}\]

where the coverage parameter \(A\) is estimated as:

\[\begin{split}A = \begin{cases} \frac{2 f_2}{(N-1) f_1 + 2 f_2} \, & \text{if} \, f_2 > 0 \\ \frac{2}{(N-1)(f_1 - 1) + 2} \, & \text{if} \, f_2 = 0, \; f_1 \neq 0 \\ 1, & \text{if} \, f_1 = f_2 = 0 \end{cases}\end{split}\]

Here, \(f_1\) represents the number of singletons (species observed exactly once) and \(f_2\) the number of doubletons (species observed exactly twice) in the sample [CWJ13].
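
The two terms of the formula and the case distinction for \(A\) can be sketched as below. This is a direct evaluation in nats under stated assumptions, not the package's implementation: `cwj_entropy` is a hypothetical name, and skipping the correction term when \(f_1 = 0\) or \(A = 1\) (where the expression would otherwise involve a division by zero) is a choice made for this sketch.

```python
import math
from collections import Counter


def cwj_entropy(data):
    """Evaluate the Chao-Wang-Jost formula in nats (illustrative sketch)."""
    counts = list(Counter(data).values())
    n = len(data)
    f1, f2 = counts.count(1), counts.count(2)

    # First term: sum over observed symbols with 1 <= n_i <= N - 1
    h = 0.0
    for n_i in counts:
        if 1 <= n_i <= n - 1:
            h += n_i / n * sum(1.0 / k for k in range(n_i, n))

    # Coverage parameter A, following the three cases above
    if f2 > 0:
        a = 2 * f2 / ((n - 1) * f1 + 2 * f2)
    elif f1 > 0:
        a = 2 / ((n - 1) * (f1 - 1) + 2)
    else:
        a = 1.0

    # Second term: correction for unobserved species (assumption: zero if A == 1)
    if f1 > 0 and a < 1:
        tail = sum((1 - a) ** r / r for r in range(1, n))
        h += f1 / n * (1 - a) ** (1 - n) * (-math.log(a) - tail)
    return h


h_bits = cwj_entropy([1, 1, 2, 3, 4, 5]) / math.log(2)
h_nats = cwj_entropy([1, 1, 1, 2, 2, 3, 5, 8])
```

Dividing by \(\log 2\) converts the result from nats to bits.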

Attributes:

*data (array_like): The data used to estimate the entropy.

See also

infomeasure.estimators.functional.entropy: Functional interface for entropy calculation
infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimator: Standard maximum likelihood entropy estimator

Notes

The algorithm is adapted from the entropart R library [MH15]
The correction becomes negligible when samples are complete (\(f_1 = f_2 = 0\))

Examples

>>> import infomeasure as im
>>>
>>> # Basic usage with incomplete sampling scenario
>>> data = [1, 1, 2, 3, 4, 5]  # Many singletons suggest incomplete sampling
>>> h_cwj = im.entropy(data, approach="chao_wang_jost", base=2)
>>> h_standard = im.entropy(data, approach="discrete", base=2)
>>> print(f"Chao-Wang-Jost: {h_cwj:.3f} bits")
Chao-Wang-Jost: 3.635 bits
>>> print(f"Standard: {h_standard:.3f} bits")
Standard: 2.252 bits
>>>
>>> # Ecological diversity example
>>> species_obs = [1, 1, 1, 2, 2, 3, 5, 8]  # Observed species labels, one entry per individual
>>> diversity = im.entropy(species_obs, approach="cwj", base="e")
>>> print(f"Species diversity: {diversity:.3f} nats")
Species diversity: 2.054 nats

class infomeasure.estimators.entropy.DiscreteEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Standard Shannon entropy estimator for discrete data using maximum likelihood.

The discrete entropy estimator computes the Shannon entropy using the classical maximum likelihood approach:

\[\hat{H} = -\sum_{i=1}^{K} \hat{p}_i \log \hat{p}_i\]

where \(\hat{p}_i = \frac{n_i}{N}\) are the empirical probabilities, \(n_i\) are the counts for each unique value \(i\), \(K\) is the number of unique values, and \(N\) is the total number of observations.

This is the most fundamental entropy estimator and serves as the baseline for comparison with other bias-corrected estimators. While it provides an asymptotically unbiased estimate of the true entropy, it can exhibit significant bias for small sample sizes, particularly when the number of unique values is large relative to the sample size.

The estimator is suitable for:

Large datasets where bias is minimal
Baseline comparisons with bias-corrected estimators
Applications where computational simplicity is preferred
Well-sampled distributions with sufficient observations per unique value

For small sample sizes or distributions with many rare events, consider using bias-corrected estimators such as ChaoShenEntropyEstimator, BonachelaEntropyEstimator, or ZhangEntropyEstimator.

Attributes:

*data (array_like): The data used to estimate the entropy. For joint entropy, multiple arrays can be provided.
base (float or str, default=Config.get("base")): The logarithm base for entropy calculation. Common values are 2 (bits), 10 (dits), or 'e' (nats).

Examples

>>> import infomeasure as im
>>> # Simple entropy calculation
>>> data = [1, 1, 2, 3, 3, 4, 5]
>>> entropy_value = im.entropy(data, approach="discrete")
>>> print(f"Entropy: {entropy_value:.3f} nats")
Entropy: 1.550 nats
>>> # Local values
>>> estimator = im.estimator(data, measure="h", approach="discrete")
>>> estimator.local_vals()
array([1.25276297, 1.25276297, 1.94591015, 1.25276297, 1.25276297,
   1.94591015, 1.94591015])

property dist_dict: Return the distribution dictionary for JSD.

class infomeasure.estimators.entropy.GrassbergerEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Discrete Grassberger entropy estimator.

\[\hat{H}_{\text{Gr88}} = \sum_i \frac{n_i}{N} \left(\log(N) - \psi(n_i) - \frac{(-1)^{n_i}}{n_i + 1} \right)\]

\(\hat{H}_{\text{Gr88}}\) is the Grassberger entropy, where \(n_i\) are the counts, \(N\) is the total number of observations, and \(\psi\) is the digamma function [Gra08, Gra88].
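
A minimal sketch of the Grassberger sum, with each term weighted by \(n_i / N\) where \(N\) is the total number of observations. The helper names are hypothetical; the digamma function is only needed at positive integers, where \(\psi(n) = -C_\gamma + \sum_{k=1}^{n-1} 1/k\).

```python
import math
from collections import Counter

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def digamma_int(n):
    """Digamma at a positive integer."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def grassberger_entropy(data):
    """Evaluate the Gr88 formula in nats (illustrative sketch)."""
    n_obs = len(data)
    h = 0.0
    for n_i in Counter(data).values():
        correction = (-1) ** n_i / (n_i + 1)
        h += n_i / n_obs * (math.log(n_obs) - digamma_int(n_i) - correction)
    return h


h_gr = grassberger_entropy([1, 1, 2, 3, 3, 4, 5])
```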

Attributes:

*data (array_like): The data used to estimate the entropy.

class infomeasure.estimators.entropy.KernelEntropyEstimator(*data, bandwidth: float | int, kernel: str, workers: int = 1, base: int | float | str = 'e')

Bases: WorkersMixin, EntropyEstimator

Kernel entropy estimator for continuous data using Kernel Density Estimation (KDE).

The kernel entropy estimator computes the differential Shannon entropy by estimating the probability density function using kernel density estimation:

\[\hat{H}(X) = -\int \hat{f}(x) \log \hat{f}(x) \, dx \approx -\frac{1}{N} \sum_{i=1}^{N} \log \hat{f}(x_i)\]

where \(\hat{f}(x)\) is the kernel density estimate:

\[\hat{f}(x) = \frac{1}{N h^d} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right)\]

with \(K(\cdot)\) being the kernel function, \(h\) the bandwidth parameter, \(d\) the dimensionality, and \(N\) the number of data points.

For joint entropy of multiple variables, the estimator concatenates the variables into a single multivariate space and applies the same KDE approach.

The estimator supports both Gaussian and box (uniform) kernels. The choice of bandwidth is critical: small values can lead to under-smoothing and overfitting, while large values may over-smooth the data and obscure important features [GP25, Sil86].

Parameters:

*data (array_like)

The continuous data used to estimate the entropy. For univariate entropy, pass a single array. For joint entropy, pass multiple arrays.

bandwidth (float | int)

The bandwidth parameter for the kernel. Controls the smoothness of the density estimate.

kernel (str)

Type of kernel to use. Supported options are:

'gaussian': Gaussian (normal) kernel
'box': Box (uniform) kernel

Compatible with the KDE implementation kde_probability_density_function().

workers (int, optional)

Number of workers to use for parallel processing. Default is 1 (no parallelization). If set to -1, all available CPU cores will be used.

base (float | str, optional)

Logarithm base for entropy calculation. Default is from global configuration.

Attributes:

*data (array_like): The data used to estimate the entropy.
bandwidth (float | int): The bandwidth for the kernel.
kernel (str): Type of kernel to use.
workers (int): Number of workers to use for parallel processing.

Returns:

array_like: Local entropy values for each data point when calling entropy calculation methods. The mean of these values gives the overall entropy estimate.

See also

infomeasure.estimators.utils.kde.kde_probability_density_function: Underlying KDE implementation
infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimator: For discrete data entropy estimation

Notes

Bandwidth Selection: The bandwidth parameter critically affects the quality of the entropy estimate. A small bandwidth can lead to under-smoothing and high variance, while a large bandwidth may over-smooth the data, obscuring important details and introducing bias.

Kernel Choice:

Gaussian kernels provide smooth density estimates and are theoretically well-founded
Box kernels are computationally efficient and provide non-parametric estimates

Computational Complexity: The algorithm has O(N²) complexity for box kernels using KDTree queries, and varies for Gaussian kernels depending on the implementation.

Cross-entropy: Supported between two distributions by evaluating the density of the second distribution at points from the first distribution.

Examples

>>> import infomeasure as im
>>> from numpy.random import default_rng
>>> rng = default_rng(281769)
>>> # Generate sample data
>>> data = rng.normal(0, 1, 1000)
>>>
>>> # Create estimator
>>> estimator = im.estimator(data, measure="h", approach="kernel", bandwidth=0.5, kernel='gaussian')
>>>
>>> # Calculate entropy
>>> estimator.result()
np.float64(1.366015332652949)
>>> # Local values
>>> estimator.local_vals()
array([1.54017083, 1.35855839, 0.97949819, 0.97333173, 2.62084886,
   ...
   1.08174049, 0.97418054, 1.88055967, 0.99614516, 0.98548583])
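
The resubstitution estimate \(-\frac{1}{N} \sum_i \log \hat{f}(x_i)\) can be illustrated in one dimension with a Gaussian kernel. The sketch below is a brute-force O(N²) illustration using only the standard library; `gaussian_kde_entropy` is a hypothetical helper, not the package's KDE routine, and its numbers are not expected to match the doctest output above.

```python
import math
import random


def gaussian_kde_entropy(xs, bandwidth):
    """Return (entropy, local values) in nats for 1D data with a Gaussian kernel."""
    n = len(xs)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    local = []
    for x in xs:
        # Kernel density estimate at x, including the point's own contribution
        density = norm * sum(
            math.exp(-0.5 * ((x - x_j) / bandwidth) ** 2) for x_j in xs
        )
        local.append(-math.log(density))
    return sum(local) / n, local


rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(500)]
h_kde, local_vals = gaussian_kde_entropy(samples, bandwidth=0.5)
```

The overall estimate is the mean of the local values, so points in low-density regions contribute the largest local entropies. For a standard normal distribution the true differential entropy is \(\frac{1}{2}\log(2\pi e) \approx 1.419\) nats, which gives a reference point when experimenting with the bandwidth.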

class infomeasure.estimators.entropy.KozachenkoLeonenkoEntropyEstimator(*data, k: int = 4, ksg_id: int = 1, noise_level=1e-10, minkowski_p=inf, base: int | float | str = 'e')

Bases: RandomGeneratorMixin, EntropyEstimator

Kozachenko-Leonenko entropy estimator for continuous data.

The Kozachenko-Leonenko estimator computes the Shannon entropy of continuous data using nearest neighbor distances. The estimator is based on the method from [KL87] and follows the implementation approach described in [KSG11].

\[\hat{H}_{KL} = -\psi(k) + \psi(N) + \log(c_d) + \frac{d}{N} \sum_{i=1}^{N} \log(2\rho_{k,i})\]

where \(\psi\) is the digamma function, \(k\) is the number of nearest neighbors, \(N\) is the number of data points, \(d\) is the dimensionality, \(\rho_{k,i}\) is the distance from point \(i\) to its \(k\)-th nearest neighbor, and \(c_d\) is the volume of the \(d\)-dimensional ball of unit diameter in the chosen Minkowski norm (so \(c_d = 1\) for the maximum norm), which matches the factor \(2\rho_{k,i}\) inside the logarithm.

This estimator is particularly suitable for continuous multivariate data and provides asymptotically unbiased estimates of differential entropy. The method works by exploiting the relationship between nearest neighbor distances and local density, making it effective for high-dimensional data where traditional histogram-based methods fail.

Parameters:

*data (array_like): The continuous data used to estimate the entropy. For multivariate data, each variable should be a column.
k (int, default=4): The number of nearest neighbors to consider. Higher values provide more stable estimates but may introduce bias. The default value of 4 is recommended by [KSG11].
noise_level (float, default=1e-10): The standard deviation of Gaussian noise added to the data to avoid issues with zero distances between identical points. Set to 0 to disable noise addition.
minkowski_p (float, default=inf): The power parameter for the Minkowski metric used in distance calculations. Common values are 2 (Euclidean distance) and inf (maximum norm/Chebyshev distance). Must satisfy \(1 \leq p \leq \infty\).
ksg_id (int, default=1): The KSG estimator variant to use (1 or 2). Type I uses the standard formula. Type II uses a modified formula with \(\psi(k) - 1/k\).
base (LogBaseType, default=Config.get("base")): The logarithm base for entropy calculation. Can be 2, 10, "e", or any positive number.

Attributes:

*data (tuple[array_like]): The processed data used to estimate the entropy, converted to 2D arrays.
k (int): The number of nearest neighbors to consider.
noise_level (float): The standard deviation of the Gaussian noise added to the data.
minkowski_p (float): The power parameter for the Minkowski metric.
ksg_id (int): The KSG estimator variant to use.

Raises:

ValueError: If the number of nearest neighbors is not a positive integer.
ValueError: If the noise level is negative.
ValueError: If the Minkowski power parameter is invalid (not in range [1, ∞]).

Notes

The choice of the number of nearest neighbors \(k\) affects the bias-variance tradeoff of the estimator. Smaller values of \(k\) reduce bias but increase variance, while larger values have the opposite effect. The default value of \(k=4\) provides a good balance for most applications.

The noise addition helps handle datasets with repeated values or points that are exactly identical, which would otherwise result in zero distances and numerical issues. The noise level should be small enough not to significantly alter the underlying distribution.

For high-dimensional data, the curse of dimensionality may affect the estimator’s performance, as nearest neighbor distances become less informative. In such cases, dimensionality reduction or alternative entropy estimation methods may be preferable.

Examples

>>> import numpy as np
>>> import infomeasure as im
>>>
>>> # Generate 2D Gaussian data
>>> np.random.seed(176250)
>>> data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 1000)
>>>
>>> # Estimate entropy
>>> estimator = im.estimator(data, measure="h", approach="kl", k=4)
>>> entropy_value = estimator.result()
>>> print(f"Estimated entropy: {entropy_value:.3f}")
Estimated entropy: 2.678
>>> print(f"Local values: {estimator.local_vals()}")
Local values: [ 3.15330798  2.02688591  2.52250064  2.95236651  3.58801879  1.42033673
    ...
    2.91254223  1.92823136  3.63647704  2.05589055]
>>> # Use a different distance metric
>>> from infomeasure.estimators.entropy import KozachenkoLeonenkoEntropyEstimator
>>> estimator_euclidean = KozachenkoLeonenkoEntropyEstimator(data, k=4, minkowski_p=2)
>>> estimator_euclidean.result()
np.float64(2.6772465397252208)
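
To make the role of the nearest neighbor distances concrete, here is a brute-force one-dimensional sketch of the Kozachenko-Leonenko formula. It assumes \(d = 1\), where the interval of radius \(\rho\) has length \(2\rho\) and the volume constant contributes \(\log(1) = 0\); it adds no noise, so it assumes all samples are distinct. The helper names are hypothetical and this is not the package's KD-tree based implementation.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -C_gamma + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def kl_entropy_1d(xs, k=4):
    """Kozachenko-Leonenko estimate in nats for distinct 1D samples (sketch)."""
    n = len(xs)
    log_sum = 0.0
    for i, x in enumerate(xs):
        # Distances from x to every other point, sorted ascending
        dists = sorted(abs(x - y) for j, y in enumerate(xs) if j != i)
        rho = dists[k - 1]               # distance to the k-th nearest neighbor
        log_sum += math.log(2.0 * rho)   # log-length of the enclosing interval
    return -digamma_int(k) + digamma_int(n) + log_sum / n


rng = random.Random(1)
samples = [rng.gauss(0.0, 1.0) for _ in range(500)]
h_kl = kl_entropy_1d(samples, k=4)
```

For standard normal samples the estimate should land near \(\frac{1}{2}\log(2\pi e) \approx 1.419\) nats, with fluctuations that shrink as the sample grows.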

class infomeasure.estimators.entropy.MillerMadowEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Discrete Miller-Madow entropy estimator.

\[\hat{H}_{\text{MM}} = \hat{H}_{\text{MLE}} + \frac{K - 1}{2N}\]

\(\hat{H}_{\text{MM}}\) is the Miller-Madow entropy, where \(\hat{H}_{\text{MLE}}\) is the maximum likelihood entropy (DiscreteEntropyEstimator), \(K\) is the number of unique values in the data, and \(N\) is the number of observations.
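
The correction is a single additive term on top of the plug-in estimate. A minimal sketch in nats, with hypothetical helper names:

```python
import math
from collections import Counter


def ml_entropy(data):
    """Maximum likelihood (plug-in) entropy in nats."""
    n_obs = len(data)
    return -sum(c / n_obs * math.log(c / n_obs) for c in Counter(data).values())


def miller_madow_entropy(data):
    """Plug-in entropy plus the (K - 1) / (2N) bias correction."""
    k_obs = len(set(data))
    return ml_entropy(data) + (k_obs - 1) / (2 * len(data))


data = [1, 1, 2, 3, 3, 4, 5]
h_mle = ml_entropy(data)
h_mm = miller_madow_entropy(data)
```

Here K = 5 and N = 7, so the correction adds 4/14 ≈ 0.286 nats to the plug-in value of about 1.550 nats.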

Attributes:

*data (array_like): The data used to estimate the entropy.

class infomeasure.estimators.entropy.NsbEntropyEstimator(*data, K: int = None, base: int | float | str = 'e')

Bases: DiscreteHEstimator

NSB (Nemenman-Shafee-Bialek) entropy estimator.

The NSB estimator provides a Bayesian estimate of Shannon entropy for discrete data using the Nemenman, Shafee, Bialek algorithm. This estimator is particularly effective for undersampled data where traditional estimators may be biased.

The NSB estimate is computed as:

\[\hat{H}^{\text{NSB}} = \frac{ \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n}) \langle H^m \rangle_{\beta (\xi)} } { \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n})}\]

where

\[\rho(\xi \mid \textbf{n}) = \mathcal{P}(\beta (\xi)) \frac{ \Gamma(\kappa(\xi))}{\Gamma(N + \kappa(\xi))} \prod_{i=1}^K \frac{\Gamma(n_i + \beta(\xi))}{\Gamma(\beta(\xi))}\]

The algorithm uses numerical integration to compute the Bayesian posterior over possible entropy values, providing a principled approach to entropy estimation that accounts for sampling uncertainty [NSB02].

If there are no coincidences in the data (all observations are unique), NSB returns NaN as the estimator requires repeated observations to function properly.

Parameters:

*data (array_like): The data used to estimate the entropy.
K (int, optional): The support size. If not provided, uses the observed support size.
base (LogBaseType, default=Config.get("base")): The logarithm base for entropy calculation.

Attributes:

*data (array_like): The data used to estimate the entropy.

Notes

The NSB estimator is computationally intensive as it requires numerical integration and optimisation. For large datasets or when computational efficiency is critical, consider using the asymptotic NSB (ANSB) estimator AnsbEntropyEstimator instead.

The estimator uses a mixture of Dirichlet priors chosen so that the induced prior over entropy values is approximately uniform, and uses Bayesian inference to estimate the entropy.

Examples

>>> import infomeasure as im
>>> data = [1, 2, 3, 4, 5, 1, 2]  # Some repeated values
>>> im.entropy(data, approach='nsb')
np.float64(1.4526460202102247)

class infomeasure.estimators.entropy.OrdinalEntropyEstimator(*data, embedding_dim: int, step_size: int = 1, stable: bool = False, base: int | float | str = 'e')

Bases: EntropyEstimator

Estimator for the Ordinal / Permutation entropy.

The Ordinal entropy is a measure of the complexity of a time series. The input values must be mutually comparable (ordinal), because only their relative order within each window is used. For a given embedding_dim \(n\) (the length of the considered subsequences), all \(n!\) possible permutations are considered and their relative frequencies are calculated [BP02].

Embedding delay is not supported natively.

Attributes:

*data (array_like): The data used to estimate the entropy.
embedding_dim (int): The size of the permutation patterns.
step_size (int, optional): The step size for the sliding windows (delay). Default is 1.
stable (bool, optional): If True, the relative order of equal elements is preserved when sorting the data. This can be useful for reproducibility and testing, but might be slower.

Raises:

ValueError: If the embedding_dim is negative or not an integer.
ValueError: If the embedding_dim is too large for the given data.
TypeError: If the data are not 1d array-like(s).

Notes

The ordinality will be determined via numpy.argsort().
If embedding_dim is set to 1, the entropy is always 0.
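
The pattern counting can be sketched with the standard library alone. This illustration slides a window of length embedding_dim over consecutive samples (no delay handling), turns each window into its argsort pattern, and computes the Shannon entropy of the pattern frequencies. `ordinal_entropy` is a hypothetical helper, not the package's implementation; Python's `sorted` is stable, so ties keep their original order.

```python
import math
from collections import Counter


def ordinal_entropy(series, embedding_dim):
    """Permutation entropy in nats over consecutive windows (illustrative sketch)."""
    patterns = Counter()
    for start in range(len(series) - embedding_dim + 1):
        window = series[start:start + embedding_dim]
        # Argsort of the window: indices that would sort it (stable for ties)
        pattern = tuple(sorted(range(embedding_dim), key=window.__getitem__))
        patterns[pattern] += 1
    total = sum(patterns.values())
    return sum(-(c / total) * math.log(c / total) for c in patterns.values())


h_monotone = ordinal_entropy([1, 2, 3, 4, 5], embedding_dim=2)
h_zigzag = ordinal_entropy([1, 3, 2, 4, 3, 5], embedding_dim=2)
h_dim_one = ordinal_entropy([4, 1, 3, 2], embedding_dim=1)
```

A monotone series produces a single pattern and therefore zero entropy, as does embedding_dim = 1. The zigzag series has three rising and two falling pairs, so its entropy is that of a (0.6, 0.4) distribution.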

class infomeasure.estimators.entropy.RenyiEntropyEstimator(*data, k: int = 4, alpha: float | int = None, base: int | float | str = 'e')

Bases: EntropyEstimator

Estimator for the Rényi entropy.

Attributes:

*data (array_like): The data used to estimate the entropy.
k (int): The number of nearest neighbors used in the estimation.
alpha (float | int): The Rényi parameter, order or exponent. Sometimes denoted as \(\alpha\) or \(q\).

Raises:

ValueError: If the Rényi parameter is not a positive number.
ValueError: If the number of nearest neighbors is not a positive integer.

Notes

The Rényi entropy is a generalization of Shannon entropy in which small probabilities are emphasized for \(\alpha < 1\) and large probabilities are emphasized for \(\alpha > 1\). For \(\alpha = 1\), it reduces to Shannon entropy. The Rényi entropy can be particularly interesting for systems where additivity (in the Shannon sense) is not always preserved, especially nonlinear complex systems such as those with long-range forces.
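
The effect of \(\alpha\) is easiest to see on a known discrete distribution, using the textbook definition \(H_\alpha = \frac{1}{1-\alpha} \log \sum_i p_i^\alpha\). Note that this sketch only illustrates the definition; the estimator class works on continuous data through nearest neighbor distances, which is not reproduced here. The function name `renyi_entropy` is hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """Rényi entropy in nats of a discrete distribution (definition only)."""
    if alpha == 1:
        return shannon_entropy(probs)  # limiting case
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


probs = [0.5, 0.25, 0.125, 0.125]
h_half = renyi_entropy(probs, 0.5)   # rare outcomes weigh more
h_one = renyi_entropy(probs, 1)      # Shannon entropy
h_two = renyi_entropy(probs, 2)      # frequent outcomes weigh more
```

For a non-uniform distribution the values decrease as \(\alpha\) grows, while for a uniform distribution every order gives \(\log K\).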

class infomeasure.estimators.entropy.ShrinkEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Shrinkage (James-Stein) entropy estimator.

This estimator applies James-Stein shrinkage to the probability estimates before computing entropy, which can reduce bias in small sample scenarios. The shrinkage probabilities are calculated as:

\[\hat{p}_x^{\text{SHR}} = \lambda t_x + (1 - \lambda) \hat{p}_x^{\text{ML}}\]

where \(\hat{p}_x^{\text{ML}}\) are the maximum likelihood probability estimates, \(t_x = 1/K\) is the uniform target distribution, \(N\) is the number of observations, and the shrinkage parameter \(\lambda\) is given by:

\[\lambda = \frac{ 1 - \sum_{x=1}^{K} (\hat{p}_x^{\text{ML}})^2}{(N-1) \sum_{x=1}^K (t_x - \hat{p}_x^{\text{ML}})^2}\]

The entropy is then computed using these shrinkage-corrected probabilities.

Based on the implementation in the R package entropy [HS09].
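
A minimal sketch of the shrinkage step, computing \(\lambda\) from the maximum likelihood probabilities (these appear in both the numerator and the denominator of the shrinkage intensity) and with \(N\) the number of observations. Clipping \(\lambda\) to the interval [0, 1] and setting \(\lambda = 1\) when the denominator vanishes are assumptions of this sketch; the helper names are hypothetical.

```python
import math
from collections import Counter


def shrink_probs(data):
    """Return (lambda, shrinkage probabilities) for discrete data (sketch)."""
    counts = Counter(data)
    n_obs, k = len(data), len(counts)
    target = 1.0 / k                                   # uniform target t_x
    p_ml = {s: c / n_obs for s, c in counts.items()}
    denom = (n_obs - 1) * sum((target - p) ** 2 for p in p_ml.values())
    if denom == 0:
        lam = 1.0                                      # assumption: ML equals target
    else:
        lam = (1.0 - sum(p * p for p in p_ml.values())) / denom
        lam = min(1.0, max(0.0, lam))                  # assumption: clip to [0, 1]
    return lam, {s: lam * target + (1.0 - lam) * p for s, p in p_ml.items()}


def entropy_from_probs(probs):
    """Shannon entropy in nats of a probability dictionary."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)


data = ["a"] * 50 + ["b"] * 30 + ["c"] * 15 + ["d"] * 5
lam, p_shr = shrink_probs(data)
h_shr = entropy_from_probs(p_shr)
```

With 100 observations the shrinkage intensity is small (about 0.056), so the probabilities move only slightly towards the uniform target. For very small samples \(\lambda\) approaches 1 and the estimate approaches \(\log K\).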

Attributes:

*data (array_like): The data used to estimate the entropy.

property dist_dict: Dictionary of shrinkage probabilities for each unique value. Used by JSD.

class infomeasure.estimators.entropy.TsallisEntropyEstimator(*data, k: int = 4, q: float | int = None, base: int | float | str = 'e')

Bases: EntropyEstimator

Estimator for the Tsallis entropy.

Attributes:

*data (array_like): The data used to estimate the entropy.
k (int): The number of nearest neighbors used in the estimation.
q (float): The Tsallis parameter, order or exponent. Sometimes denoted as \(q\), analogous to the Rényi parameter \(\alpha\).

Raises:

ValueError: If the Tsallis parameter is not a positive number.
ValueError: If the number of nearest neighbors is not a positive integer.

Notes

In the \(q \to 1\) limit, the Jackson sum (q-additivity) reduces to ordinary summation, and the Tsallis entropy reduces to Shannon entropy. This class of entropy measures is particularly useful in the study of long-range correlated systems and non-equilibrium phenomena.
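
The \(q \to 1\) limit can be checked numerically on a discrete distribution with the textbook definition \(S_q = \frac{1}{q-1}\left(1 - \sum_i p_i^q\right)\). This sketch illustrates the definition only; the estimator class itself targets continuous data via nearest neighbors. The function name `tsallis_entropy` is hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def tsallis_entropy(probs, q):
    """Tsallis entropy of a discrete distribution (definition only)."""
    if q == 1:
        return shannon_entropy(probs)  # limiting case
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


probs = [0.5, 0.25, 0.125, 0.125]
s_two = tsallis_entropy(probs, 2)            # 1 - sum of squared probabilities
s_near_one = tsallis_entropy(probs, 1.0001)  # approaches the Shannon value
s_one = tsallis_entropy(probs, 1)
```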

class infomeasure.estimators.entropy.ZhangEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Zhang entropy estimator for discrete data.

The Zhang estimator computes the Shannon entropy using the recommended definition from [GZZ13]:

\[\hat{H}_Z = \sum_{i=1}^K \hat{p}_i \sum_{v=1}^{N - n_i} \frac{1}{v} \prod_{j=0}^{v-1} \left( 1 + \frac{1 - n_i}{N - 1 - j} \right)\]

where \(\hat{p}_i\) are the empirical probabilities, \(n_i\) are the counts for each unique value, \(K\) is the number of unique values, and \(N\) is the total number of observations.

The actual algorithm implementation follows the fast calculation approach from [LCBFiC17].
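
For reference, the defining double sum can be evaluated directly by carrying the running product along the inner loop. This is a plain O(N·K) to O(N²) sketch of the definition in nats, not the fast algorithm the implementation follows; `zhang_entropy` is a hypothetical helper name.

```python
from collections import Counter


def zhang_entropy(data):
    """Direct evaluation of the Zhang formula in nats (illustrative sketch)."""
    n_obs = len(data)
    h = 0.0
    for n_i in Counter(data).values():
        product = 1.0
        inner = 0.0
        for v in range(1, n_obs - n_i + 1):
            # Multiply in the factor for j = v - 1, where N - 1 - j = N - v
            product *= 1.0 + (1.0 - n_i) / (n_obs - v)
            inner += product / v
        h += n_i / n_obs * inner
    return h


h_zhang = zhang_entropy([1, 1, 2])
```

For the data `[1, 1, 2]`, the symbol seen twice contributes (2/3)·(1/2) and the singleton contributes (1/3)·(1 + 1/2), giving 5/6 nats in total.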

Attributes:

*data (array_like): The data used to estimate the entropy.
