infomeasure.estimators.entropy package#
Submodules#
infomeasure.estimators.entropy.ansb module#
Module for the Asymptotic NSB entropy estimator.
- class infomeasure.estimators.entropy.ansb.AnsbEntropyEstimator(*data, K: int = None, undersampled: float = 0.1, base: int | float | str = 'e')[source]#
Bases: DiscreteHEstimator
Asymptotic NSB entropy estimator.
The Asymptotic NSB (ANSB) estimator provides entropy estimation for extremely undersampled discrete data where the number of unique values K is comparable to the sample size N.
\[\hat{H}_{\text{ANSB}} = (C_\gamma - \log(2)) + 2 \log(N) - \psi(\Delta)\]where \(C_\gamma \approx 0.5772156649\dots\) is Euler’s constant, \(\psi\) is the digamma function, and \(\Delta = N - K\) is the number of coincidences (repeated observations) in the data.
This estimator is specifically designed for the extremely undersampled regime where \(K \sim N\) and diverges with \(N\) when the data is well-sampled. The ANSB estimator requires that \(N/K \to 0\), which is checked by default using the undersampled parameter [NBvS04].
If there are no coincidences in the data (\(\Delta = 0\)), ANSB returns NaN, as the estimator is undefined in this case.
- Parameters:
- *data : array_like
The data used to estimate the entropy.
- K : int, optional
The support size. If not provided, uses the observed support size.
- undersampled : float, default=0.1
Maximum allowed ratio N/K to consider data sufficiently undersampled. A warning is issued if this threshold is exceeded.
- base : LogBaseType, default=Config.get(“base”)
The logarithm base for entropy calculation.
- Attributes:
- *data : array_like
The data used to estimate the entropy.
Notes
The ANSB estimator is based on the asymptotic expansion of the NSB estimator for the case of extreme undersampling. It provides a computationally efficient alternative to the full NSB estimator when \(K \sim N\).
Examples
>>> import infomeasure as im
>>> data = [1, 2, 3, 4, 5, 1, 2]  # Some repeated values
>>> im.entropy(data, approach='ansb')
np.float64(3.353104447353747)
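The closed form given for the ANSB estimator can be checked by hand. The following is a minimal sketch, not the library's implementation: the helper names `digamma_int` and `ansb_entropy` are hypothetical, and the digamma function is only evaluated at positive integers through the identity \(\psi(n) = -C_\gamma + \sum_{k=1}^{n-1} 1/k\).

```python
import math
from collections import Counter

EULER_GAMMA = 0.5772156649015329  # Euler's constant C_gamma


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def ansb_entropy(data) -> float:
    """ANSB estimate in nats (hypothetical helper); NaN without coincidences."""
    n_obs = len(data)            # N, the sample size
    k_obs = len(Counter(data))   # K, the observed support size
    delta = n_obs - k_obs        # Delta, the number of coincidences
    if delta == 0:
        return math.nan
    return (EULER_GAMMA - math.log(2)) + 2 * math.log(n_obs) - digamma_int(delta)


# N = 7, K = 5, Delta = 2 for the data used in the doctest
print(ansb_entropy([1, 2, 3, 4, 5, 1, 2]))
```

For this data the sketch gives about 3.3531 nats, in agreement with the doctest value.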
infomeasure.estimators.entropy.bayes module#
Module for the Bayesian entropy estimator.
- class infomeasure.estimators.entropy.bayes.BayesEntropyEstimator(*data, alpha: float | str, K: int = None, base: int | float | str = 'e')[source]#
Bases: DiscreteHEstimator
Bayesian entropy estimator.
Computes an estimate of Shannon entropy using Bayesian probability estimates with a Dirichlet prior characterized by concentration parameter α. This approach provides a principled way to handle sparse data and incorporate prior knowledge about the probability distribution.
The Bayesian probabilities are calculated as:
\[p_k^{\text{Bayes}} = \frac{n_k + \alpha}{N + K \alpha}\]where \(n_k\) is the count of symbol \(k\), \(N\) is the total number of observations, \(K\) is the support size (number of unique symbols), and \(\alpha\) is the concentration parameter of the Dirichlet prior.
The entropy is then \(-\sum_k p_k^{\text{Bayes}} \log p_k^{\text{Bayes}}\), computed in the same way as for the maximum likelihood entropy estimator. Local entropy values are supported as well.
Concentration Parameter Choices
The concentration parameter α controls the strength of the prior belief in uniform distribution. Several well-established choices are available:
- Jeffreys Prior (α = 0.5, "jeffrey"): Non-informative prior that is invariant under reparameterization. Provides good performance for most applications [KT81].
- Laplace Prior (α = 1.0, "laplace"): Uniform prior that adds one pseudocount to each symbol [BP63]. Simple and widely used, equivalent to add-one smoothing.
- Schürmann-Grassberger Prior (α = 1/K, "sch-grass"): Adaptive prior that scales with the alphabet size. Particularly effective for large alphabets.
- Minimax Prior (α = √N/K, "min-max"): Minimises the maximum expected loss. Balances between sample size and alphabet size.
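The effect of the concentration parameter can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: `bayes_entropy` and `NAMED_ALPHAS` are hypothetical names, and the treatment of unobserved symbols (each receiving probability \(\alpha / (N + K\alpha)\) when `K` exceeds the observed support) is an assumption made so that the probabilities sum to one.

```python
import math
from collections import Counter

# Named priors from the list above, as functions of (N, K)
NAMED_ALPHAS = {
    "jeffrey": lambda n, k: 0.5,
    "laplace": lambda n, k: 1.0,
    "sch-grass": lambda n, k: 1.0 / k,
    "min-max": lambda n, k: math.sqrt(n) / k,
}


def bayes_entropy(data, alpha, K=None) -> float:
    """Entropy (nats) of the Dirichlet-smoothed probabilities (hypothetical helper)."""
    counts = Counter(data)
    n_obs = sum(counts.values())
    k_support = len(counts) if K is None else K
    if k_support < len(counts):
        raise ValueError("K must be at least the observed support size")
    if isinstance(alpha, str):
        alpha = NAMED_ALPHAS[alpha](n_obs, k_support)
    denom = n_obs + k_support * alpha
    probs = [(c + alpha) / denom for c in counts.values()]
    # Assumption: symbols in the support that were never observed have n_k = 0
    probs += [alpha / denom] * (k_support - len(counts))
    return -sum(p * math.log(p) for p in probs if p > 0)


data = [0, 0, 1]
for name in NAMED_ALPHAS:
    print(name, bayes_entropy(data, name))
```

With the Laplace prior and `data = [0, 0, 1]`, the smoothed probabilities are 3/5 and 2/5, so the loop prints the entropy of that two-point distribution for `"laplace"`.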
- Attributes:
- *data : array_like
The data used to estimate the entropy.
- alpha : float
The concentration parameter α of the Dirichlet prior.
- K : int, optional
The support size. If not provided, uses the observed support size.
- property bayes_probs#
- property dist_dict#
Return the Bayesian distribution dictionary for JSD.
infomeasure.estimators.entropy.bonachela module#
Module for the Bonachela entropy estimator.
- class infomeasure.estimators.entropy.bonachela.BonachelaEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Bonachela (Bonachela-Hinrichsen-Muñoz) entropy estimator for discrete data.
The Bonachela estimator computes the Shannon entropy using the formula from [BHM08]:
\[\hat{H}_{B} = \frac{1}{N+2} \sum_{i=1}^{K} \left( (n_i + 1) \sum_{j=n_i + 2}^{N+2} \frac{1}{j} \right)\]where \(n_i\) are the counts for each unique value, \(K\) is the number of unique values, and \(N\) is the total number of observations.
This estimator is specially designed to provide a compromise between low bias and small statistical errors for short data series, particularly when the data sets are small and the probabilities are not close to zero.
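The double sum in the Bonachela formula translates directly into code. The sketch below is a literal transcription for illustration (the function name `bonachela_entropy` is hypothetical, and no attempt is made to match the library's internals or performance).

```python
from collections import Counter


def bonachela_entropy(data) -> float:
    """Bonachela estimate in nats (hypothetical helper)."""
    counts = list(Counter(data).values())
    n_obs = sum(counts)  # N
    total = 0.0
    for n_i in counts:
        # inner sum over j = n_i + 2, ..., N + 2
        inner = sum(1.0 / j for j in range(n_i + 2, n_obs + 3))
        total += (n_i + 1) * inner
    return total / (n_obs + 2)


print(bonachela_entropy([1, 1, 2]))
```

For `[1, 1, 2]` the counts are 2 and 1, and the sum evaluates to \(\frac{1}{5}\left(3 \cdot \frac{9}{20} + 2 \cdot \frac{47}{60}\right) = \frac{7}{12}\).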
- Attributes:
- *data : array_like
The data used to estimate the entropy.
infomeasure.estimators.entropy.chao_shen module#
Module for the Chao-Shen entropy estimator.
- class infomeasure.estimators.entropy.chao_shen.ChaoShenEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Chao-Shen entropy estimator.
\[\hat{H}_{CS} = - \sum_{i=1}^{K} \frac{\hat{p}_i^{CS} \log \hat{p}_i^{CS}}{1 - (1 - \hat{p}_i^{ML} C)^N}\]where
\[\hat{p}_i^{CS} = C \cdot \hat{p}_i^{ML}\]and \(C = 1 - \frac{f_1}{N}\) is the estimated coverage, \(f_1\) is the number of singletons (species observed exactly once), \(\hat{p}_i^{ML}\) is the maximum likelihood probability estimate, \(N\) is the sample size, and \(K\) is the number of observed species [CS03]. The Chao-Shen estimator provides a bias-corrected estimate of Shannon entropy that accounts for unobserved species through coverage estimation.
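The coverage correction can be sketched as follows. The function name `chao_shen_entropy` is hypothetical, and the handling of the all-singletons case (replacing \(f_1 = N\) by \(N - 1\) so that the coverage does not vanish) is a common workaround assumed here, not something the text above specifies.

```python
import math
from collections import Counter


def chao_shen_entropy(data) -> float:
    """Chao-Shen estimate in nats (hypothetical helper)."""
    counts = list(Counter(data).values())
    n_obs = sum(counts)                        # N
    f1 = sum(1 for c in counts if c == 1)      # number of singletons
    if f1 == n_obs:
        f1 = n_obs - 1                         # assumption: avoid C = 0
    coverage = 1.0 - f1 / n_obs                # C
    h = 0.0
    for c in counts:
        p_cs = coverage * c / n_obs            # C * p_ML
        # Horvitz-Thompson style weight: probability that the symbol was seen
        h -= p_cs * math.log(p_cs) / (1.0 - (1.0 - p_cs) ** n_obs)
    return h


print(chao_shen_entropy([1, 1, 2, 2]))
```

Without singletons the coverage is 1, so for `[1, 1, 2, 2]` only the denominator \(1 - (1 - 0.5)^4\) differs from the plug-in estimate, giving \(\log(2) / 0.9375\).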
- Attributes:
- *data : array_like
The data used to estimate the entropy.
infomeasure.estimators.entropy.chao_wang_jost module#
Module for the Chao-Wang-Jost entropy estimator.
- class infomeasure.estimators.entropy.chao_wang_jost.ChaoWangJostEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Advanced bias-corrected Shannon entropy estimator using coverage estimation.
The Chao-Wang-Jost estimator improves entropy estimates under incomplete sampling by accounting for unobserved species. It is particularly valuable for ecological data, text analysis, or any discrete distribution where the sample may not capture all possible outcomes, such as diversity studies with incomplete species sampling or texts whose vocabulary is only partially observed.
Standard entropy estimators systematically underestimate entropy in finite samples, especially when the sampling is incomplete. This estimator uses the counts of rare species (singletons and doubletons) to estimate sample coverage and correct for unobserved species, which gives reliable estimates even with small or incomplete samples. Its theoretical foundation in species accumulation curves and Good-Turing frequency estimation provides a robust statistical framework for addressing sampling bias.
Mathematical Foundation:
The estimator combines observed entropy with a correction term based on coverage estimation:
\[\hat{H}_{\text{CWJ}} = \sum_{1 \leq n_i \leq N-1} \frac{n_i}{N} \left(\sum_{k=n_i}^{N-1} \frac{1}{k} \right) + \frac{f_1}{N} (1 - A)^{-N + 1} \left\{ - \log(A) - \sum_{r=1}^{N-1} \frac{1}{r} (1 - A)^r \right\}\]where the coverage parameter \(A\) is estimated as:
\[\begin{split}A = \begin{cases} \frac{2 f_2}{(N-1) f_1 + 2 f_2} \, & \text{if} \, f_2 > 0 \\ \frac{2}{(N-1)(f_1 - 1) + 2} \, & \text{if} \, f_2 = 0, \; f_1 \neq 0 \\ 1, & \text{if} \, f_1 = f_2 = 0 \end{cases}\end{split}\]Here, \(f_1\) represents the number of singletons (species observed exactly once) and \(f_2\) the number of doubletons (species observed exactly twice) in the sample [CWJ13].
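The two formulas above can be transcribed as a short function. This is a sketch for illustration only: the name `chao_wang_jost_entropy` is hypothetical, and returning just the first term when \(A = 1\) is an assumption that avoids dividing by zero in \((1 - A)^{-N+1}\).

```python
import math
from collections import Counter


def chao_wang_jost_entropy(data) -> float:
    """Chao-Wang-Jost estimate in nats (hypothetical helper)."""
    counts = list(Counter(data).values())
    n_obs = sum(counts)                        # N
    f1 = sum(1 for c in counts if c == 1)      # singletons
    f2 = sum(1 for c in counts if c == 2)      # doubletons

    # Coverage parameter A, following the three cases of the definition
    if f2 > 0:
        a = 2 * f2 / ((n_obs - 1) * f1 + 2 * f2)
    elif f1 > 0:
        a = 2 / ((n_obs - 1) * (f1 - 1) + 2)
    else:
        a = 1.0

    # First term: contribution of the observed counts
    observed = sum(
        c / n_obs * sum(1.0 / k for k in range(c, n_obs))
        for c in counts
        if 1 <= c <= n_obs - 1
    )
    if a == 1.0:
        return observed  # assumption: no correction when A = 1

    # Second term: correction for unobserved species
    tail = -math.log(a) - sum((1 - a) ** r / r for r in range(1, n_obs))
    return observed + f1 / n_obs * (1 - a) ** (1 - n_obs) * tail


h_nats = chao_wang_jost_entropy([1, 1, 2, 3, 4, 5])
print(f"{h_nats / math.log(2):.3f} bits")
```

For `[1, 1, 2, 3, 4, 5]` there are four singletons and one doubleton, so \(A = 2/22\); converting the result to bits gives the 3.635 bits reported in the doctest for this class.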
- Attributes:
- *data : array_like
The data used to estimate the entropy.
See also
infomeasure.estimators.functional.entropy : Functional interface for entropy calculation
infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimator : Standard maximum likelihood entropy estimator
Notes
The algorithm is adapted from the entropart R library [MH15].
The correction becomes negligible when samples are complete (\(f_1 = f_2 = 0\)).
Examples
>>> import infomeasure as im
>>>
>>> # Basic usage with incomplete sampling scenario
>>> data = [1, 1, 2, 3, 4, 5]  # Many singletons suggest incomplete sampling
>>> h_cwj = im.entropy(data, approach="chao_wang_jost", base=2)
>>> h_standard = im.entropy(data, approach="discrete", base=2)
>>> print(f"Chao-Wang-Jost: {h_cwj:.3f} bits")
Chao-Wang-Jost: 3.635 bits
>>> print(f"Standard: {h_standard:.3f} bits")
Standard: 2.252 bits
>>>
>>> # Ecological diversity example
>>> species_counts = [1, 1, 1, 2, 2, 3, 5, 8]  # Species abundance data
>>> diversity = im.entropy(species_counts, approach="cwj", base="e")
>>> print(f"Species diversity: {diversity:.3f} nats")
Species diversity: 2.054 nats
infomeasure.estimators.entropy.discrete module#
Module for the discrete entropy estimator.
- class infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Standard Shannon entropy estimator for discrete data using maximum likelihood.
The discrete entropy estimator computes the Shannon entropy using the classical maximum likelihood approach:
\[\hat{H} = -\sum_{i=1}^{K} \hat{p}_i \log \hat{p}_i\]where \(\hat{p}_i = \frac{n_i}{N}\) are the empirical probabilities, \(n_i\) are the counts for each unique value \(i\), \(K\) is the number of unique values, and \(N\) is the total number of observations.
This is the most fundamental entropy estimator and serves as the baseline for comparison with other bias-corrected estimators. While it provides an asymptotically unbiased estimate of the true entropy, it can exhibit significant bias for small sample sizes, particularly when the number of unique values is large relative to the sample size.
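The plug-in formula and its local values can be written out in a few lines. This is a minimal sketch with a hypothetical function name (`discrete_entropy`); it relies on the fact that the mean of the local values \(-\log \hat{p}(x_i)\) over all observations equals \(-\sum_i \hat{p}_i \log \hat{p}_i\).

```python
import math
from collections import Counter


def discrete_entropy(data):
    """Return (entropy in nats, local values) for the plug-in estimator."""
    counts = Counter(data)
    n_obs = len(data)
    # Local value of each observation: -log of its empirical probability
    local = [-math.log(counts[x] / n_obs) for x in data]
    return sum(local) / n_obs, local


h, local = discrete_entropy([1, 1, 2, 3, 3, 4, 5])
print(f"Entropy: {h:.3f} nats")
print([round(v, 4) for v in local])
```

For `[1, 1, 2, 3, 3, 4, 5]` the values 1 and 3 occur twice (local value \(\log(7/2) \approx 1.2528\)) and the others once (\(\log 7 \approx 1.9459\)), which gives an entropy of about 1.550 nats.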
The estimator is suitable for:
- Large datasets where bias is minimal
- Baseline comparisons with bias-corrected estimators
- Applications where computational simplicity is preferred
- Well-sampled distributions with sufficient observations per unique value
For small sample sizes or distributions with many rare events, consider using bias-corrected estimators such as ChaoShenEntropyEstimator, BonachelaEntropyEstimator, or ZhangEntropyEstimator.
- Attributes:
- *data : array_like
The data used to estimate the entropy. For joint entropy, multiple arrays can be provided.
- base : float or str, default=Config.get(“base”)
The logarithm base for entropy calculation. Common values are 2 (bits), 10 (dits), or ‘e’ (nats).
Examples
>>> import infomeasure as im
>>> # Simple entropy calculation
>>> data = [1, 1, 2, 3, 3, 4, 5]
>>> entropy_value = im.entropy(data, approach="discrete")
>>> print(f"Entropy: {entropy_value:.3f} nats")
Entropy: 1.550 nats
>>> # Local values
>>> estimator = im.estimator(data, measure="h", approach="discrete")
>>> estimator.local_vals()
array([1.25276297, 1.25276297, 1.94591015, 1.25276297, 1.25276297,
       1.94591015, 1.94591015])
- property dist_dict#
Return the distribution dictionary for JSD.
infomeasure.estimators.entropy.grassberger module#
Module for the discrete Grassberger entropy estimator.
- class infomeasure.estimators.entropy.grassberger.GrassbergerEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Discrete Grassberger entropy estimator.
\[\hat{H}_{\text{Gr88}} = \sum_i \frac{n_i}{N} \left(\log(N) - \psi(n_i) - \frac{(-1)^{n_i}}{n_i + 1} \right)\]\(\hat{H}_{\text{Gr88}}\) is the Grassberger entropy, where \(n_i\) are the counts, \(N\) is the total number of observations, and \(\psi\) is the digamma function [Gra08, Gra88].
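A direct transcription of the Grassberger formula, with each term weighted by \(n_i / N\) where \(N\) is the total number of observations, might look as follows. The names `digamma_int` and `grassberger_entropy` are hypothetical, and the digamma function is evaluated only at positive integers via \(\psi(n) = -C_\gamma + \sum_{k=1}^{n-1} 1/k\).

```python
import math
from collections import Counter

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def digamma_int(n: int) -> float:
    """Digamma at a positive integer."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def grassberger_entropy(data) -> float:
    """Grassberger (1988) estimate in nats (hypothetical helper)."""
    counts = list(Counter(data).values())
    n_obs = sum(counts)  # N
    return sum(
        c / n_obs * (math.log(n_obs) - digamma_int(c) - (-1) ** c / (c + 1))
        for c in counts
    )


print(grassberger_entropy([1, 1, 2]))
```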
- Attributes:
- *data : array_like
The data used to estimate the entropy.
infomeasure.estimators.entropy.kernel module#
Module for the kernel entropy estimator.
- class infomeasure.estimators.entropy.kernel.KernelEntropyEstimator(*data, bandwidth: float | int, kernel: str, workers: int = 1, base: int | float | str = 'e')[source]#
Bases: WorkersMixin, EntropyEstimator
Kernel entropy estimator for continuous data using Kernel Density Estimation (KDE).
The kernel entropy estimator computes the differential Shannon entropy by estimating the probability density function using kernel density estimation:
\[\hat{H}(X) = -\int \hat{f}(x) \log \hat{f}(x) \, dx \approx -\frac{1}{N} \sum_{i=1}^{N} \log \hat{f}(x_i)\]where \(\hat{f}(x)\) is the kernel density estimate:
\[\hat{f}(x) = \frac{1}{N h^d} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right)\]with \(K(\cdot)\) being the kernel function, \(h\) the bandwidth parameter, \(d\) the dimensionality, and \(N\) the number of data points.
For joint entropy of multiple variables, the estimator concatenates the variables into a single multivariate space and applies the same KDE approach.
The estimator supports both Gaussian and box (uniform) kernels. The choice of bandwidth is critical: small values can lead to under-smoothing and overfitting, while large values may over-smooth the data and obscure important features [GP25, Sil86].
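The resubstitution estimate \(-\frac{1}{N} \sum_i \log \hat{f}(x_i)\) can be illustrated in one dimension with a Gaussian kernel. This is a pure-Python sketch with a hypothetical function name (`gaussian_kde_entropy`); it uses a plain \(O(N^2)\) double loop and makes no claim to reproduce the library's normalisation or numerical results.

```python
import math
import random


def gaussian_kde_entropy(data, bandwidth: float) -> float:
    """Differential entropy (nats) of 1D data via a Gaussian KDE."""
    n_obs = len(data)
    # Normalisation 1 / (N h^d) with d = 1, times the Gaussian kernel constant
    norm = 1.0 / (n_obs * bandwidth * math.sqrt(2.0 * math.pi))
    local = []
    for x in data:
        density = norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )
        local.append(-math.log(density))  # local entropy value at x
    return sum(local) / n_obs


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
for h in (0.1, 0.5, 2.0):
    print(h, gaussian_kde_entropy(sample, bandwidth=h))
```

For standard normal data the analytic entropy is \(\tfrac{1}{2}\log(2\pi e) \approx 1.419\) nats. A moderate bandwidth should land in that vicinity, while a very large bandwidth over-smooths the density and inflates the estimate.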
- Parameters:
- *data : array_like
The continuous data used to estimate the entropy. For univariate entropy, pass a single array. For joint entropy, pass multiple arrays.
- bandwidth : float | int
The bandwidth parameter for the kernel. Controls the smoothness of the density estimate.
- kernel : str
Type of kernel to use. Supported options are:
'gaussian': Gaussian (normal) kernel
'box': Box (uniform) kernel
Compatible with the KDE implementation kde_probability_density_function().
- workers : int, optional
Number of workers to use for parallel processing. Default is 1 (no parallelization). If set to -1, all available CPU cores will be used.
- base : float | str, optional
Logarithm base for entropy calculation. Default is from global configuration.
- Attributes:
- *data : array_like
The data used to estimate the entropy.
- bandwidth : float | int
The bandwidth for the kernel.
- kernel : str
Type of kernel to use.
- workers : int
Number of workers to use for parallel processing.
- Returns:
- array_like
Local entropy values for each data point when calling entropy calculation methods. The mean of these values gives the overall entropy estimate.
See also
infomeasure.estimators.utils.kde.kde_probability_density_function : Underlying KDE implementation
infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimator : For discrete data entropy estimation
Notes
Bandwidth Selection: The bandwidth parameter critically affects the quality of the entropy estimate. A small bandwidth can lead to under-smoothing and high variance, while a large bandwidth may over-smooth the data, obscuring important details and introducing bias.
Kernel Choice:
- Gaussian kernels provide smooth density estimates and are theoretically well-founded
- Box kernels are computationally efficient and provide non-parametric estimates
Computational Complexity: The algorithm has O(N²) complexity for box kernels using KDTree queries, and varies for Gaussian kernels depending on the implementation.
Cross-entropy: Supported between two distributions by evaluating the density of the second distribution at points from the first distribution.
Examples
>>> import infomeasure as im
>>> from numpy.random import default_rng
>>> rng = default_rng(281769)
>>> # Generate sample data
>>> data = rng.normal(0, 1, 1000)
>>>
>>> # Create estimator
>>> estimator = im.estimator(data, measure="h", approach="kernel", bandwidth=0.5, kernel='gaussian')
>>>
>>> # Calculate entropy
>>> estimator.result()
np.float64(1.366015332652949)
>>> # Local values
>>> estimator.local_vals()
array([1.54017083, 1.35855839, 0.97949819, 0.97333173, 2.62084886,
       ...
       1.08174049, 0.97418054, 1.88055967, 0.99614516, 0.98548583])
infomeasure.estimators.entropy.kozachenko_leonenko module#
Module for the Kozachenko-Leonenko entropy estimator.
- class infomeasure.estimators.entropy.kozachenko_leonenko.KozachenkoLeonenkoEntropyEstimator(*data, k: int = 4, ksg_id: int = 1, noise_level=1e-10, minkowski_p=inf, base: int | float | str = 'e')[source]#
Bases: RandomGeneratorMixin, EntropyEstimator
Kozachenko-Leonenko entropy estimator for continuous data.
The Kozachenko-Leonenko estimator computes the Shannon entropy of continuous data using nearest neighbor distances. The estimator is based on the method from [KL87] and follows the implementation approach described in [KSG11].
\[\hat{H}_{KL} = -\psi(k) + \psi(N) + \log(c_d) + \frac{d}{N} \sum_{i=1}^{N} \log(2\rho_{k,i})\]where \(\psi\) is the digamma function, \(k\) is the number of nearest neighbors, \(N\) is the number of data points, \(d\) is the dimensionality, \(c_d\) is the volume of the \(d\)-dimensional unit ball for the chosen Minkowski norm, and \(\rho_{k,i}\) is the distance to the \(k\)-th nearest neighbor of point \(i\).
This estimator is particularly suitable for continuous multivariate data and provides asymptotically unbiased estimates of differential entropy. The method works by exploiting the relationship between nearest neighbor distances and local density, making it effective for high-dimensional data where traditional histogram-based methods fail.
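For one-dimensional data the formula can be sketched with a brute-force neighbour search. The following is an illustration under stated assumptions, not the library's implementation: the names `digamma_int` and `kl_entropy_1d` are hypothetical, \(d = 1\), and \(\log(c_d)\) is taken to be 0, which assumes the convention in which an interval of half-width \(\rho\) has volume \(2\rho\). No noise is added, so the sketch assumes the data contain no exact duplicates.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def kl_entropy_1d(data, k: int = 4) -> float:
    """Kozachenko-Leonenko estimate (nats) for 1D data without ties."""
    n_obs = len(data)
    if not 1 <= k < n_obs:
        raise ValueError("k must satisfy 1 <= k < N")
    log_sum = 0.0
    for i, x in enumerate(data):
        # distances from x to every other point, smallest first
        dists = sorted(abs(x - y) for j, y in enumerate(data) if j != i)
        rho = dists[k - 1]  # distance to the k-th nearest neighbour
        log_sum += math.log(2.0 * rho)
    # d = 1 and log(c_d) = 0 under the assumed convention
    return -digamma_int(k) + digamma_int(n_obs) + log_sum / n_obs


rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
print(kl_entropy_1d(sample, k=4))
```

For standard normal samples the result should be close to the analytic value \(\tfrac{1}{2}\log(2\pi e) \approx 1.419\) nats, with fluctuations that shrink as the sample grows.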
- Parameters:
- *data : array_like
The continuous data used to estimate the entropy. For multivariate data, each variable should be a column.
- k : int, default=4
The number of nearest neighbors to consider. Higher values provide more stable estimates but may introduce bias. The default value of 4 is recommended by [KSG11].
- noise_level : float, default=1e-10
The standard deviation of Gaussian noise added to the data to avoid issues with zero distances between identical points. Set to 0 to disable noise addition.
- minkowski_p : float, default=inf
The power parameter for the Minkowski metric used in distance calculations. Common values are 2 (Euclidean distance) and inf (maximum norm/Chebyshev distance). Must satisfy \(1 \leq p \leq \infty\).
- ksg_id : int, default=1
The KSG estimator variant to use (1 or 2). Type I uses the standard formula. Type II uses a modified formula with \(\psi(k) - 1/k\).
- base : LogBaseType, default=Config.get(“base”)
The logarithm base for entropy calculation. Can be 2, 10, “e”, or any positive number.
- Attributes:
- *data : tuple[array_like]
The processed data used to estimate the entropy, converted to 2D arrays.
- k : int
The number of nearest neighbors to consider.
- noise_level : float
The standard deviation of the Gaussian noise added to the data.
- minkowski_p : float
The power parameter for the Minkowski metric.
- ksg_id : int
The KSG estimator variant to use.
- Raises:
ValueError : If the number of nearest neighbors is not a positive integer.
ValueError : If the noise level is negative.
ValueError : If the Minkowski power parameter is invalid (not in range [1, ∞]).
Notes
The choice of the number of nearest neighbors \(k\) affects the bias-variance tradeoff of the estimator. Smaller values of \(k\) reduce bias but increase variance, while larger values have the opposite effect. The default value of \(k=4\) provides a good balance for most applications.
The noise addition helps handle datasets with repeated values or points that are exactly identical, which would otherwise result in zero distances and numerical issues. The noise level should be small enough not to significantly alter the underlying distribution.
For high-dimensional data, the curse of dimensionality may affect the estimator’s performance, as nearest neighbor distances become less informative. In such cases, dimensionality reduction or alternative entropy estimation methods may be preferable.
Examples
>>> import numpy as np
>>> import infomeasure as im
>>> from infomeasure.estimators.entropy.kozachenko_leonenko import KozachenkoLeonenkoEntropyEstimator
>>>
>>> # Generate 2D Gaussian data
>>> np.random.seed(176250)
>>> data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 1000)
>>>
>>> # Estimate entropy
>>> estimator = im.estimator(data, measure="h", approach="kl", k=4)
>>> entropy_value = estimator.result()
>>> print(f"Estimated entropy: {entropy_value:.3f}")
Estimated entropy: 2.678
>>> print(f"Local values: {estimator.local_vals()}")
Local values: [ 3.15330798 2.02688591 2.52250064 2.95236651 3.58801879 1.42033673
 ...
 2.91254223 1.92823136 3.63647704 2.05589055]
>>> # Use different distance metric
>>> estimator_euclidean = KozachenkoLeonenkoEntropyEstimator(data, k=4, minkowski_p=2)
>>> estimator_euclidean.result()
np.float64(2.6772465397252208)
infomeasure.estimators.entropy.miller_madow module#
Module for the discrete Miller-Madow entropy estimator.
- class infomeasure.estimators.entropy.miller_madow.MillerMadowEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Discrete Miller-Madow entropy estimator.
\[\hat{H}_{\tiny{MM}} = \hat{H}_{\tiny{MLE}} + \frac{K - 1}{2N}\]\(\hat{H}_{\tiny{MM}}\) is the Miller-Madow entropy, where \(\hat{H}_{\tiny{MLE}}\) is the maximum likelihood entropy (DiscreteEntropyEstimator). \(K\) is the number of unique values in the data, and \(N\) is the number of observations.
- Attributes:
- *data : array_like
The data used to estimate the entropy.
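The Miller-Madow correction is a single additive term on top of the maximum likelihood estimate, as this short sketch shows (the function name `miller_madow_entropy` is hypothetical, and the result is in nats).

```python
import math
from collections import Counter


def miller_madow_entropy(data) -> float:
    """Maximum likelihood entropy plus the (K - 1) / (2N) bias correction."""
    counts = list(Counter(data).values())
    n_obs = sum(counts)     # N
    k_obs = len(counts)     # K
    h_mle = -sum(c / n_obs * math.log(c / n_obs) for c in counts)
    return h_mle + (k_obs - 1) / (2 * n_obs)


print(miller_madow_entropy([1, 1, 2, 3, 3, 4, 5]))
```

For `[1, 1, 2, 3, 3, 4, 5]` the correction is \((5 - 1) / 14 \approx 0.286\) nats on top of a maximum likelihood estimate of about 1.550 nats.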
infomeasure.estimators.entropy.nsb module#
Module for the NSB (Nemenman-Shafee-Bialek) entropy estimator.
- class infomeasure.estimators.entropy.nsb.NsbEntropyEstimator(*data, K: int = None, base: int | float | str = 'e')[source]#
Bases: DiscreteHEstimator
NSB (Nemenman-Shafee-Bialek) entropy estimator.
The NSB estimator provides a Bayesian estimate of Shannon entropy for discrete data using the Nemenman, Shafee, Bialek algorithm. This estimator is particularly effective for undersampled data where traditional estimators may be biased.
The NSB estimate is computed as:
\[\hat{H}^{\text{NSB}} = \frac{ \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n}) \langle H^m \rangle_{\beta (\xi)} } { \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n})}\]where
\[\rho(\xi \mid \textbf{n}) = \mathcal{P}(\beta (\xi)) \frac{ \Gamma(\kappa(\xi))}{\Gamma(N + \kappa(\xi))} \prod_{i=1}^K \frac{\Gamma(n_i + \beta(\xi))}{\Gamma(\beta(\xi))}\]The algorithm uses numerical integration to compute the Bayesian posterior over possible entropy values, providing a principled approach to entropy estimation that accounts for sampling uncertainty [NSB02].
If there are no coincidences in the data (all observations are unique), NSB returns NaN as the estimator requires repeated observations to function properly.
- Parameters:
- *data : array_like
The data used to estimate the entropy.
- K : int, optional
The support size. If not provided, uses the observed support size.
- base : LogBaseType, default=Config.get(“base”)
The logarithm base for entropy calculation.
- Attributes:
- *data : array_like
The data used to estimate the entropy.
Notes
The NSB estimator is computationally intensive as it requires numerical integration and optimisation. For large datasets or when computational efficiency is critical, consider using the asymptotic NSB (ANSB) estimator AnsbEntropyEstimator instead.
The estimator assumes a uniform prior over the space of possible probability distributions and uses Bayesian inference to estimate the entropy.
Examples
>>> import infomeasure as im
>>> data = [1, 2, 3, 4, 5, 1, 2]  # Some repeated values
>>> im.entropy(data, approach='nsb')
np.float64(1.4526460202102247)
infomeasure.estimators.entropy.ordinal module#
Module for the Ordinal / Permutation entropy estimator.
- class infomeasure.estimators.entropy.ordinal.OrdinalEntropyEstimator(*data, embedding_dim: int, step_size: int = 1, stable: bool = False, base: int | float | str = 'e')[source]#
Bases: EntropyEstimator
Estimator for the Ordinal / Permutation entropy.
The Ordinal entropy is a measure of the complexity of a time series. The input data needs to be comparable, i.e., the data should be ordinal, as the relative frequencies of order patterns are calculated. For a given embedding_dim (length of considered subsequences), all \(n!\) possible permutations, where \(n\) is the embedding dimension, are considered and their relative frequencies are calculated [BP02].
Embedding delay is not supported natively.
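The counting of order patterns can be sketched as follows. This is a pure-Python illustration with a hypothetical function name (`ordinal_entropy`); it slides a window of length `embedding_dim` over the series one step at a time and ranks each window with a stable argsort, which is an assumption about tie handling rather than a statement about the library's defaults.

```python
import math
from collections import Counter


def ordinal_entropy(series, embedding_dim: int) -> float:
    """Permutation entropy (nats) from order patterns of length embedding_dim."""
    if not 1 <= embedding_dim <= len(series):
        raise ValueError("embedding_dim must be between 1 and len(series)")
    patterns = Counter()
    for start in range(len(series) - embedding_dim + 1):
        window = series[start:start + embedding_dim]
        # argsort of the window: indices ordered by value (stable for ties)
        pattern = tuple(sorted(range(embedding_dim), key=window.__getitem__))
        patterns[pattern] += 1
    total = sum(patterns.values())
    h = -sum(c / total * math.log(c / total) for c in patterns.values())
    return h + 0.0  # turns -0.0 into 0.0 when only one pattern occurs


series = [4, 7, 9, 10, 6, 11, 3]
print(ordinal_entropy(series, embedding_dim=2))  # 4 rising and 2 falling pairs
print(ordinal_entropy(series, embedding_dim=1))  # a single pattern, entropy 0
```

With `embedding_dim=2` the series contains four rising and two falling pairs, so the entropy is that of the distribution (2/3, 1/3). With `embedding_dim=1` only one pattern exists and the entropy is 0.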
- Attributes:
- *data : array_like
The data used to estimate the entropy.
- embedding_dim : int
The size of the permutation patterns.
- step_size : int, optional
The step size for the sliding windows (delay). Default is 1.
- stable : bool, optional
If True, when sorting the data, the relative order of equal elements is preserved. This can be useful for reproducibility and testing, but might be slower.
- Raises:
ValueError : If the embedding_dim is negative or not an integer.
ValueError : If the embedding_dim is too large for the given data.
TypeError : If the data are not 1d array-like(s).
Notes
The ordinality will be determined via numpy.argsort().
If embedding_dim is set to 1, the entropy is always 0.
infomeasure.estimators.entropy.renyi module#
Module for the Rényi entropy estimator.
- class infomeasure.estimators.entropy.renyi.RenyiEntropyEstimator(*data, k: int = 4, alpha: float | int = None, base: int | float | str = 'e')[source]#
Bases: EntropyEstimator
Estimator for the Rényi entropy.
- Attributes:
- *data : array_like
The data used to estimate the entropy.
- k : int
The number of nearest neighbors used in the estimation.
- alpha : float | int
The Rényi parameter, order or exponent. Sometimes denoted as \(\alpha\) or \(q\).
- Raises:
ValueError : If the Rényi parameter is not a positive number.
ValueError : If the number of nearest neighbors is not a positive integer.
Notes
The Rényi entropy is a generalization of Shannon entropy: small probabilities are emphasized for \(\alpha < 1\), and higher probabilities are emphasized for \(\alpha > 1\). In the limit \(\alpha \to 1\), it reduces to Shannon entropy. This class of entropies can be particularly interesting for systems where additivity (in the Shannon sense) is not always preserved, especially nonlinear complex systems such as those with long-range forces.
infomeasure.estimators.entropy.shrink module#
Module for the shrink (James-Stein) entropy estimator.
- class infomeasure.estimators.entropy.shrink.ShrinkEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Shrinkage (James-Stein) entropy estimator.
This estimator applies James-Stein shrinkage to the probability estimates before computing entropy, which can reduce bias in small sample scenarios. The shrinkage probabilities are calculated as:
\[\hat{p}_x^{\text{SHR}} = \lambda t_x + (1 - \lambda) \hat{p}_x^{\text{ML}}\]where \(\hat{p}_x^{\text{ML}}\) are the maximum likelihood probability estimates, \(t_x = 1/K\) is the uniform target distribution, and the shrinkage parameter \(\lambda\) is given by:
\[\lambda = \frac{ 1 - \sum_{x=1}^{K} (\hat{p}_x^{\text{ML}})^2}{(N-1) \sum_{x=1}^K (t_x - \hat{p}_x^{\text{ML}})^2}\]where \(N\) is the number of observations. The entropy is then computed using these shrinkage-corrected probabilities.
Based on the implementation in the R package entropy [HS09].
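The shrinkage step can be sketched as below, with the shrinkage intensity evaluated on the maximum likelihood probabilities as in Hausser and Strimmer's formulation. The function name `shrink_entropy` is hypothetical, and two details are assumptions of this sketch: \(\lambda\) is clipped to \([0, 1]\), and it is set to 1 when the maximum likelihood probabilities already equal the uniform target (zero denominator).

```python
import math
from collections import Counter


def shrink_entropy(data):
    """Return (entropy in nats, lambda) for James-Stein shrunk probabilities."""
    counts = list(Counter(data).values())
    n_obs = sum(counts)                     # number of observations
    k_obs = len(counts)                     # K
    p_ml = [c / n_obs for c in counts]
    target = 1.0 / k_obs                    # uniform target t_x
    spread = sum((target - p) ** 2 for p in p_ml)
    if n_obs <= 1 or spread == 0.0:
        lam = 1.0                           # assumption: full shrinkage
    else:
        lam = (1.0 - sum(p * p for p in p_ml)) / ((n_obs - 1) * spread)
        lam = min(1.0, max(0.0, lam))       # assumption: clip to [0, 1]
    p_shr = [lam * target + (1.0 - lam) * p for p in p_ml]
    return -sum(p * math.log(p) for p in p_shr if p > 0), lam


print(shrink_entropy([1, 1, 1, 2]))         # small sample, strong shrinkage
print(shrink_entropy([1] * 9 + [2]))        # larger sample, weak shrinkage
```

For `[1, 1, 1, 2]` numerator and denominator both equal 0.375, so \(\lambda = 1\) and the estimate is the entropy of the uniform target, \(\log 2\). For nine observations of one value and one of another, \(\lambda = 0.18 / 2.88 = 0.0625\).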
- Attributes:
- *data : array_like
The data used to estimate the entropy.
- property dist_dict#
Dictionary of shrinkage probabilities for each unique value. Used by JSD.
infomeasure.estimators.entropy.tsallis module#
Module for Tsallis entropy estimator.
- class infomeasure.estimators.entropy.tsallis.TsallisEntropyEstimator(*data, k: int = 4, q: float | int = None, base: int | float | str = 'e')[source]#
Bases: EntropyEstimator
Estimator for the Tsallis entropy.
- Attributes:
- *data : array_like
The data used to estimate the entropy.
- k : int
The number of nearest neighbors used in the estimation.
- q : float
The Tsallis parameter, order or exponent. Sometimes denoted as \(q\), analogous to the Rényi parameter \(\alpha\).
- Raises:
ValueError : If the Tsallis parameter is not a positive number.
ValueError : If the number of nearest neighbors is not a positive integer.
Notes
In the \(q \to 1\) limit, the Jackson sum (q-additivity) reduces to ordinary summation, and the Tsallis entropy reduces to Shannon entropy. This class of entropy measures is particularly useful in the study of long-range correlated systems and non-equilibrium phenomena.
infomeasure.estimators.entropy.zhang module#
Module for the Zhang entropy estimator.
- class infomeasure.estimators.entropy.zhang.ZhangEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Zhang entropy estimator for discrete data.
The Zhang estimator computes the Shannon entropy using the recommended definition from [GZZ13]:
\[\hat{H}_Z = \sum_{i=1}^K \hat{p}_i \sum_{v=1}^{N - n_i} \frac{1}{v} \prod_{j=0}^{v-1} \left( 1 + \frac{1 - n_i}{N - 1 - j} \right)\]where \(\hat{p}_i\) are the empirical probabilities, \(n_i\) are the counts for each unique value, \(K\) is the number of unique values, and \(N\) is the total number of observations.
The actual algorithm implementation follows the fast calculation approach from [LCBFiC17].
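The definition can also be evaluated directly, which is slower than the fast approach but easy to check by hand. The sketch below uses a hypothetical function name (`zhang_entropy`) and builds the product over \(j\) incrementally, so each additional \(v\) costs one multiplication; it does not reproduce the fast algorithm of [LCBFiC17].

```python
from collections import Counter


def zhang_entropy(data) -> float:
    """Zhang estimate in nats, evaluated directly from the definition."""
    counts = list(Counter(data).values())
    n_obs = sum(counts)  # N
    h = 0.0
    for n_i in counts:
        inner = 0.0
        prod = 1.0
        for v in range(1, n_obs - n_i + 1):
            # extend the running product by the factor for j = v - 1,
            # whose denominator is N - 1 - j = N - v >= n_i >= 1
            prod *= 1.0 + (1.0 - n_i) / (n_obs - v)
            inner += prod / v
        h += n_i / n_obs * inner
    return h


print(zhang_entropy([1, 1, 2]))
```

For `[1, 1, 2]` the value with count 2 contributes \(\tfrac{2}{3} \cdot \tfrac{1}{2}\) and the singleton contributes \(\tfrac{1}{3} \cdot \tfrac{3}{2}\), for a total of \(5/6\).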
- Attributes:
- *data : array_like
The data used to estimate the entropy.
Module contents#
Entropy estimators.
- class infomeasure.estimators.entropy.AnsbEntropyEstimator(*data, K: int = None, undersampled: float = 0.1, base: int | float | str = 'e')[source]#
- class infomeasure.estimators.entropy.BayesEntropyEstimator(*data, alpha: float | str, K: int = None, base: int | float | str = 'e')[source]#
Bases:
DiscreteHEstimatorBayesian entropy estimator.
Computes an estimate of Shannon entropy using Bayesian probability estimates with a Dirichlet prior characterized by concentration parameter α. This approach provides a principled way to handle sparse data and incorporate prior knowledge about the probability distribution.
The Bayesian probabilities are calculated as:
\[p_k^{\text{Bayes}} = \frac{n_k + \alpha}{N + K \alpha}\]where \(n_k\) is the count of symbol \(k\), \(N\) is the total number of observations, \(K\) is the support size (number of unique symbols), and \(\alpha\) is the concentration parameter of the Dirichlet prior.
The entropy is then \(-\sum p_k^{\text{Bayes}} \log p_k^{\text{Bayes}}\), same as the maximum likelihood entropy estimator, also supporting local entropy values.
Concentration Parameter Choices
The concentration parameter α controls the strength of the prior belief in uniform distribution. Several well-established choices are available:
- Jeffreys Prior (α = 0.5, "jeffrey"): Non-informative prior that is invariant under reparameterization. Provides good performance for most applications [KT81].
- Laplace Prior (α = 1.0, "laplace"): Uniform prior that adds one pseudocount to each symbol [BP63]. Simple and widely used, equivalent to add-one smoothing.
- Schürmann-Grassberger Prior (α = 1/K, "sch-grass"): Adaptive prior that scales with the alphabet size. Particularly effective for large alphabets.
- Minimax Prior (α = √N/K, "min-max"): Minimises the maximum expected loss. Balances between sample size and alphabet size.
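The Bayesian probabilities and the named priors can be illustrated with a short sketch. The function names `bayes_probs` and `bayes_entropy` are hypothetical, and the sum runs over observed symbols only, which is an assumption that is exact when K equals the observed support size.

```python
import math
from collections import Counter


def bayes_probs(data, alpha, K=None):
    """Return {symbol: (n_k + alpha) / (N + K * alpha)} for the observed symbols."""
    counts = Counter(data)
    n_obs = len(data)
    k = K if K is not None else len(counts)  # default: observed support size
    named = {
        "jeffrey": 0.5,
        "laplace": 1.0,
        "sch-grass": 1.0 / k,
        "min-max": math.sqrt(n_obs) / k,
    }
    a = named[alpha] if isinstance(alpha, str) else float(alpha)
    return {s: (c + a) / (n_obs + k * a) for s, c in counts.items()}


def bayes_entropy(data, alpha, K=None):
    """Plug-in Shannon entropy (nats) of the Bayesian probabilities."""
    return -sum(p * math.log(p) for p in bayes_probs(data, alpha, K).values())


# Laplace prior on N = 3, K = 2: probabilities 3/5 and 2/5
print(bayes_probs([1, 1, 2], "laplace"))
```

Larger α pulls the probabilities toward the uniform distribution, which raises the entropy estimate for sparse data.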
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- alpha
float The concentration parameter α of the Dirichlet prior.
- K
int, optional The support size. If not provided, uses the observed support size.
- property bayes_probs#
- property dist_dict#
Return the Bayesian distribution dictionary for JSD.
- class infomeasure.estimators.entropy.BonachelaEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Bonachela (Bonachela-Hinrichsen-Muñoz) entropy estimator for discrete data.
The Bonachela estimator computes the Shannon entropy using the formula from [BHM08]:
\[\hat{H}_{B} = \frac{1}{N+2} \sum_{i=1}^{K} \left( (n_i + 1) \sum_{j=n_i + 2}^{N+2} \frac{1}{j} \right)\]where \(n_i\) are the counts for each unique value, \(K\) is the number of unique values, and \(N\) is the total number of observations.
This estimator is specially designed to provide a compromise between low bias and small statistical errors for short data series, particularly when the data sets are small and the probabilities are not close to zero.
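The double sum translates directly into code. This is a minimal sketch of the formula as written above, with a hypothetical function name; it is not the library's implementation.

```python
from collections import Counter


def bonachela_entropy(data):
    """Bonachela estimate in nats, evaluated term by term."""
    counts = Counter(data).values()
    n_obs = len(data)
    total = 0.0
    for n_i in counts:
        # inner sum over j = n_i + 2, ..., N + 2
        inner = sum(1.0 / j for j in range(n_i + 2, n_obs + 3))
        total += (n_i + 1) * inner
    return total / (n_obs + 2)


# Two distinct symbols seen once each: 2 * 2 * (1/3 + 1/4) / 4 = 7/12
print(bonachela_entropy([0, 1]))
```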
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- class infomeasure.estimators.entropy.ChaoShenEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Chao-Shen entropy estimator.
\[\hat{H}_{CS} = - \sum_{i=1}^{K} \frac{\hat{p}_i^{CS} \log \hat{p}_i^{CS}}{1 - (1 - \hat{p}_i^{ML} C)^N}\]where
\[\hat{p}_i^{CS} = C \cdot \hat{p}_i^{ML}\]and \(C = 1 - \frac{f_1}{N}\) is the estimated coverage, \(f_1\) is the number of singletons (species observed exactly once), \(\hat{p}_i^{ML}\) is the maximum likelihood probability estimate, \(N\) is the sample size, and \(K\) is the number of observed species [CS03]. The Chao-Shen estimator provides a bias-corrected estimate of Shannon entropy that accounts for unobserved species through coverage estimation.
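A minimal sketch of the coverage-adjusted sum follows. The function name is hypothetical, and the handling of the all-singleton case (where the coverage would be zero) is an assumption made here so that the logarithm stays defined.

```python
import math
from collections import Counter


def chao_shen_entropy(data):
    """Chao-Shen estimate in nats."""
    counts = list(Counter(data).values())
    n_obs = len(data)
    f1 = sum(1 for c in counts if c == 1)  # number of singletons
    if f1 == n_obs:
        f1 = n_obs - 1  # assumption: avoid zero coverage when every value is unique
    coverage = 1.0 - f1 / n_obs
    h = 0.0
    for c in counts:
        p_cs = coverage * c / n_obs  # coverage-adjusted probability
        h -= p_cs * math.log(p_cs) / (1.0 - (1.0 - p_cs) ** n_obs)
    return h


# No singletons, so C = 1 and each term is divided by 1 - 0.5**4
print(chao_shen_entropy([1, 1, 2, 2]))
```

The denominator is the probability that a symbol with probability \(\hat{p}_i^{CS}\) is seen at least once in N draws, so rarely observed symbols receive a larger weight.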
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- class infomeasure.estimators.entropy.ChaoWangJostEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Advanced bias-corrected Shannon entropy estimator using coverage estimation.
The Chao-Wang-Jost estimator improves entropy estimates under incomplete sampling by correcting for unobserved species. It is particularly valuable for ecological data, text analysis, or any discrete distribution where the sample may not capture all possible outcomes.
Standard entropy estimators systematically underestimate entropy in finite samples, especially when sampling is incomplete. This estimator uses the counts of rare species (singletons and doubletons) to estimate sample coverage and correct for unobserved species, which yields reliable estimates even for small samples, for example in ecological diversity studies with incomplete species sampling or in text analysis where the vocabulary is only partly observed.
The approach builds on species accumulation curves and Good-Turing frequency estimation, which provide a statistical framework for addressing sampling bias.
Mathematical Foundation:
The estimator combines observed entropy with a correction term based on coverage estimation:
\[\hat{H}_{\text{CWJ}} = \sum_{1 \leq n_i \leq N-1} \frac{n_i}{N} \left(\sum_{k=n_i}^{N-1} \frac{1}{k} \right) + \frac{f_1}{N} (1 - A)^{-N + 1} \left\{ - \log(A) - \sum_{r=1}^{N-1} \frac{1}{r} (1 - A)^r \right\}\]where the coverage parameter \(A\) is estimated as:
\[\begin{split}A = \begin{cases} \frac{2 f_2}{(N-1) f_1 + 2 f_2} \, & \text{if} \, f_2 > 0 \\ \frac{2}{(N-1)(f_1 - 1) + 2} \, & \text{if} \, f_2 = 0, \; f_1 \neq 0 \\ 1, & \text{if} \, f_1 = f_2 = 0 \end{cases}\end{split}\]Here, \(f_1\) represents the number of singletons (species observed exactly once) and \(f_2\) the number of doubletons (species observed exactly twice) in the sample [CWJ13].
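The two terms of the estimator can be sketched as follows. The function name is hypothetical; skipping the correction term when \(f_1 = 0\) or \(A = 1\) is an assumption made here to avoid dividing by zero, and may differ from what the library does.

```python
import math
from collections import Counter


def chao_wang_jost_entropy(data):
    """Chao-Wang-Jost estimate in nats."""
    counts = list(Counter(data).values())
    n = len(data)
    f1 = counts.count(1)  # singletons
    f2 = counts.count(2)  # doubletons
    # First term: species with 1 <= n_i <= N - 1, inner sum over k = n_i, ..., N - 1
    h = sum((c / n) * sum(1.0 / k for k in range(c, n))
            for c in counts if 1 <= c <= n - 1)
    if f1 == 0:
        return h  # no singletons: the correction term vanishes
    if f2 > 0:
        a = 2 * f2 / ((n - 1) * f1 + 2 * f2)
    else:
        a = 2 / ((n - 1) * (f1 - 1) + 2)
    if a == 1.0:
        return h  # assumption: treat the correction as zero when A = 1
    tail = sum((1.0 / r) * (1.0 - a) ** r for r in range(1, n))
    return h + (f1 / n) * (1.0 - a) ** (1 - n) * (-math.log(a) - tail)


# Documented example: about 3.635 bits after converting from nats
print(chao_wang_jost_entropy([1, 1, 2, 3, 4, 5]) / math.log(2))
```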
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
See also
infomeasure.estimators.functional.entropyFunctional interface for entropy calculation
infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimatorStandard maximum likelihood entropy estimator
Notes
The algorithm is adapted from the entropart R library [MH15]
The correction becomes negligible when samples are complete (\(f_1 = f_2 = 0\))
Examples
>>> import infomeasure as im
>>>
>>> # Basic usage with incomplete sampling scenario
>>> data = [1, 1, 2, 3, 4, 5]  # Many singletons suggest incomplete sampling
>>> h_cwj = im.entropy(data, approach="chao_wang_jost", base=2)
>>> h_standard = im.entropy(data, approach="discrete", base=2)
>>> print(f"Chao-Wang-Jost: {h_cwj:.3f} bits")
Chao-Wang-Jost: 3.635 bits
>>> print(f"Standard: {h_standard:.3f} bits")
Standard: 2.252 bits
>>>
>>> # Ecological diversity example
>>> species_counts = [1, 1, 1, 2, 2, 3, 5, 8]  # Species abundance data
>>> diversity = im.entropy(species_counts, approach="cwj", base="e")
>>> print(f"Species diversity: {diversity:.3f} nats")
Species diversity: 2.054 nats
- class infomeasure.estimators.entropy.DiscreteEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Standard Shannon entropy estimator for discrete data using maximum likelihood.
The discrete entropy estimator computes the Shannon entropy using the classical maximum likelihood approach:
\[\hat{H} = -\sum_{i=1}^{K} \hat{p}_i \log \hat{p}_i\]where \(\hat{p}_i = \frac{n_i}{N}\) are the empirical probabilities, \(n_i\) are the counts for each unique value \(i\), \(K\) is the number of unique values, and \(N\) is the total number of observations.
This is the most fundamental entropy estimator and serves as the baseline for comparison with other bias-corrected estimators. While it provides an asymptotically unbiased estimate of the true entropy, it can exhibit significant bias for small sample sizes, particularly when the number of unique values is large relative to the sample size.
The estimator is suitable for:
Large datasets where bias is minimal
Baseline comparisons with bias-corrected estimators
Applications where computational simplicity is preferred
Well-sampled distributions with sufficient observations per unique value
For small sample sizes or distributions with many rare events, consider using bias-corrected estimators such as ChaoShenEntropyEstimator, BonachelaEntropyEstimator, or ZhangEntropyEstimator.
- Attributes:
- *dataarray_like
The data used to estimate the entropy. For joint entropy, multiple arrays can be provided.
- base
float or str, default=Config.get(“base”) The logarithm base for entropy calculation. Common values are 2 (bits), 10 (dits), or ‘e’ (nats).
Examples
>>> import infomeasure as im
>>> # Simple entropy calculation
>>> data = [1, 1, 2, 3, 3, 4, 5]
>>> entropy_value = im.entropy(data, approach="discrete")
>>> print(f"Entropy: {entropy_value:.3f} nats")
Entropy: 1.550 nats
>>> # Local values
>>> estimator = im.estimator(data, measure="h", approach="discrete")
>>> estimator.local_vals()
array([1.25276297, 1.25276297, 1.94591015, 1.25276297, 1.25276297,
       1.94591015, 1.94591015])
- property dist_dict#
Return the distribution dictionary for JSD.
- class infomeasure.estimators.entropy.GrassbergerEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Discrete Grassberger entropy estimator.
\[\hat{H}_{\text{Gr88}} = \sum_i \frac{n_i}{N} \left(\log(N) - \psi(n_i) - \frac{(-1)^{n_i}}{n_i + 1} \right)\]\(\hat{H}_{\text{Gr88}}\) is the Grassberger entropy, where \(n_i\) are the counts, \(N\) is the total number of observations, and \(\psi\) is the digamma function [Gra08, Gra88].
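Because the counts \(n_i\) are positive integers, the digamma terms reduce to harmonic sums. The sketch below evaluates the formula with the weights \(n_i / N\), where \(N\) is the total number of observations; the helper names are hypothetical and this is not the library's implementation.

```python
import math
from collections import Counter

EULER_GAMMA = 0.5772156649015329


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def grassberger_entropy(data):
    """Grassberger (1988) estimate in nats."""
    counts = Counter(data).values()
    n_obs = len(data)
    return sum(
        (c / n_obs) * (math.log(n_obs) - digamma_int(c) - (-1) ** c / (c + 1))
        for c in counts
    )


print(grassberger_entropy([1, 1, 2, 3, 3, 4, 5]))
```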
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- class infomeasure.estimators.entropy.KernelEntropyEstimator(*data, bandwidth: float | int, kernel: str, workers: int = 1, base: int | float | str = 'e')[source]#
Bases: WorkersMixin, EntropyEstimator
Kernel entropy estimator for continuous data using Kernel Density Estimation (KDE).
The kernel entropy estimator computes the differential Shannon entropy by estimating the probability density function using kernel density estimation:
\[\hat{H}(X) = -\int \hat{f}(x) \log \hat{f}(x) \, dx \approx -\frac{1}{N} \sum_{i=1}^{N} \log \hat{f}(x_i)\]where \(\hat{f}(x)\) is the kernel density estimate:
\[\hat{f}(x) = \frac{1}{N h^d} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right)\]with \(K(\cdot)\) being the kernel function, \(h\) the bandwidth parameter, \(d\) the dimensionality, and \(N\) the number of data points.
For joint entropy of multiple variables, the estimator concatenates the variables into a single multivariate space and applies the same KDE approach.
The estimator supports both Gaussian and box (uniform) kernels. The choice of bandwidth is critical: small values can lead to under-smoothing and overfitting, while large values may over-smooth the data and obscure important features [GP25, Sil86].
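The resubstitution estimate can be sketched for one-dimensional data with a Gaussian kernel. This is an illustration under stated assumptions rather than the library's code: the function name is hypothetical, each point contributes to its own density estimate, and the brute-force loop costs O(N²).

```python
import math


def gaussian_kde_entropy(data, bandwidth):
    """Return (entropy in nats, local values) for 1D data (d = 1)."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    local = []
    for x in data:
        # f_hat(x) = 1 / (N h) * sum_i K((x - x_i) / h) with a standard normal K
        density = norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )
        local.append(-math.log(density))
    return sum(local) / n, local


h, local_values = gaussian_kde_entropy([-1.2, -0.3, 0.1, 0.4, 1.5], bandwidth=0.5)
print(h, local_values)
```

The mean of the local values is the entropy estimate, so a bandwidth that is too small makes isolated points look very improbable and inflates their local values.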
- Parameters:
- *dataarray_like
The continuous data used to estimate the entropy. For univariate entropy, pass a single array. For joint entropy, pass multiple arrays.
- bandwidth
float|int The bandwidth parameter for the kernel. Controls the smoothness of the density estimate.
- kernel
str Type of kernel to use. Supported options are:
'gaussian': Gaussian (normal) kernel
'box': Box (uniform) kernel
Compatible with the KDE implementation kde_probability_density_function().
- workers
int, optional Number of workers to use for parallel processing. Default is 1 (no parallelization). If set to -1, all available CPU cores will be used.
- base
float|str,optional Logarithm base for entropy calculation. Default is from global configuration.
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- bandwidth
float|int The bandwidth for the kernel.
- kernel
str Type of kernel to use.
- workers
int Number of workers to use for parallel processing.
- Returns:
- array_like
Local entropy values for each data point when calling entropy calculation methods. The mean of these values gives the overall entropy estimate.
See also
infomeasure.estimators.utils.kde.kde_probability_density_functionUnderlying KDE implementation
infomeasure.estimators.entropy.discrete.DiscreteEntropyEstimatorFor discrete data entropy estimation
Notes
Bandwidth Selection: The bandwidth parameter critically affects the quality of the entropy estimate. A small bandwidth can lead to under-smoothing and high variance, while a large bandwidth may over-smooth the data, obscuring important details and introducing bias.
Kernel Choice:
Gaussian kernels provide smooth density estimates and are theoretically well-founded
Box kernels are computationally efficient and provide non-parametric estimates
Computational Complexity: The algorithm has O(N²) complexity for box kernels using KDTree queries, and varies for Gaussian kernels depending on the implementation.
Cross-entropy: Supported between two distributions by evaluating the density of the second distribution at points from the first distribution.
Examples
>>> import infomeasure as im
>>> from numpy.random import default_rng
>>> rng = default_rng(281769)
>>> # Generate sample data
>>> data = rng.normal(0, 1, 1000)
>>>
>>> # Create estimator
>>> estimator = im.estimator(data, measure="h", approach="kernel", bandwidth=0.5, kernel='gaussian')
>>>
>>> # Calculate entropy
>>> estimator.result()
np.float64(1.366015332652949)
>>> # Local values
>>> estimator.local_vals()
array([1.54017083, 1.35855839, 0.97949819, 0.97333173, 2.62084886,
       ...
       1.08174049, 0.97418054, 1.88055967, 0.99614516, 0.98548583])
- class infomeasure.estimators.entropy.KozachenkoLeonenkoEntropyEstimator(*data, k: int = 4, ksg_id: int = 1, noise_level=1e-10, minkowski_p=inf, base: int | float | str = 'e')[source]#
Bases: RandomGeneratorMixin, EntropyEstimator
Kozachenko-Leonenko entropy estimator for continuous data.
The Kozachenko-Leonenko estimator computes the Shannon entropy of continuous data using nearest neighbor distances. The estimator is based on the method from [KL87] and follows the implementation approach described in [KSG11].
\[\hat{H}_{KL} = -\psi(k) + \psi(N) + \log(c_d) + \frac{d}{N} \sum_{i=1}^{N} \log(2\rho_{k,i})\]where \(\psi\) is the digamma function, \(k\) is the number of nearest neighbors, \(N\) is the number of data points, \(d\) is the dimensionality, \(c_d\) is the volume of the \(d\)-dimensional unit ball for the chosen Minkowski norm, and \(\rho_{k,i}\) is the distance to the \(k\)-th nearest neighbor of point \(i\).
This estimator is particularly suitable for continuous multivariate data and provides asymptotically unbiased estimates of differential entropy. The method works by exploiting the relationship between nearest neighbor distances and local density, making it effective for high-dimensional data where traditional histogram-based methods fail.
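For one-dimensional data the formula simplifies, because the interval of half-width \(\rho_{k,i}\) around a point has length \(2\rho_{k,i}\) and the \(\log(c_d)\) term therefore contributes nothing. The sketch below covers only this special case with a brute-force neighbour search; the function names are hypothetical, no noise is added, and repeated values would produce a zero distance and a math domain error.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def kl_entropy_1d(data, k=4):
    """Kozachenko-Leonenko estimate in nats for 1D data (d = 1)."""
    n = len(data)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    log_sum = 0.0
    for i, x in enumerate(data):
        # distances to all other points; the k-th smallest is rho_{k,i}
        dists = sorted(abs(x - y) for j, y in enumerate(data) if j != i)
        log_sum += math.log(2.0 * dists[k - 1])
    return -digamma_int(k) + digamma_int(n) + log_sum / n


print(kl_entropy_1d([0.1, 0.4, 0.45, 0.9, 1.3, 1.35, 2.0], k=2))
```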
- Parameters:
- *dataarray_like
The continuous data used to estimate the entropy. For multivariate data, each variable should be a column.
- k
int, default=4 The number of nearest neighbors to consider. Higher values provide more stable estimates but may introduce bias. The default value of 4 is recommended by [KSG11].
- noise_level
float, default=1e-10 The standard deviation of Gaussian noise added to the data to avoid issues with zero distances between identical points. Set to 0 to disable noise addition.
- minkowski_p
float, default=inf The power parameter for the Minkowski metric used in distance calculations. Common values are 2 (Euclidean distance) and inf (maximum norm/Chebyshev distance). Must satisfy \(1 \leq p \leq \infty\).
- ksg_id
int, default=1 The KSG estimator variant to use (1 or 2). Type I uses the standard formula. Type II uses a modified formula with \(\psi(k) - 1/k\).
- base
LogBaseType, default=Config.get(“base”) The logarithm base for entropy calculation. Can be 2, 10, “e”, or any positive number.
- Attributes:
- *data
tuple[array_like] The processed data used to estimate the entropy, converted to 2D arrays.
- k
int The number of nearest neighbors to consider.
- noise_level
float The standard deviation of the Gaussian noise added to the data.
- minkowski_p
float The power parameter for the Minkowski metric.
- ksg_id
int The KSG estimator variant to use.
- Raises:
ValueError: If the number of nearest neighbors is not a positive integer.
ValueError: If the noise level is negative.
ValueError: If the Minkowski power parameter is invalid (not in range [1, ∞]).
Notes
The choice of the number of nearest neighbors \(k\) affects the bias-variance tradeoff of the estimator. Smaller values of \(k\) reduce bias but increase variance, while larger values have the opposite effect. The default value of \(k=4\) provides a good balance for most applications.
The noise addition helps handle datasets with repeated values or points that are exactly identical, which would otherwise result in zero distances and numerical issues. The noise level should be small enough not to significantly alter the underlying distribution.
For high-dimensional data, the curse of dimensionality may affect the estimator’s performance, as nearest neighbor distances become less informative. In such cases, dimensionality reduction or alternative entropy estimation methods may be preferable.
Examples
>>> import numpy as np
>>> import infomeasure as im
>>> from infomeasure.estimators.entropy import KozachenkoLeonenkoEntropyEstimator
>>>
>>> # Generate 2D Gaussian data
>>> np.random.seed(176250)
>>> data = np.random.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 1000)
>>>
>>> # Estimate entropy
>>> estimator = im.estimator(data, measure="h", approach="kl", k=4)
>>> entropy_value = estimator.result()
>>> print(f"Estimated entropy: {entropy_value:.3f}")
Estimated entropy: 2.678
>>> print(f"Local values: {estimator.local_vals()}")
Local values: [ 3.15330798 2.02688591 2.52250064 2.95236651 3.58801879 1.42033673
 ...
 2.91254223 1.92823136 3.63647704 2.05589055]
>>> # Use different distance metric
>>> estimator_euclidean = KozachenkoLeonenkoEntropyEstimator(data, k=4, minkowski_p=2)
>>> estimator_euclidean.result()
np.float64(2.6772465397252208)
- class infomeasure.estimators.entropy.MillerMadowEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Discrete Miller-Madow entropy estimator.
\[\hat{H}_{\tiny{MM}} = \hat{H}_{\tiny{MLE}} + \frac{K - 1}{2N}\]\(\hat{H}_{\tiny{MM}}\) is the Miller-Madow entropy, where \(\hat{H}_{\tiny{MLE}}\) is the maximum likelihood entropy (DiscreteEntropyEstimator). \(K\) is the number of unique values in the data, and \(N\) is the number of observations.
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
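The Miller-Madow correction is a single additive term on top of the maximum likelihood estimate. A minimal sketch, with a hypothetical function name and natural logarithms:

```python
import math
from collections import Counter


def miller_madow_entropy(data):
    """Maximum likelihood entropy (nats) plus the (K - 1) / (2N) correction."""
    counts = Counter(data).values()
    n_obs = len(data)
    h_mle = -sum((c / n_obs) * math.log(c / n_obs) for c in counts)
    return h_mle + (len(counts) - 1) / (2 * n_obs)


# K = 5 and N = 7, so the correction adds 4 / 14 to the plug-in estimate
print(miller_madow_entropy([1, 1, 2, 3, 3, 4, 5]))
```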
- class infomeasure.estimators.entropy.NsbEntropyEstimator(*data, K: int = None, base: int | float | str = 'e')[source]#
Bases: DiscreteHEstimator
NSB (Nemenman-Shafee-Bialek) entropy estimator.
The NSB estimator provides a Bayesian estimate of Shannon entropy for discrete data using the Nemenman, Shafee, Bialek algorithm. This estimator is particularly effective for undersampled data where traditional estimators may be biased.
The NSB estimate is computed as:
\[\hat{H}^{\text{NSB}} = \frac{ \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n}) \langle H^m \rangle_{\beta (\xi)} } { \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n})}\]where
\[\rho(\xi \mid \textbf{n}) = \mathcal{P}(\beta (\xi)) \frac{ \Gamma(\kappa(\xi))}{\Gamma(N + \kappa(\xi))} \prod_{i=1}^K \frac{\Gamma(n_i + \beta(\xi))}{\Gamma(\beta(\xi))}\]The algorithm uses numerical integration to compute the Bayesian posterior over possible entropy values, providing a principled approach to entropy estimation that accounts for sampling uncertainty [NSB02].
If there are no coincidences in the data (all observations are unique), NSB returns NaN as the estimator requires repeated observations to function properly.
- Parameters:
- *dataarray_like
The data used to estimate the entropy.
- K
int, optional The support size. If not provided, uses the observed support size.
- base
LogBaseType, default=Config.get(“base”) The logarithm base for entropy calculation.
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
Notes
The NSB estimator is computationally intensive as it requires numerical integration and optimisation. For large datasets or when computational efficiency is critical, consider using the asymptotic NSB (ANSB) estimator AnsbEntropyEstimator instead.
The estimator assumes a uniform prior over the space of possible probability distributions and uses Bayesian inference to estimate the entropy.
Examples
>>> import infomeasure as im
>>> data = [1, 2, 3, 4, 5, 1, 2]  # Some repeated values
>>> im.entropy(data, approach='nsb')
np.float64(1.4526460202102247)
- class infomeasure.estimators.entropy.OrdinalEntropyEstimator(*data, embedding_dim: int, step_size: int = 1, stable: bool = False, base: int | float | str = 'e')[source]#
Bases: EntropyEstimator
Estimator for the Ordinal / Permutation entropy.
The Ordinal entropy is a measure of the complexity of a time series. The input data needs to be comparable, i.e., the data should be ordinal, as the relative frequencies are calculated. For a given embedding_dim (length of considered subsequences), all \(n!\) possible permutations, with \(n\) equal to embedding_dim, are considered and their relative frequencies are calculated [BP02].
Embedding delay is not supported natively.
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- embedding_dim
int The size of the permutation patterns.
- step_size
int, optional The step size for the sliding windows (delay). Default is 1.
- stable
bool, optional If True, when sorting the data, the order of equal elements is preserved. This can be useful for reproducibility and testing, but might be slower.
- Raises:
ValueError: If the embedding_dim is negative or not an integer.
ValueError: If the embedding_dim is too large for the given data.
TypeError: If the data are not 1d array-like(s).
Notes
The ordinality will be determined via numpy.argsort().
If embedding_dim is set to 1, the entropy is always 0.
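The counting of ordinal patterns can be sketched without NumPy. This is a simplified illustration with a hypothetical function name: windows are taken over consecutive samples only (no step size), and Python's stable sort stands in for an argsort, so tied values keep their original order.

```python
import math
from collections import Counter


def ordinal_entropy(series, embedding_dim):
    """Permutation entropy in nats over consecutive windows."""
    if embedding_dim < 1 or embedding_dim > len(series):
        raise ValueError("embedding_dim must be between 1 and len(series)")
    patterns = Counter()
    for start in range(len(series) - embedding_dim + 1):
        window = series[start:start + embedding_dim]
        # indices that would sort the window, i.e. its ordinal pattern
        pattern = tuple(sorted(range(embedding_dim), key=window.__getitem__))
        patterns[pattern] += 1
    total = sum(patterns.values())
    return -sum((c / total) * math.log(c / total) for c in patterns.values())


# Two rising and two falling pairs: both patterns are equally frequent
print(ordinal_entropy([1, 2, 3, 2, 1], embedding_dim=2))
```

A strictly monotonic series produces a single pattern and therefore zero entropy, as does any series with embedding_dim set to 1.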
- class infomeasure.estimators.entropy.RenyiEntropyEstimator(*data, k: int = 4, alpha: float | int = None, base: int | float | str = 'e')[source]#
Bases: EntropyEstimator
Estimator for the Rényi entropy.
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- k
int The number of nearest neighbors used in the estimation.
- alpha
float|int The Rényi parameter, order or exponent. Sometimes denoted as \(\alpha\) or \(q\).
- Raises:
ValueError: If the Rényi parameter is not a positive number.
ValueError: If the number of nearest neighbors is not a positive integer.
Notes
The Rényi entropy is a generalization of Shannon entropy, where the small values of probabilities are emphasized for \(\alpha < 1\), and higher probabilities are emphasized for \(\alpha > 1\). For \(\alpha = 1\), it reduces to Shannon entropy. The Rényi-Entropy class can be particularly interesting for systems where additivity (in Shannon sense) is not always preserved, especially in nonlinear complex systems, such as when dealing with long-range forces.
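The role of \(\alpha\) is easiest to see on a discrete distribution, where the Rényi entropy is \(H_\alpha = \frac{1}{1-\alpha} \log \sum_i p_i^\alpha\). The sketch below uses this textbook definition on known probabilities; it is not the nearest-neighbour estimator that this class applies to continuous data, and the function name is hypothetical.

```python
import math


def renyi_entropy_discrete(probs, alpha):
    """Rényi entropy (nats) of a known discrete distribution."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if abs(alpha - 1.0) < 1e-12:
        # alpha = 1 is defined as the Shannon limit
        return -sum(p * math.log(p) for p in probs if p > 0)
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


probs = [0.7, 0.2, 0.1]
for alpha in (0.5, 1.0, 2.0):
    print(alpha, renyi_entropy_discrete(probs, alpha))
```

For a non-uniform distribution the value decreases as \(\alpha\) grows, because larger orders give more weight to the most probable outcomes.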
- class infomeasure.estimators.entropy.ShrinkEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Shrinkage (James-Stein) entropy estimator.
This estimator applies James-Stein shrinkage to the probability estimates before computing entropy, which can reduce bias in small sample scenarios. The shrinkage probabilities are calculated as:
\[\hat{p}_x^{\text{SHR}} = \lambda t_x + (1 - \lambda) \hat{p}_x^{\text{ML}}\]where \(\hat{p}_x^{\text{ML}}\) are the maximum likelihood probability estimates, \(t_x = 1/K\) is the uniform target distribution, and the shrinkage parameter \(\lambda\) is given by:
\[\lambda = \frac{ 1 - \sum_{x=1}^{K} (\hat{p}_x^{\text{ML}})^2}{(N-1) \sum_{x=1}^K (t_x - \hat{p}_x^{\text{ML}})^2}\]with \(N\) the total number of observations. The entropy is then computed using these shrinkage-corrected probabilities.
Based on the implementation in the R package entropy [HS09].
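A minimal sketch of the shrinkage step follows. The numerator of \(\lambda\) is computed from the maximum likelihood probabilities, following the closed-form estimator of [HS09]. Clamping \(\lambda\) to [0, 1] and setting it to 1 when the data already match the uniform target are assumptions of this sketch, and the function name is hypothetical.

```python
import math
from collections import Counter


def shrink_entropy(data):
    """Return (entropy in nats, lambda) for the James-Stein shrinkage estimate."""
    counts = Counter(data).values()
    n_obs = len(data)
    k = len(counts)
    p_ml = [c / n_obs for c in counts]
    target = 1.0 / k  # uniform target t_x
    denom = (n_obs - 1) * sum((target - p) ** 2 for p in p_ml)
    if denom == 0:
        lam = 1.0  # assumption: full shrinkage when p_ML equals the target
    else:
        lam = (1.0 - sum(p * p for p in p_ml)) / denom
        lam = min(1.0, max(0.0, lam))  # assumption: keep lambda in [0, 1]
    p_shr = [lam * target + (1.0 - lam) * p for p in p_ml]
    return -sum(p * math.log(p) for p in p_shr), lam


# With only four observations the estimate is shrunk fully onto the uniform target
print(shrink_entropy([1, 1, 1, 2]))
```

Small samples give a large \(\lambda\) and pull the probabilities toward uniform, while large samples leave the maximum likelihood estimates almost unchanged.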
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- property dist_dict#
Dictionary of shrinkage probabilities for each unique value. Used by JSD.
- class infomeasure.estimators.entropy.TsallisEntropyEstimator(*data, k: int = 4, q: float | int = None, base: int | float | str = 'e')[source]#
Bases: EntropyEstimator
Estimator for the Tsallis entropy.
- Attributes:
- *dataarray_like
The data used to estimate the entropy.
- k
int The number of nearest neighbors used in the estimation.
- q
float The Tsallis parameter, order or exponent. Sometimes denoted as \(q\), analogous to the Rényi parameter \(\alpha\).
- Raises:
ValueError: If the Tsallis parameter is not a positive number.
ValueError: If the number of nearest neighbors is not a positive integer.
Notes
In the \(q \to 1\) limit, the Jackson sum (q-additivity) reduces to ordinary summation, and the Tsallis entropy reduces to Shannon entropy. This class of entropy measures is particularly useful in the study of long-range correlated systems and non-equilibrium phenomena.
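The \(q \to 1\) limit can be checked numerically on a discrete distribution using the textbook definition \(S_q = \frac{1 - \sum_i p_i^q}{q - 1}\). This sketch works on known probabilities and is not the nearest-neighbour estimator used by this class; the function name is hypothetical.

```python
import math


def tsallis_entropy_discrete(probs, q):
    """Tsallis entropy of a known discrete distribution (natural-log units)."""
    if q <= 0:
        raise ValueError("q must be positive")
    if abs(q - 1.0) < 1e-12:
        # q = 1 is defined as the Shannon limit
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


probs = [0.7, 0.3]
for q in (0.999, 1.0, 1.001, 2.0):
    print(q, tsallis_entropy_discrete(probs, q))
```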
- class infomeasure.estimators.entropy.ZhangEntropyEstimator(*args, **kwargs)[source]#
Bases: DiscreteHEstimator
Zhang entropy estimator for discrete data.
The Zhang estimator computes the Shannon entropy using the recommended definition from [GZZ13]:
\[\hat{H}_Z = \sum_{i=1}^K \hat{p}_i \sum_{v=1}^{N - n_i} \frac{1}{v} \prod_{j=0}^{v-1} \left( 1 + \frac{1 - n_i}{N - 1 - j} \right)\]where \(\hat{p}_i\) are the empirical probabilities, \(n_i\) are the counts for each unique value, \(K\) is the number of unique values, and \(N\) is the total number of observations.
The actual algorithm implementation follows the fast calculation approach from [LCBFiC17].
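For reference, the defining formula can be evaluated directly by building the product incrementally. This sketch is the straightforward version with a hypothetical function name, not the fast algorithm of [LCBFiC17]; its cost grows with \(K \cdot N\).

```python
from collections import Counter


def zhang_entropy(data):
    """Zhang estimate in nats, evaluated from the defining double sum."""
    counts = Counter(data).values()
    n_obs = len(data)
    h = 0.0
    for c in counts:
        inner = 0.0
        prod = 1.0
        for v in range(1, n_obs - c + 1):
            # multiply in the factor for j = v - 1, where N - 1 - j = N - v
            prod *= 1.0 + (1.0 - c) / (n_obs - v)
            inner += prod / v
        h += (c / n_obs) * inner
    return h


# Counts (2, 1) with N = 3: (2/3) * 0.5 + (1/3) * 1.5 = 5/6
print(zhang_entropy([1, 1, 2]))
```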
- Attributes:
- *dataarray_like
The data used to estimate the entropy.