Changelog

Under Development

  • 📈 Variational MI estimators: For large datasets, stochastic variational inference [HBWP13] becomes a viable approach for determining variational bounds on mutual information (MI). The following variational MI estimators are planned:

    • DV [DV75]: The Donsker-Varadhan estimator provides a dual formulation of the KL-divergence for variational MI bounds, forming the theoretical foundation for many neural MI estimators.

    • BA [BA03]: The Barber-Agakov estimator uses a variational approximation to compute MI over noisy channels, similar to the EM algorithm but maximizing MI instead of a likelihood. It introduces a tractable lower bound by replacing intractable conditional distributions with variational approximations.

    • MINE [BBR+18]: Mutual Information Neural Estimation employs gradient descent over neural networks to estimate MI between high-dimensional continuous variables. It is scalable in dimensionality and sample size, trainable through back-propagation, and strongly consistent.

    • NWJ [NWJ10]: The Nguyen-Wainwright-Jordan estimator uses convex risk minimization to estimate divergence functionals and likelihood ratios through f-divergence characterization. This approach leverages convexity to ensure robust and efficient estimation. Furthermore, a CMI bound can be obtained using [MBS20], enabling estimation of TE and CTE.

    • JSD [HFLM+18]: This estimator uses Jensen-Shannon divergence for representation learning by maximizing MI between input and encoder output, incorporating locality structure and adversarial matching for unsupervised learning.

    • TUBA [POO+19]: The Tractable Unnormalized Barber and Agakov estimator provides unbiased estimates and gradients, using energy-based variational families to avoid intractable partition functions.

    • NCE [MC18, vdOLV19]: A multi-sample mutual information estimator based on noise contrastive estimation (NCE) [GH12].

    • \(\bf{I_{\alpha}}\) [POO+19]: This interpolated bound trades off squared bias against variance, with the balance controlled by the parameter α.

    • FLO [GCW+22]: Fenchel-Legendre Optimization offers a novel contrastive MI estimator that overcomes InfoNCE limitations by achieving tight bounds and provable convergence. It uses unnormalized statistical modeling and convex optimization to improve data efficiency.

  • EEVI: Estimators of entropy via inference, i.e. using sequential Monte Carlo [SCTM22].

  • Automatic evaluation of multiple time lags: For MI and TE estimators, several time lags will be evaluated automatically to find the optimal lag parameters and improve the accuracy of the resulting measure.
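
The Donsker-Varadhan bound listed among the planned estimators can be illustrated without any neural network. The sketch below is not taken from the package: it assumes two known Gaussians and hand-picked critic functions (the names `dv_lower_bound`, `samples_p` and `samples_q` are hypothetical) and only shows that E_P[T] - log E_Q[exp(T)] never exceeds KL(P || Q) and is tight for the optimal critic. Neural estimators such as MINE replace the fixed critic with a trained network and apply the bound to the joint distribution versus the product of marginals.

```python
import math
import random


def dv_lower_bound(critic, samples_p, samples_q):
    """Donsker-Varadhan bound: E_P[T] - log E_Q[exp(T)] <= KL(P || Q)."""
    mean_p = sum(critic(x) for x in samples_p) / len(samples_p)
    mean_exp_q = sum(math.exp(critic(x)) for x in samples_q) / len(samples_q)
    return mean_p - math.log(mean_exp_q)


rng = random.Random(0)
n = 100_000
samples_p = [rng.gauss(1.0, 1.0) for _ in range(n)]  # P = N(1, 1)
samples_q = [rng.gauss(0.0, 1.0) for _ in range(n)]  # Q = N(0, 1)

# For these two Gaussians, KL(P || Q) = 0.5 nats, and the optimal critic
# is the log density ratio log p(x)/q(x) = x - 0.5.
bound_optimal = dv_lower_bound(lambda x: x - 0.5, samples_p, samples_q)

# Any other critic gives a looser bound, here T(x) = 0.5 * x
# (analytically 0.5 - 0.125 = 0.375 nats).
bound_loose = dv_lower_bound(lambda x: 0.5 * x, samples_p, samples_q)

print(f"optimal critic: {bound_optimal:.3f}, loose critic: {bound_loose:.3f}")
```

Both values are Monte Carlo estimates, so they fluctuate around the analytical values of 0.5 and 0.375 nats.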

Version 0.5.1 (2026-01-17)

Added support for Python 3.14.

Version 0.5.0 (2025-07-02)

This release introduces an overhaul of the statistical testing functionality with breaking changes to the API.

  • 🚨 BREAKING CHANGES:

    • Removed p_value() and t_score() methods from PValueMixin.

    • Replaced them with a comprehensive statistical_test() method that returns a StatisticalTestResult object.

    • Renamed configuration parameter p_value_method to statistical_test_method.

    • Added new configuration parameter statistical_test_n_tests for default number of tests, see Config.

  • New Features:

    • 📊 Comprehensive Statistical Testing: New statistical_test() method provides p-values, t-scores, and metadata in a single call.

    • 📈 StatisticalTestResult Class: Rich result object containing:

      • p-value and t-score

      • Test values from resampling

      • Observed value and null distribution statistics

      • Number of tests and method used

    • 📊 Flexible Percentile Access: percentile() method wraps numpy’s percentile function for test values.

    • 🎯 Convenience Confidence Intervals: confidence_interval() method for easy CI calculation.

  • 🔧 API Improvements:

    • Simplified Interface: No need to specify confidence levels upfront - calculate on demand.

    • Better Metadata: Statistical results include test method and number of tests used.

    • Consistent Return Types: All statistical operations return structured objects.
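
To make the idea of a single structured test result concrete, here is a minimal, self-contained sketch of a permutation test that bundles the observed value, the resampled test values, a p-value and a t-score, and computes percentiles and intervals on demand. It is an illustration only and does not use or reproduce the package's API: `PermutationResult`, `permutation_test` and `plugin_mi` are hypothetical names, and the actual StatisticalTestResult may differ in fields, defaults and resampling method.

```python
import math
import random
import statistics
from collections import Counter
from dataclasses import dataclass


def plugin_mi(xs, ys):
    """Plug-in mutual information of two discrete sequences, in nats."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log(c * n / (cx[x] * cy[y])) for (x, y), c in cxy.items())


@dataclass
class PermutationResult:  # hypothetical stand-in for a structured result
    observed: float
    test_values: list
    p_value: float
    t_score: float

    def percentile(self, q):
        """Linearly interpolated q-th percentile (0 to 100) of the test values."""
        values = sorted(self.test_values)
        pos = q / 100 * (len(values) - 1)
        lo = math.floor(pos)
        hi = min(lo + 1, len(values) - 1)
        return values[lo] + (pos - lo) * (values[hi] - values[lo])

    def confidence_interval(self, level):
        """Central interval of the test values, e.g. level=0.95."""
        tail = 50 * (1 - level)
        return self.percentile(tail), self.percentile(100 - tail)


def permutation_test(xs, ys, n_tests=200, seed=0):
    rng = random.Random(seed)
    observed = plugin_mi(xs, ys)
    test_values = []
    for _ in range(n_tests):
        shuffled = list(ys)
        rng.shuffle(shuffled)  # destroys any dependence between xs and ys
        test_values.append(plugin_mi(xs, shuffled))
    # Fraction of null values at least as large as the observed value.
    p_value = sum(v >= observed for v in test_values) / n_tests
    std = statistics.stdev(test_values)
    t_score = (observed - statistics.mean(test_values)) / std if std > 0 else math.nan
    return PermutationResult(observed, test_values, p_value, t_score)


rng = random.Random(42)
xs = [rng.randint(0, 1) for _ in range(200)]
ys = [x if rng.random() > 0.2 else 1 - x for x in xs]  # noisy copy of xs
result = permutation_test(xs, ys)
low, high = result.confidence_interval(0.95)
print(f"p={result.p_value:.3f}, t={result.t_score:.1f}, null interval=({low:.4f}, {high:.4f})")
```

Because the interval is derived from the stored test values, no confidence level has to be fixed before running the test.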

  • 🧮 Added Estimators:

    • Miller-Madow Estimators: Comprehensive suite of bias-corrected information measure estimators using the Miller-Madow correction formula. Provides improved estimates for small sample sizes by adding correction terms to maximum likelihood estimates. These estimators are dedicated implementations.

      • Entropy (H): MillerMadowEntropyEstimator with correction term (K-1)/(2N) for bias-corrected entropy estimation.

      • Mutual Information (MI): MillerMadowMIEstimator for bias-corrected mutual information with support for arbitrary number of variables.

      • Conditional Mutual Information (CMI): MillerMadowCMIEstimator for bias-corrected conditional mutual information.

      • Transfer Entropy (TE): MillerMadowTEEstimator for bias-corrected transfer entropy with statistical testing support.

      • Conditional Transfer Entropy (CTE): MillerMadowCTEEstimator for bias-corrected conditional transfer entropy.

      • Kullback-Leibler Divergence (KLD): Miller-Madow correction available through approach="millermadow" or approach="mm" in kld().

      • Jensen-Shannon Divergence (JSD): Miller-Madow correction available through approach="millermadow" or approach="mm" in jsd().

      All Miller-Madow estimators include comprehensive test coverage and support for local values calculation where applicable.
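
The entropy correction term (K-1)/(2N) mentioned above can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not the package's MillerMadowEntropyEstimator: the function names are hypothetical, the result is in nats, and K is taken as the number of distinct observed values.

```python
import math
from collections import Counter


def plugin_entropy(data):
    """Maximum likelihood (plug-in) Shannon entropy in nats."""
    n = len(data)
    return -sum(c / n * math.log(c / n) for c in Counter(data).values())


def miller_madow_entropy(data):
    """Plug-in entropy plus the Miller-Madow correction (K - 1) / (2 N)."""
    n = len(data)
    k = len(set(data))  # assumption: K counts only the observed values
    return plugin_entropy(data) + (k - 1) / (2 * n)


data = list("aabbbbcd")  # N = 8 samples, K = 4 distinct values
h_ml = plugin_entropy(data)
h_mm = miller_madow_entropy(data)
print(f"plug-in: {h_ml:.4f} nats, Miller-Madow: {h_mm:.4f} nats")
```

The correction is always non-negative, which reflects that the plug-in estimate tends to underestimate entropy for small samples.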

    • Additional Entropy Estimators: New discrete entropy estimators with specialized bias correction and estimation techniques:

      • Bayesian Entropy: BayesEntropyEstimator - Bayesian entropy estimator with concentration parameter α supporting multiple prior choices (Jeffreys, Laplace, Schürmann-Grassberger, Minimax) for improved entropy estimation with prior knowledge incorporation.

      • Chao-Shen Entropy: ChaoShenEntropyEstimator - Bias-corrected entropy estimator that accounts for unobserved species through coverage estimation using singleton counts, providing improved estimates for incomplete sampling scenarios [CS03].

      • Shrinkage Entropy: ShrinkEntropyEstimator - James-Stein shrinkage entropy estimator that applies shrinkage to probability estimates before computing entropy, reducing bias in small sample scenarios through regularization toward uniform distribution [HS09].

      • Grassberger Entropy: GrassbergerEntropyEstimator - Discrete entropy estimator with finite sample corrections using the digamma function, providing bias-corrected entropy estimates through count-based corrections [Gra08, Gra88].

      • Chao Wang Jost Entropy: ChaoWangJostEntropyEstimator - Advanced bias-corrected entropy estimator that uses coverage estimation based on singleton and doubleton counts to account for unobserved species, providing improved entropy estimates for incomplete sampling scenarios with sophisticated statistical corrections [CWJ13].

      • ANSB Entropy: AnsbEntropyEstimator - Asymptotic NSB entropy estimator designed for extremely undersampled discrete data where the number of unique values K is comparable to the sample size N. Uses the formula (γ - log(2)) + 2 log(N) - ψ(Δ) where γ is Euler’s constant, ψ is the digamma function, and Δ is the number of coincidences, providing efficient entropy estimation in the undersampled regime [NBvS04].

      • NSB Entropy: NsbEntropyEstimator - Nemenman-Shafee-Bialek entropy estimator providing Bayesian estimates of Shannon entropy for discrete data using numerical integration. Particularly effective for undersampled data where traditional estimators may be biased, using a principled Bayesian approach that accounts for sampling uncertainty through integration over possible entropy values [NSB02].

      • Zhang Entropy: ZhangEntropyEstimator - Zhang entropy estimator for discrete data using the recommended definition from Grabchak et al. [GZZ13]. Implements the fast calculation approach from Lozano et al. [LCBFiC17] with bias correction through sophisticated probability weighting. Provides improved entropy estimates through advanced statistical corrections while maintaining computational efficiency.

      • Bonachela Entropy: BonachelaEntropyEstimator - Bonachela entropy estimator designed for small data sets using the formula from Bonachela et al. [BHM08]. Provides a compromise between low bias and small statistical errors for short data series, particularly effective when data sets are small and probabilities are not close to zero.

      These new estimators were selected based on [DGST24], and the implementations in DiscreteEntropy.jl [KT24] were consulted.
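
As an example of one of these corrections, the James-Stein shrinkage idea can be sketched as follows: the observed frequencies are pulled toward the uniform distribution with an intensity λ estimated from the data, and the entropy is computed from the shrunk probabilities. This is a simplified illustration, not the package's ShrinkEntropyEstimator; the function names are hypothetical, and the number of categories is assumed to be the number of observed bins.

```python
import math


def shrink_probabilities(counts):
    """Shrink maximum likelihood frequencies toward the uniform target."""
    n = sum(counts)
    k = len(counts)  # assumption: only observed categories are counted
    p_ml = [c / n for c in counts]
    target = 1.0 / k
    denom = (n - 1) * sum((target - p) ** 2 for p in p_ml)
    if denom == 0:  # frequencies already uniform, or a single sample
        lam = 1.0
    else:
        lam = (1.0 - sum(p * p for p in p_ml)) / denom
        lam = min(1.0, max(0.0, lam))  # keep the intensity in [0, 1]
    return [lam * target + (1 - lam) * p for p in p_ml], lam


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


counts = [10, 3, 2, 1]
p_ml = [c / sum(counts) for c in counts]
p_shrink, lam = shrink_probabilities(counts)
print(f"lambda={lam:.3f}, plug-in={entropy(p_ml):.4f}, shrinkage={entropy(p_shrink):.4f}")
```

Since mixing with the uniform distribution can only raise the entropy, the shrinkage estimate lies between the plug-in value and log(K).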

    • Complete Estimator Coverage: All new estimators also support MI, CMI, TE and CTE through the same unified slicing, integrated into the interface.

    • Jensen-Shannon Divergence Support: Of the new estimators, BayesEntropyEstimator and ShrinkEntropyEstimator support jensen_shannon_divergence().

    • Enhanced Cross-Entropy and KLD Support: Of the new estimators, Bayes and Miller-Madow support cross-entropy and thus also kullback_leiber_divergence(). All entropy estimators implemented before version 0.5.0 already support cross-entropy and KLD.

  • 📚 Update Documentation

  • 🧪 Updated tests
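
The reason cross-entropy support implies KLD support is the identity KL(p || q) = H(p, q) - H(p). The following plug-in sketch for discrete data illustrates this relation; it is not the package's implementation, and `distribution`, `cross_entropy` and `kld` are hypothetical helper names.

```python
import math
from collections import Counter


def distribution(data):
    """Relative frequencies of the observed values."""
    n = len(data)
    return {value: count / n for value, count in Counter(data).items()}


def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) log q(x) in nats; infinite if q lacks support."""
    total = 0.0
    for x, px in p.items():
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf
        total -= px * math.log(qx)
    return total


def kld(p, q):
    """KL(p || q) = H(p, q) - H(p), using H(p) = H(p, p)."""
    return cross_entropy(p, q) - cross_entropy(p, p)


p = distribution("aaab")  # a: 0.75, b: 0.25
q = distribution("aabb")  # a: 0.50, b: 0.50
print(f"H(p, q) = {cross_entropy(p, q):.4f} nats, KL(p || q) = {kld(p, q):.4f} nats")
```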


Version 0.4.0 (2025-05-02)

The 0.4.0 release introduces cross-entropy support, improves code packaging, and enhances documentation.

  • 📈 Cross-Entropy support:

    • Added cross-entropy for all approaches.

    • Integrated cross-entropy into the documentation with detailed explanations and examples.

    • Restricted the use of joint random variables (RVs) for cross-entropy to avoid ambiguity.

  • 📦 Code packaging:

    • 📦 Added tests to packaged tarball for testing in conda-forge.

    • 🔧 Updated deprecated licence classifier.

    • 🔧 Added Zenodo integration and updated README.md with logo and badges.

    • 🔧 Added README.md formatting for logos and badges.

  • 🔧 Warnings handling: Handled warnings as errors in pytest and addressed warnings in the code.

  • 📚 Documentation:

    • 📚 Added a benchmark demo page to documentation.

    • 📄 Added acknowledgments and funding information.

    • 🎨 Updated logo and icon design.

    • 🔧 Added favicon and polished documentation index page, including logo and dark mode support.

    • 🔧 Added demos for Gaussian data and Schreiber Article.

    • 📊 Changed Gaussian axis titles and corrected Schreiber Demo information unit.

    • 🔧 Changed links and reformatted documentation.


Version 0.3.3 (2025-04-16)

The 0.3.3 release focuses on improving documentation, moving to Read the Docs, and polishing the project.

  • 📚 Improved documentation and moved to Read the Docs.

    • 📄 Added automodapi for estimators and sphinx-apidoc.

    • 📊 Added graphviz apt dependency and fixed requirement structure.

    • 📝 Added code examples and reworked guide pages.

    • 🔗 Changed URL and repository settings.

  • 📦 Updated project for publication.

  • ✨ Optimisations and bug fixes:

    • 🚀 Parallelized box and Gaussian kernel calculations.

    • 🔄 Reused parameters between p-value and t-score calculations.

    • 🔧 Fixed bootstrap resampling for inhomogeneous, higher-dimensional input data.

    • 🔧 Optimized kernel (C)TE calculations.

    • 🔧 Fixed calling t-score without p-value.


Version 0.3.0 (2025-04-01)

The 0.3.0dev0 release focuses on performance improvements, feature enhancements, and API updates.

  • 🔧 Local values support: All approaches now support local values.

  • 🎯 Added two new composite measures:

    • Jensen-Shannon Divergence (JSD)

    • Kullback-Leibler Divergence (KLD)

  • ✨ Optimized algorithms for:

    • Mutual Information (MI) and Conditional Mutual Information (CMI) on discrete and ordinal data.

    • Transfer Entropy (TE) and Conditional Transfer Entropy (CTE).

  • ⚡ Major API refactoring to improve compatibility with arbitrarily many random variables in MI and CMI.

  • 💡 Enhanced performance through optimisations in base.py.

  • 🔍 Added extensive testing for local values and tested manually with code notebooks.

  • ⬆️ Added Python 3.13 support.
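
Local values assign one contribution to each sample, and their average recovers the global measure. The sketch below shows this for plug-in mutual information on discrete data. It is an illustration with a hypothetical helper name (`local_mi`), not the package's implementation.

```python
import math
from collections import Counter


def local_mi(xs, ys):
    """Local MI values log p(x, y) / (p(x) p(y)) per sample, in nats."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return [math.log(cxy[(x, y)] * n / (cx[x] * cy[y])) for x, y in zip(xs, ys)]


xs = [0, 0, 1, 1, 0, 1, 0, 1]
ys = [0, 0, 1, 1, 0, 1, 1, 0]

local = local_mi(xs, ys)
mi = sum(local) / len(local)  # the mean of the local values is the global MI

print(f"min local value: {min(local):.4f}, global MI: {mi:.4f} nats")
```

Individual local values can be negative, here for the two samples where x and y disagree, even though the average is non-negative.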


Version 0.2.1 (2025-02-11)

The 0.2.1dev0 release marks the first release, providing essential information measures and estimators like Entropy (H), Mutual Information (MI), and others. It includes a CI/CD pipeline, supports Python 3.10-3.12, and is licensed under AGPLv3+.

  • 📦 First release of the infomeasure package.

  • 🧩 Added essential information measure estimators:

    • Shannon entropy (H)

    • Mutual Information (MI)

    • Conditional Mutual Information (CMI)

    • Transfer Entropy (TE) and Conditional Transfer Entropy (CTE)

    • Jensen-Shannon Divergence (JSD)

    • Kullback-Leibler Divergence (KLD)

  • 🔄 Set up CI/CD pipeline with GitLab CI.

  • 💻 Added support for Python 3.10+.

  • 📄 Updated documentation to include installation guide, package structure, and example use cases.


Version 0.0.0 (2024-06-06)

  • Package setup

    • 🏗 Wrote pyproject.toml

    • 🔄 General project and test structure with CI/CD

    • 📚️ Documentation with sphinx, sphinxcontrib-bibtex and numpydoc