DiscreteEntropyEstimator


class infomeasure.estimators.entropy.DiscreteEntropyEstimator(*args, **kwargs)

Bases: DiscreteHEstimator

Standard Shannon entropy estimator for discrete data using maximum likelihood.

The discrete entropy estimator computes the Shannon entropy using the classical maximum likelihood approach:

\[\hat{H} = -\sum_{i=1}^{K} \hat{p}_i \log \hat{p}_i\]

where \(\hat{p}_i = \frac{n_i}{N}\) are the empirical probabilities, \(n_i\) are the counts for each unique value \(i\), \(K\) is the number of unique values, and \(N\) is the total number of observations.
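The formula above can be sketched in a few lines of plain Python. This is an illustrative re-implementation using only the standard library, not the code infomeasure runs internally; the function name `plugin_entropy` is a hypothetical helper introduced for this sketch.

```python
import math
from collections import Counter


def plugin_entropy(data):
    """Plug-in (maximum likelihood) Shannon entropy in nats.

    Hypothetical helper for illustration; not part of infomeasure.
    """
    counts = Counter(data)          # n_i for each unique value i
    n_total = sum(counts.values())  # N, the total number of observations
    # H = -sum_i p_i * log(p_i), with p_i = n_i / N
    return -sum((n / n_total) * math.log(n / n_total) for n in counts.values())


h = plugin_entropy([1, 1, 2, 3, 3, 4, 5])
print(f"{h:.3f} nats")  # 1.550 nats
```

Only observed values enter the sum, so no special handling of zero probabilities is needed.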

This is the most fundamental entropy estimator and serves as the baseline for comparison with bias-corrected estimators. It is asymptotically unbiased, but for small sample sizes it tends to underestimate the true entropy, and the bias can be substantial when the number of unique values is large relative to the sample size.
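The small-sample bias can be illustrated with a short simulation. The sketch below draws samples from a uniform distribution over 8 symbols (true entropy log 8) and averages the plug-in estimate over repeated trials. The helper names, the number of symbols, the number of trials, and the seed are arbitrary choices made for this illustration and do not come from infomeasure.

```python
import math
import random
from collections import Counter


def plugin_entropy(data):
    """Plug-in Shannon entropy in nats (hypothetical helper)."""
    counts = Counter(data)
    n_total = len(data)
    return -sum((n / n_total) * math.log(n / n_total) for n in counts.values())


def mean_estimate(n_samples, n_symbols=8, n_trials=200, seed=0):
    """Average plug-in estimate over repeated uniform samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    estimates = [
        plugin_entropy([rng.randrange(n_symbols) for _ in range(n_samples)])
        for _ in range(n_trials)
    ]
    return sum(estimates) / n_trials


true_h = math.log(8)              # entropy of the uniform source, in nats
mean_small = mean_estimate(10)    # few observations per unique value
mean_large = mean_estimate(1000)  # well-sampled regime

print(f"true entropy: {true_h:.3f} nats")
print(f"mean estimate, N=10:   {mean_small:.3f} nats")
print(f"mean estimate, N=1000: {mean_large:.3f} nats")
```

With only 10 observations the average estimate falls clearly below log 8, while with 1000 observations it is close to the true value.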

The estimator is suitable for:

  • Large datasets where bias is minimal

  • Baseline comparisons with bias-corrected estimators

  • Applications where computational simplicity is preferred

  • Well-sampled distributions with sufficient observations per unique value

For small sample sizes or distributions with many rare events, consider using bias-corrected estimators such as ChaoShenEntropyEstimator, BonachelaEntropyEstimator, or ZhangEntropyEstimator.

Attributes:
*data : array_like

The data used to estimate the entropy. For joint entropy, multiple arrays can be provided.

base : float or str, default=Config.get("base")

The logarithm base for entropy calculation. Common values are 2 (bits), 10 (dits), or 'e' (nats).
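To make the two attributes concrete, here is a minimal sketch of how joint entropy over several arrays and a change of logarithm base could be computed by hand. It assumes that joint entropy treats each tuple of aligned observations as a single symbol and that the base only rescales the result by 1 / log(base). The function `plugin_entropy` is a hypothetical helper that accepts a numeric base only (not the string 'e'); it is not the library's implementation.

```python
import math
from collections import Counter


def plugin_entropy(*data, base=math.e):
    """Plug-in (joint) Shannon entropy of one or more aligned sequences.

    Hypothetical helper for illustration; `base` must be numeric here.
    """
    # Zip the arrays so that each joint observation becomes one tuple
    joint = list(zip(*data))
    counts = Counter(joint)
    n_total = len(joint)
    h_nats = -sum((n / n_total) * math.log(n / n_total) for n in counts.values())
    # Changing the base is a rescaling: H_b = H_e / ln(b)
    return h_nats / math.log(base)


x = [0, 0, 1, 1]
y = [0, 1, 0, 1]

h_x_bits = plugin_entropy(x, base=2)      # one fair binary variable: 1 bit
h_xy_bits = plugin_entropy(x, y, base=2)  # four equally likely pairs: 2 bits
h_xy_nats = plugin_entropy(x, y)          # same quantity in nats: 2 * ln(2)
```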

Examples

>>> import infomeasure as im
>>> # Simple entropy calculation
>>> data = [1, 1, 2, 3, 3, 4, 5]
>>> entropy_value = im.entropy(data, approach="discrete")
>>> print(f"Entropy: {entropy_value:.3f} nats")
Entropy: 1.550 nats
>>> # Local values
>>> estimator = im.estimator(data, measure="h", approach="discrete")
>>> estimator.local_vals()
array([1.25276297, 1.25276297, 1.94591015, 1.25276297, 1.25276297,
       1.94591015, 1.94591015])
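The local values printed above are consistent with the per-observation surprisal \(-\log \hat{p}(x_j)\), whose mean is the entropy estimate. The following sketch reproduces them by hand under that assumption; `local_entropy_values` is a hypothetical helper and not the library's `local_vals` implementation.

```python
import math
from collections import Counter


def local_entropy_values(data):
    """Surprisal -log(p_hat(x)) in nats for each observation (hypothetical helper)."""
    counts = Counter(data)
    n_total = len(data)
    return [-math.log(counts[x] / n_total) for x in data]


data = [1, 1, 2, 3, 3, 4, 5]
local = local_entropy_values(data)

# Averaging the local values recovers the plug-in entropy estimate
mean_local = sum(local) / len(local)
print(f"{mean_local:.3f} nats")  # 1.550 nats
```

Values that occur twice (1 and 3) have surprisal log(7/2), and values that occur once have surprisal log(7).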

Attributes Summary

dist_dict

Return the distribution dictionary used for the Jensen-Shannon divergence (JSD).

Attributes Documentation

dist_dict

Return the distribution dictionary used for the Jensen-Shannon divergence (JSD).