# Entropy¶

kxy.api.core.entropy.discrete_entropy(x)

Estimates the (Shannon) entropy of a discrete random variable taking up to $$q$$ distinct values, given $$n$$ i.i.d. samples,

$h(x) = - \sum_{i=1}^q p_i \log p_i,$

using the plug-in estimator.

Parameters:
• x ((n,) np.array) – i.i.d. samples from the distribution of interest.

Returns: h – The (Shannon) entropy of the discrete random variable of interest.

Return type: float

Raises: AssertionError – If the input has the wrong shape.
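
The plug-in estimator simply replaces each $$p_i$$ with its empirical frequency. A minimal numpy sketch of that estimator (an illustration only, not the kxy implementation):

```python
import numpy as np

def plug_in_discrete_entropy(x):
    """Plug-in Shannon entropy (in nats) from i.i.d. discrete samples."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()           # empirical probabilities p_i
    return -np.sum(p * np.log(p))       # - sum_i p_i log p_i

x = np.random.randint(0, 4, size=1000)  # samples taking up to q=4 distinct values
print(plug_in_discrete_entropy(x))      # close to log(4) ≈ 1.386 for uniform draws
```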
kxy.api.core.entropy.least_structured_continuous_entropy(x, space='dual', batch_indices=[])

Estimates the entropy of a continuous $$d$$-dimensional random variable under the least structured assumption for its copula.

When $$d>1$$, the joint entropy is decomposed as

$h(x) = h(u) + \sum_{i=1}^d h(x_i),$

where $$u$$ is the copula-uniform dual representation of $$x$$ and $$h(x_i)$$ is the entropy of the i-th marginal.
Parameters:
• x ((n, d) np.array) – n i.i.d. draws from the data generating distribution.
• space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.

Returns: h – The (differential) entropy of the data generating distribution, assuming its copula is maximum-entropy in the chosen space. By convention, when $$d=1$$, this function is the same as scalar_continuous_entropy, and it returns 0 when $$n=1$$.

Return type: float
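
A hedged usage sketch, assuming the module path and signature documented above:

```python
import numpy as np
from kxy.api.core.entropy import least_structured_continuous_entropy

x = np.random.randn(1000, 3)  # n=1000 i.i.d. draws of a d=3 continuous vector
h_dual = least_structured_continuous_entropy(x, space='dual')      # Spearman rank constraints
h_primal = least_structured_continuous_entropy(x, space='primal')  # Pearson covariance constraints
```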
kxy.api.core.entropy.least_structured_copula_entropy(x, space='dual', batch_indices=[])

Estimates the entropy of the maximum-entropy copula in the chosen space.

Note

This also corresponds to the smallest amount of total correlation that is evidenced by the maximum-entropy constraints.

Parameters:
• x ((n, d) np.array) – n i.i.d. draws from the data generating distribution.
• space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.

Returns: h – The (differential) entropy of the least structured copula consistent with the maximum-entropy constraints.

Return type: float
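
The estimators above are tied together by the decomposition $$h(x) = h(u) + \sum_i h(x_i)$$. A hedged sanity-check sketch, assuming the module paths documented on this page:

```python
import numpy as np
from kxy.api.core.entropy import (
    least_structured_continuous_entropy,
    least_structured_copula_entropy,
    scalar_continuous_entropy,
)

x = np.random.randn(1000, 3)
h_joint = least_structured_continuous_entropy(x, space='dual')
h_copula = least_structured_copula_entropy(x, space='dual')
h_marginals = sum(scalar_continuous_entropy(x[:, i]) for i in range(x.shape[1]))
# Up to estimation noise, h_joint should be close to h_copula + h_marginals.
```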
kxy.api.core.entropy.least_structured_mixed_entropy(x_c, x_d, space='dual', batch_indices=[])

Estimates the joint entropy $$h(x_c, x_d)$$, where $$x_c$$ is a continuous random vector and $$x_d$$ is a discrete random vector.

Note

We use the identities

\begin{align}\begin{aligned}h(x, y) &= h(y) + h(x|y) \\ &= h(y) + E\left[ h(x \vert y=\cdot) \right]\end{aligned}\end{align}

that are true when $$x$$ and $$y$$ are either both continuous or both discrete to extend the definition of the joint entropy to the case where one is continuous and the other discrete.

Specifically,

$h(x_c, x_d) = h(x_d) + \sum_{j=1}^q \mathbb{P}(x_d=j) h\left(x_c \vert x_d=j \right).$
Parameters:
• x_c ((n, d) np.array) – n i.i.d. draws from the continuous data generating distribution.
• x_d ((n,) np.array) – n i.i.d. draws from the discrete data generating distribution, sampled jointly with x_c.
• space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.

Returns: h – The entropy of the least structured distribution consistent with the maximum-entropy constraints.

Return type: float
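
A hedged sketch of the decomposition above, built from the estimators documented on this page (an illustration of the identity, not the library's internal code path):

```python
import numpy as np
from kxy.api.core.entropy import discrete_entropy, least_structured_continuous_entropy

def mixed_entropy_sketch(x_c, x_d, space='dual'):
    """h(x_c, x_d) = h(x_d) + sum_j P(x_d=j) h(x_c | x_d=j), with P estimated empirically."""
    h = discrete_entropy(x_d)
    for j in np.unique(x_d):
        mask = (x_d == j)
        h += mask.mean() * least_structured_continuous_entropy(x_c[mask], space=space)
    return h
```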
kxy.api.core.entropy.scalar_continuous_entropy(x, space='dual', method='gaussian-kde')

Estimates the (differential) entropy of a continuous scalar random variable.

Multiple methods are supported:

• 'gaussian' for Gaussian moment matching: $$h(x) = \frac{1}{2} \log\left(2 \pi e \sigma^2 \right)$$.
• '1-spacing' for the standard 1-spacing estimator (see references [1] and [2] below):
$h(x) \approx - \gamma(1) + \frac{1}{n-1} \sum_{i=1}^{n-1} \log \left[ n \left(x_{(i+1)} - x_{(i)} \right) \right],$

where $$x_{(i)}$$ is the i-th smallest sample, and $$\gamma$$ is the digamma function.

• 'gaussian-kde' (the default) for Gaussian kernel density estimation.
$h(x) \approx -\frac{1}{n} \sum_{i=1}^n \log\left( \hat{p}\left(x_i\right) \right),$

where $$\hat{p}$$ is the Gaussian kernel density estimator of the true pdf using statsmodels.api.nonparametric.KDEUnivariate.

Parameters:
• x ((n,) np.array) – i.i.d. samples from the distribution of interest.
• method (str, 'gaussian' | '1-spacing' | 'gaussian-kde') – The estimation method to use (see above); defaults to 'gaussian-kde'.

Returns: h – The (differential) entropy of the continuous scalar random variable of interest.

Return type: float

Raises: AssertionError – If the input has the wrong shape or the method is not supported.
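
Minimal numpy/scipy sketches of the 'gaussian' and '1-spacing' estimators described above (illustrations only, not the kxy implementation):

```python
import numpy as np
from scipy.special import digamma

def gaussian_entropy(x):
    """Gaussian moment matching: h = 0.5 * log(2*pi*e*sigma^2)."""
    return 0.5 * np.log(2.0 * np.pi * np.e * np.var(x))

def one_spacing_entropy(x):
    """Standard 1-spacing estimator: -gamma(1) + mean of log(n * spacings)."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.shape[0]
    spacings = np.maximum(np.diff(xs), 1e-12)   # guard against tied samples
    return -digamma(1.0) + np.mean(np.log(n * spacings))
```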

References

[1] Kozachenko, L. F. and Leonenko, N. N. "Sample estimate of the entropy of a random vector." Problemy Peredachi Informatsii 23.2 (1987): 9–16.

[2] Beirlant, J., Dudewicz, E. J., Györfi, L. and van der Meulen, E. C. "Nonparametric entropy estimation: an overview." International Journal of Mathematical and Statistical Sciences 6.1 (1997): 17–40. ISSN 1055-7490.

# Entropy Rate¶

kxy.api.core.entropy_rate.auto_predictability(sample, p=None, robust=False, p_ic='hqic', space='primal')

Estimates the measure of auto-predictability of a (vector-valued) time series:

\begin{align}\begin{aligned}\mathbb{PR}\left(\{x_t\}\right) :&= h\left(x_*\right) - h\left(\{x_t\}\right) \\ &= h\left(u_{x_*}\right) - h\left(\{u_{x_t}\}\right).\end{aligned}\end{align}
Parameters:
• sample ((T, d) np.array) – Array of T sample observations of a $$d$$-dimensional process.
• p (int or None) – Number of lags to compute for the autocovariance function. If p=None (the default), it is inferred by fitting a VAR model on the sample, using p_ic as the information criterion.
• robust (bool) – If True, the Pearson autocovariance function is estimated by first estimating a Spearman rank correlation, and then inferring the equivalent Pearson autocovariance function, under the Gaussian assumption.
• p_ic (str) – The criterion used to learn the optimal value of p (by fitting a VAR(p) model) when p=None. Should be one of 'hqic' (Hannan-Quinn Information Criterion), 'aic' (Akaike Information Criterion), 'bic' (Bayes Information Criterion) and 't-stat' (based on last lag). Same as the 'ic' parameter of statsmodels.tsa.api.VAR.
• space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.

Warning

This function only supports primal inference for now.
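
A hedged usage sketch of auto_predictability on a toy VAR(1) sample, assuming the signature documented above (primal inference only, per the warning):

```python
import numpy as np
from kxy.api.core.entropy_rate import auto_predictability

T, d = 2000, 2
eps = np.random.randn(T, d)
x = np.empty((T, d))
x[0] = eps[0]
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + eps[t]   # toy stationary VAR(1) with coefficient 0.5*I

pr = auto_predictability(x, space='primal')  # p inferred via p_ic='hqic' by default
```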

kxy.api.core.entropy_rate.estimate_pearson_autocovariance(sample, p, robust=False)

Estimates the sample autocovariance function of a vector-valued process $$\{x_t\}$$ up to lag p (starting from 0).

Parameters:
• sample ((T, d) np.array) – Array of T sample observations of a d-dimensional process.
• p (int) – Number of lags to compute for the autocovariance function.
• robust (bool) – If True, the Pearson autocovariance function is estimated by first estimating a Spearman rank correlation, and then inferring the equivalent Pearson autocovariance function, under the Gaussian assumption.

Returns: ac – Sample autocovariance matrix whose (i, j) block of size $$d \times d$$ is the covariance between $$x_{t+i}$$ and $$x_{t+j}$$.

Return type: (dp, dp) np.array
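
A hedged usage sketch, assuming the signature and documented return shape:

```python
import numpy as np
from kxy.api.core.entropy_rate import estimate_pearson_autocovariance

sample = np.random.randn(1000, 2)                    # T=1000 observations, d=2
ac = estimate_pearson_autocovariance(sample, p=3)
assert ac.shape == (2 * 3, 2 * 3)                    # (dp, dp), per the documented return type
```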
kxy.api.core.entropy_rate.gaussian_var_copula_entropy_rate(sample, p=None, robust=False, p_ic='hqic')

Estimates the entropy rate of the copula-uniform dual representation of a stationary Gaussian VAR(p) (or AR(p)) process from a sample path.

We recall that the copula-uniform representation of an $$\mathbb{R}^d$$-valued process $$\{x_t\} := \{(x_{1t}, \dots, x_{dt}) \}$$ is, by definition, the process $$\{ u_t \} := \{ \left( F_{1t}\left(x_{1t}\right), \dots, F_{dt}\left(x_{dt}\right) \right) \}$$, where $$F_{it}$$ is the cumulative distribution function of $$x_{it}$$.

It can be shown that

$h\left( \{ x_t \}\right) = h\left( \{ u_t \}\right) + \sum_{i=1}^d h\left( x_{i*}\right)$

where $$h\left(x_{i*}\right)$$ is the entropy of the i-th coordinate process at any time.

Parameters:
• sample ((T, d) np.array) – Array of T sample observations of a $$d$$-dimensional process.
• p (int or None) – Number of lags to compute for the autocovariance function. If p=None (the default), it is inferred by fitting a VAR model on the sample, using p_ic as the information criterion.
• robust (bool) – If True, the Pearson autocovariance function is estimated by first estimating a Spearman rank correlation, and then inferring the equivalent Pearson autocovariance function, under the Gaussian assumption.
• p_ic (str) – The criterion used to learn the optimal value of p (by fitting a VAR(p) model) when p=None. Should be one of 'hqic' (Hannan-Quinn Information Criterion), 'aic' (Akaike Information Criterion), 'bic' (Bayes Information Criterion) and 't-stat' (based on last lag). Same as the 'ic' parameter of statsmodels.tsa.api.VAR.

Returns:
• h (float) – The entropy rate of the copula-uniform dual representation of the input process.
• p (int) – Order of the fitted VAR(p).
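
A hedged sketch of the identity above, combining estimators documented on this page to recover the copula entropy rate from the process entropy rate and the marginal entropies (not the library's internal code path):

```python
from kxy.api.core.entropy import scalar_continuous_entropy
from kxy.api.core.entropy_rate import gaussian_var_entropy_rate

def copula_entropy_rate_sketch(sample, p):
    """h({u_t}) = h({x_t}) - sum_i h(x_{i*})."""
    h_x = gaussian_var_entropy_rate(sample, p)                 # h({x_t})
    h_marginals = sum(scalar_continuous_entropy(sample[:, i])  # sum_i h(x_{i*})
                      for i in range(sample.shape[1]))
    return h_x - h_marginals
```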
kxy.api.core.entropy_rate.gaussian_var_entropy_rate(sample, p, robust=False)

Estimates, from a sample path of size T, the entropy rate of a stationary Gaussian VAR(p) (or AR(p)) process, namely

$h\left( \{x_t\} \right) = \frac{1}{2} \log \left( \frac{|K_p|}{|K_{p-1}|} \right) + \frac{d}{2} \log \left( 2 \pi e\right)$

where $$|K_p|$$ is the determinant of the lag-p autocovariance matrix corresponding to this process.

Parameters:
• sample ((T, d) np.array) – Array of T sample observations of a d-dimensional process.
• p (int) – Number of lags to compute for the autocovariance function.
• robust (bool) – If True, the Pearson autocovariance function is estimated by first estimating a Spearman rank correlation, and then inferring the equivalent Pearson autocovariance function, under the Gaussian assumption.
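
A minimal numpy sketch of the formula above, given precomputed autocovariance matrices $$K_p$$ and $$K_{p-1}$$ (an illustration only; the matrices can be assembled as described for estimate_pearson_autocovariance above):

```python
import numpy as np

def var_entropy_rate_sketch(K_p, K_p_minus_1, d):
    """0.5 * log(|K_p| / |K_{p-1}|) + (d/2) * log(2*pi*e), via log-determinants."""
    _, logdet_p = np.linalg.slogdet(K_p)            # log |K_p|
    _, logdet_pm1 = np.linalg.slogdet(K_p_minus_1)  # log |K_{p-1}|
    return 0.5 * (logdet_p - logdet_pm1) + 0.5 * d * np.log(2.0 * np.pi * np.e)
```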

kxy.api.core.entropy_rate.pearson_acovf(sample, max_order=10, robust=False)

Estimates the sample Pearson autocovariance function of a scalar-valued or vector-valued discrete-time stationary ergodic stochastic process $$\{z_t\}$$ from a single sample of size $$T$$, $$(\hat{z}_1, \dots, \hat{z}_T)$$:

$C(h) := \frac{1}{T} \sum_{t=1+h}^T (\hat{z}_t - \bar{z})(\hat{z}_{t-h} - \bar{z})^T,$

with $$\bar{z} := \frac{1}{T} \sum_{t=1}^T \hat{z}_t$$.
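
A minimal numpy sketch of the sample autocovariance function $$C(h)$$ defined above (an illustration only, not the kxy implementation):

```python
import numpy as np

def sample_acovf(sample, max_order=10):
    """Returns [C(0), C(1), ..., C(max_order)] as (d, d) arrays."""
    z = np.asarray(sample, dtype=float)
    if z.ndim == 1:
        z = z[:, None]                 # treat a scalar process as d=1
    T = z.shape[0]
    zc = z - z.mean(axis=0)            # center by the sample mean z_bar
    return [zc[h:].T @ zc[:T - h] / T for h in range(max_order + 1)]
```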