# Mutual Information

kxy.api.core.mutual_information.discrete_mutual_information(x, y)

Estimates the (Shannon) mutual information between two discrete random variables from i.i.d. samples, using the plug-in estimator of Shannon entropy.

Parameters:

- x ((n,) np.array) – n i.i.d. draws from a discrete distribution.
- y ((n,) np.array) – n i.i.d. draws from another discrete distribution, sampled jointly with x.

Returns: i – The mutual information between x and y, in nats.

Return type: float

Raises: AssertionError – If x or y is not a one-dimensional array, or if x and y have different shapes.
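A minimal usage sketch with synthetic draws (the data-generating process below is arbitrary and only meant to exercise the signature above; it assumes the module is importable under the documented path):

```python
# Minimal sketch: plug-in estimate of mutual information between two
# jointly-sampled discrete variables. Synthetic data for illustration only.
import numpy as np
from kxy.api.core.mutual_information import discrete_mutual_information

n = 10000
x = np.random.randint(0, 3, size=n)             # draws from a discrete distribution
y = (x + np.random.randint(0, 2, size=n)) % 3   # sampled jointly with x

mi = discrete_mutual_information(x, y)          # estimate, in nats
print(mi)
```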
kxy.api.core.mutual_information.least_continuous_conditional_mutual_information(x_c, y, z_c, x_d=None, z_d=None, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')

Estimates the conditional mutual information between a $$d$$-dimensional random vector $$x$$ of inputs and a continuous scalar random variable $$y$$ (or label), conditional on a third continuous random variable $$z$$, assuming the least amount of structure in $$(x, y, z)$$, other than what is necessary to be consistent with some observed properties.

Note

\begin{align}
I(x; y|z) &= h(x|z) + h(y|z) - h(x, y|z) \\
&= h\left(u_x | u_z\right) - h\left(u_{x,y} | u_z\right) \\
&= I(y; x, z) - I(y; z)
\end{align}

where $$u_x$$ (resp. $$u_{x,y}$$, $$u_z$$) is the copula-uniform representation of $$x$$ (resp. $$(x, y)$$, $$z$$).

We use as model for $$u_{x, y, z}$$ the least structured copula that is consistent with maximum entropy constraints in the chosen space.

Parameters:

- x_c ((n, d) np.array) – n i.i.d. draws from the continuous inputs data generating distribution.
- x_d ((n, q) np.array or None (default)) – n i.i.d. draws from the discrete inputs data generating distribution, jointly sampled with x_c, or None if there are no discrete features.
- y ((n,) np.array) – n i.i.d. draws from the (continuous) labels generating distribution, sampled jointly with x.
- z_c ((n, d) np.array) – n i.i.d. draws from the generating distribution of continuous conditions.
- z_d ((n, d) np.array or None (default), optional) – n i.i.d. draws from the generating distribution of categorical conditions.
- space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
- categorical_encoding (str, 'one-hot' | 'two-split' (default)) – The encoding method to use to represent categorical variables. See kxy.api.core.utils.one_hot_encoding and kxy.api.core.utils.two_split_encoding.

Returns: i – The mutual information between x and y, conditional on z, in nats.

Return type: float

Raises: AssertionError – If y is not a one-dimensional array, or if x, y and z are not all numeric.
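A hypothetical usage sketch follows; the synthetic generating process is ours, not from the library docs, and running the least_* estimators may additionally require whatever backend configuration the kxy package expects:

```python
# Sketch: conditional mutual information I(x; y | z) with continuous data.
import numpy as np
from kxy.api.core.mutual_information import (
    least_continuous_conditional_mutual_information)

n = 5000
z_c = np.random.randn(n, 2)                                 # continuous conditions
x_c = z_c @ np.random.randn(2, 3) + np.random.randn(n, 3)   # inputs driven by z
y = x_c[:, 0] + 0.1 * np.random.randn(n)                    # continuous label

# Least-structured estimate in the copula-uniform dual space, in nats.
cmi = least_continuous_conditional_mutual_information(x_c, y, z_c, space='dual')
```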
kxy.api.core.mutual_information.least_continuous_mutual_information(x_c, y, x_d=None, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')

Estimates the mutual information between a $$d$$-dimensional random vector $$x$$ of inputs and a continuous scalar random variable $$y$$ (or label), assuming the least amount of structure in $$x$$ and $$(x, y)$$, other than what is necessary to be consistent with some observed properties.

Note

\begin{align}
I(x; y) &= h(x) + h(y) - h(x, y) \\
&= h\left(u_x\right) - h\left(u_{x,y}\right)
\end{align}

where $$u_x$$ (resp. $$u_{x,y}$$) is the copula-uniform representation of $$x$$ (resp. $$(x, y)$$).

We use as model for $$u_{x, y}$$ the least structured copula that is consistent with the Spearman rank correlation matrix of the joint vector $$(x, y)$$.

Parameters:

- x_c ((n,) or (n, d) np.array) – n i.i.d. draws from the continuous inputs data generating distribution.
- x_d ((n, q) np.array or None (default)) – n i.i.d. draws from the discrete inputs data generating distribution, jointly sampled with x_c, or None if there are no discrete features.
- y ((n,) np.array) – n i.i.d. draws from the (continuous) labels generating distribution, sampled jointly with x.
- space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
- categorical_encoding (str, 'one-hot' | 'two-split' (default)) – The encoding method to use to represent categorical variables. See kxy.api.core.utils.one_hot_encoding and kxy.api.core.utils.two_split_encoding.

Returns: i – The mutual information between x and y, in nats.

Return type: float

Raises: AssertionError – When input parameters are invalid.
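For illustration, a minimal sketch with synthetic draws (the linear relationship below is arbitrary; any backend access the kxy package needs is assumed to be configured):

```python
# Sketch: least-structured mutual information between continuous inputs
# and a continuous label.
import numpy as np
from kxy.api.core.mutual_information import least_continuous_mutual_information

n = 5000
x_c = np.random.randn(n, 4)                 # continuous inputs
y = x_c @ np.ones(4) + np.random.randn(n)   # noisy continuous label

mi = least_continuous_mutual_information(x_c, y, space='dual')  # in nats
```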
kxy.api.core.mutual_information.least_mixed_conditional_mutual_information(x_c, y, z_c, x_d=None, z_d=None, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')

Estimates the conditional mutual information between a $$d$$-dimensional random vector $$x$$ of inputs and a discrete scalar random variable $$y$$ (or label), conditional on a third random variable $$z$$, assuming the least amount of structure in $$(x, y, z)$$, other than what is necessary to be consistent with some observed properties.

$I(y; x_c, x_d | z_c, z_d) = I(y; x_c, x_d, z_c, z_d) - I(y; z_c, z_d)$
Parameters:

- x_c ((n, d) np.array) – n i.i.d. draws from the generating distribution of continuous inputs.
- x_d ((n, d) np.array or None (default), optional) – n i.i.d. draws from the generating distribution of categorical inputs.
- z_c ((n, d) np.array) – n i.i.d. draws from the generating distribution of continuous conditions.
- z_d ((n, d) np.array or None (default), optional) – n i.i.d. draws from the generating distribution of categorical conditions.
- y ((n,) np.array) – n i.i.d. draws from the (categorical) labels generating distribution, sampled jointly with x.
- space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
- categorical_encoding (str, 'one-hot' | 'two-split' (default)) – The encoding method to use to represent categorical variables. See kxy.api.core.utils.one_hot_encoding and kxy.api.core.utils.two_split_encoding.

Returns: i – The mutual information between (x_c, x_d) and y, conditional on (z_c, z_d), in nats.

Return type: float

Raises: AssertionError – If y is not a one-dimensional array, or if x and z are not all numeric.
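A hypothetical sketch of the mixed conditional case (synthetic data of our own making; shapes follow the signature above):

```python
# Sketch: I(y; x_c, x_d | z_c) with a discrete label and mixed inputs.
import numpy as np
from kxy.api.core.mutual_information import (
    least_mixed_conditional_mutual_information)

n = 5000
z_c = np.random.randn(n, 2)                     # continuous conditions
x_c = z_c + np.random.randn(n, 2)               # continuous inputs
x_d = np.random.randint(0, 2, size=(n, 1))      # one categorical input
y = (x_c[:, 0] + x_d[:, 0] > 0).astype(int)     # discrete label

cmi = least_mixed_conditional_mutual_information(x_c, y, z_c, x_d=x_d)
```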
kxy.api.core.mutual_information.least_mixed_mutual_information(x_c, y, x_d=None, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')

Estimates the mutual information between some features (a $$d$$-dimensional continuous random vector $$x_c$$, and possibly a discrete feature variable $$x_d$$) and a discrete output/label random variable $$y$$.

Note

\begin{align}
I(x_c, x_d; y) &= h(y) + h(x_c, x_d) - h(x_c, x_d, y) \\
&= h(x_c, x_d) - h\left(x_c, x_d \vert y\right) \\
&= h(x_c, x_d) - E\left[h\left(x_c, x_d \vert y=\cdot\right)\right] \\
&= h(x_c, x_d) - \sum_{j=1}^q \mathbb{P}(y=j) h\left(x_c, x_d \vert y=j\right)
\end{align}

When there are discrete features, the joint and conditional entropies above are estimated using kxy.api.core.entropy.least_structured_mixed_entropy. When there are no discrete features, $$x_d$$ can simply be removed from the equations above, and kxy.api.core.entropy.least_structured_continuous_entropy is used for estimation instead.

Parameters:

- x_c ((n, d) np.array) – n i.i.d. draws from the continuous inputs data generating distribution.
- x_d ((n, q) np.array or None (default)) – n i.i.d. draws from the discrete inputs data generating distribution, jointly sampled with x_c, or None if there are no discrete features.
- y ((n,) np.array) – n i.i.d. draws from the (discrete) labels generating distribution, sampled jointly with x.
- space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
- categorical_encoding (str, 'one-hot' | 'two-split' (default)) – The encoding method to use to represent categorical variables. See kxy.api.core.utils.one_hot_encoding and kxy.api.core.utils.two_split_encoding.

Returns: i – The mutual information between features and discrete labels, in nats.

Return type: float

Raises: AssertionError – If y is not a one-dimensional array, or x_c is not an array of numbers.
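A minimal sketch, again with an arbitrary synthetic generating process of our own:

```python
# Sketch: mutual information between mixed features and a discrete label.
import numpy as np
from kxy.api.core.mutual_information import least_mixed_mutual_information

n = 5000
x_c = np.random.randn(n, 3)                          # continuous features
x_d = np.random.randint(0, 3, size=(n, 1))           # categorical feature
y = ((x_c[:, 0] > 0).astype(int) + x_d[:, 0]) % 2    # discrete label

mi = least_mixed_mutual_information(x_c, y, x_d=x_d)  # in nats
```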
kxy.api.core.mutual_information.least_mutual_information(x_c, x_d, y_c, y_d, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')
kxy.api.core.mutual_information.least_total_correlation(x, space='dual')

Estimates the smallest total correlation of a continuous $$d$$-dimensional random vector that is consistent with the average pairwise Spearman rank correlation estimated from the input sample.

$TC(x) := \sum_{i=1}^d h(x_i)-h(x) = -h(u).$

This is the negative of kxy.api.core.entropy.least_structured_copula_entropy.

Parameters:

- x ((n, d) np.array) – n i.i.d. draws from the data generating distribution.
- space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.

Returns: tc – The smallest total correlation consistent with observed empirical evidence, in nats.

Return type: float

Raises: AssertionError – If x is not a two-dimensional array.
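A minimal sketch (the common-factor construction below is just one way to produce correlated coordinates):

```python
# Sketch: smallest total correlation of a correlated random vector.
import numpy as np
from kxy.api.core.mutual_information import least_total_correlation

n = 5000
z = np.random.randn(n, 1)                # shared factor
x = z + 0.5 * np.random.randn(n, 3)      # three correlated coordinates

tc = least_total_correlation(x, space='dual')  # in nats
```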

# Mutual Information Rate

kxy.api.core.mutual_information_rate.least_continuous_mutual_information_rate(x, y, space='primal', robust=False, p=None, p_ic='hqic')

Estimates the maximum entropy mutual information rate between two scalar- or vector-valued time series.

Parameters:

- x ((T, d) np.array) – Sample path from the first time series.
- y ((T, q) np.array) – Sample path from the second time series.
- space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson autocovariance constraints, leading to the Gaussian VAR. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank autocorrelation constraints.
- robust (bool) – If True and space='primal', the autocovariance function is estimated by computing the Spearman rank autocorrelation function, inferring the equivalent autocorrelation function assuming Gaussianity, and then scaling back with the sample variances.
- p (int or None) – The number of autocorrelation lags to use for the maximum entropy problem. If set to None (the default) and space='primal', it is inferred by fitting a VAR model on the joint time series using the Hannan-Quinn information criterion.
- p_ic (str) – The criterion used to learn the optimal value of p (by fitting a VAR(p) model) when p=None. Should be one of 'hqic' (Hannan-Quinn Information Criterion), 'aic' (Akaike Information Criterion), 'bic' (Bayes Information Criterion) and 't-stat' (based on last lag). Same as the 'ic' parameter of statsmodels.tsa.api.VAR.

Warning

Maximum entropy optimization in the copula-uniform dual space is not yet supported for time series.

Returns: i – The mutual information rate between x and y, in nats.

Return type: float

Raises: AssertionError – If space is not 'primal'.
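A hypothetical sketch (synthetic series of our own; assumes the kxy package and its dependencies, e.g. statsmodels for the VAR fit, are installed):

```python
# Sketch: maximum entropy MI rate between two correlated time series,
# in the primal space ('dual' is not yet supported; see warning above).
import numpy as np
from kxy.api.core.mutual_information_rate import (
    least_continuous_mutual_information_rate)

T = 2000
noise = np.random.randn(T, 2)
x = noise[:, :1]                              # first series, shape (T, 1)
y = 0.8 * noise[:, :1] + 0.6 * noise[:, 1:]   # second series, correlated with x

rate = least_continuous_mutual_information_rate(x, y, space='primal')  # in nats
```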