Mutual Information

kxy.api.core.mutual_information.discrete_mutual_information(x, y)

Estimates the (Shannon) mutual information between two discrete random variables from i.i.d. samples, using the plug-in estimator of Shannon entropy.

Parameters:
  • x ((n,) or (n, d) np.array) – n i.i.d. draws from a discrete distribution.
  • y ((n,) np.array) – n i.i.d. draws from another discrete distribution, sampled jointly with x.
Returns:

i – The mutual information between x and y, in nats.

Return type:

float

Raises:

AssertionError – If y or x is not a one-dimensional array, or if x and y have different shapes.
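
Example

A minimal usage sketch with hypothetical synthetic data; it assumes numpy and the kxy package are installed:

import numpy as np
from kxy.api.core.mutual_information import discrete_mutual_information

n = 10000
x = np.random.randint(0, 3, size=n)            # n i.i.d. draws from a 3-category distribution
y = (x + np.random.randint(0, 2, size=n)) % 3  # a noisy copy of x, sampled jointly
mi = discrete_mutual_information(x, y)         # plug-in estimate, in nats
print(mi)                                      # positive, since y depends on x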

kxy.api.core.mutual_information.least_continuous_conditional_mutual_information(x_c, y, z_c, x_d=None, z_d=None, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')

Estimates the conditional mutual information between a \(d\)-dimensional random vector \(x\) of inputs and a continuous scalar random variable \(y\) (or label), conditional on a third random variable \(z\), assuming the least amount of structure in \((x, y, z)\), other than what is necessary to be consistent with some observed properties.

Note

\[\begin{aligned}I(x; y|z) &= h(x|z) + h(y|z) - h(x, y|z) \\ &= h\left(u_x | u_z \right) - h\left(u_{x,y} | u_z \right) \\ &= I(y; x, z) - I(y; z)\end{aligned}\]

where \(u_x\) (resp. \(u_{x,y}\), \(u_z\)) is the copula-uniform representation of \(x\) (resp. \((x, y)\), \(z\)).

We use as model for \(u_{x, y, z}\) the least structured copula that is consistent with maximum entropy constraints in the chosen space.

Parameters:
  • x_c ((n, d) np.array) – n i.i.d. draws from the continuous inputs data generating distribution.
  • x_d ((n, q) np.array or None (default)) – n i.i.d. draws from the discrete inputs data generating distribution, jointly sampled with x_c, or None if there are no discrete features.
  • y ((n,) np.array) – n i.i.d. draws from the (continuous) labels generating distribution, sampled jointly with x.
  • z_c ((n, d) np.array) – n i.i.d. draws from the generating distribution of continuous conditions.
  • z_d ((n, d) np.array or None (default), optional) – n i.i.d. draws from the generating distribution of categorical conditions.
  • space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
  • categorical_encoding (str, 'one-hot' | 'two-split' (default)) – The encoding method to use to represent categorical variables. See kxy.api.core.utils.one_hot_encoding and kxy.api.core.utils.two_split_encoding.
Returns:

i – The conditional mutual information between x and y given z, in nats.

Return type:

float

Raises:

AssertionError – If y is not a one-dimensional array, or if x, y and z are not all numeric.
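
Example

A minimal usage sketch with hypothetical synthetic data (assumes numpy and the kxy package are installed):

import numpy as np
from kxy.api.core.mutual_information import least_continuous_conditional_mutual_information

n = 5000
z_c = np.random.randn(n, 2)                 # continuous conditioning variables
x_c = z_c + np.random.randn(n, 2)           # continuous inputs, correlated with z_c
y = x_c[:, 0] + 0.1 * np.random.randn(n)    # continuous label driven by x_c
cmi = least_continuous_conditional_mutual_information(x_c, y, z_c)  # in nats
print(cmi)                                  # large: x_c predicts y beyond what z_c does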

kxy.api.core.mutual_information.least_continuous_mutual_information(x_c, y, x_d=None, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')

Estimates the mutual information between a \(d\)-dimensional random vector \(x\) of inputs and a continuous scalar random variable \(y\) (or label), assuming the least amount of structure in \(x\) and \((x, y)\), other than what is necessary to be consistent with some observed properties.

Note

\[\begin{aligned}I(x; y) &= h(x) + h(y) - h(x, y) \\ &= h\left(u_x\right) - h\left(u_{x,y}\right)\end{aligned}\]

where \(u_x\) (resp. \(u_{x,y}\)) is the copula-uniform representation of \(x\) (resp. \((x, y)\)).

We use as model for \(u_{x, y}\) the least structured copula that is consistent with the Spearman rank correlation matrix of the joint vector \((x, y)\).

Parameters:
  • x_c ((n,) or (n, d) np.array) – n i.i.d. draws from the continuous inputs data generating distribution.
  • x_d ((n, q) np.array or None (default)) – n i.i.d. draws from the discrete inputs data generating distribution, jointly sampled with x_c, or None if there are no discrete features.
  • y ((n,) np.array) – n i.i.d. draws from the (continuous) labels generating distribution, sampled jointly with x.
  • space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
  • categorical_encoding (str, 'one-hot' | 'two-split' (default)) – The encoding method to use to represent categorical variables. See kxy.api.core.utils.one_hot_encoding and kxy.api.core.utils.two_split_encoding.
Returns:

i – The mutual information between x and y, in nats.

Return type:

float

Raises:

AssertionError – When input parameters are invalid.
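
Example

A minimal usage sketch with hypothetical synthetic data (assumes numpy and the kxy package are installed):

import numpy as np
from kxy.api.core.mutual_information import least_continuous_mutual_information

n = 5000
x_c = np.random.randn(n, 3)                                      # continuous inputs
y = x_c @ np.array([1.0, -0.5, 0.2]) + 0.1 * np.random.randn(n)  # continuous label
mi = least_continuous_mutual_information(x_c, y, space='dual')   # in nats
print(mi)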

kxy.api.core.mutual_information.least_mixed_conditional_mutual_information(x_c, y, z_c, x_d=None, z_d=None, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')

Estimates the conditional mutual information between a \(d\)-dimensional random vector \(x\) of inputs and a discrete scalar random variable \(y\) (or label), conditional on a third random variable \(z\), assuming the least amount of structure in \((x, y, z)\), other than what is necessary to be consistent with some observed properties.

\[I(y; x_c, x_d | z_c, z_d) = I(y; x_c, x_d, z_c, z_d) - I(y; z_c, z_d)\]
Parameters:
  • x_c ((n, d) np.array) – n i.i.d. draws from the generating distribution of continuous inputs.
  • x_d ((n, q) np.array or None (default), optional) – n i.i.d. draws from the generating distribution of categorical inputs.
  • z_c ((n, d) np.array) – n i.i.d. draws from the generating distribution of continuous conditions.
  • z_d ((n, d) np.array or None (default), optional) – n i.i.d. draws from the generating distribution of categorical conditions.
  • y ((n,) np.array) – n i.i.d. draws from the (categorical) labels generating distribution, sampled jointly with x.
  • space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
  • categorical_encoding (str, 'one-hot' | 'two-split' (default)) – The encoding method to use to represent categorical variables. See kxy.api.core.utils.one_hot_encoding and kxy.api.core.utils.two_split_encoding.
Returns:

i – The mutual information between (x_c, x_d) and y, conditional on (z_c, z_d), in nats.

Return type:

float

Raises:

AssertionError – If y is not a one-dimensional array, or if x and z are not all numeric.
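
Example

A minimal usage sketch with hypothetical synthetic data and a binary label (assumes numpy and the kxy package are installed):

import numpy as np
from kxy.api.core.mutual_information import least_mixed_conditional_mutual_information

n = 5000
z_c = np.random.randn(n, 2)                    # continuous conditioning variables
x_c = z_c + np.random.randn(n, 2)              # continuous inputs, correlated with z_c
y = (x_c[:, 0] + z_c[:, 1] > 0).astype(int)    # categorical (binary) label
cmi = least_mixed_conditional_mutual_information(x_c, y, z_c)  # in nats
print(cmi)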

kxy.api.core.mutual_information.least_mixed_mutual_information(x_c, y, x_d=None, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')

Estimates the mutual information between some features (a \(d\)-dimensional continuous random vector \(x_c\) and, possibly, a discrete feature variable \(x_d\)) and a discrete output/label random variable \(y\).

Note

\[\begin{aligned}I(x_c, x_d; y) &= h(y) + h(x_c, x_d) - h(x_c, x_d, y) \\ &= h(x_c, x_d) - h\left(x_c, x_d \vert y\right) \\ &= h(x_c, x_d) - E\left[h\left(x_c, x_d \vert y=\cdot\right)\right] \\ &= h(x_c, x_d) - \sum_{j=1}^q \mathbb{P}(y=j) h\left(x_c, x_d \vert y=j \right)\end{aligned}\]

When there are discrete features, the joint entropies above are estimated using kxy.api.core.entropy.least_structured_mixed_entropy.

When there are no discrete features, \(x_d\) can simply be removed from the equations above, and kxy.api.core.entropy.least_structured_continuous_entropy is used for estimation instead of kxy.api.core.entropy.least_structured_mixed_entropy.

Parameters:
  • x_c ((n, d) np.array) – n i.i.d. draws from the continuous inputs data generating distribution.
  • x_d ((n, q) np.array or None (default)) – n i.i.d. draws from the discrete inputs data generating distribution, jointly sampled with x_c, or None if there are no discrete features.
  • y ((n,) np.array) – n i.i.d. draws from the (discrete) labels generating distribution, sampled jointly with x.
  • space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
  • categorical_encoding (str, 'one-hot' | 'two-split' (default)) – The encoding method to use to represent categorical variables. See kxy.api.core.utils.one_hot_encoding and kxy.api.core.utils.two_split_encoding.
Returns:

i – The mutual information between features and discrete labels, in nats.

Return type:

float

Raises:

AssertionError – If y is not a one-dimensional array, or x_c is not an array of numbers.
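
Example

A minimal usage sketch mixing continuous and discrete features (hypothetical data; assumes numpy and the kxy package are installed):

import numpy as np
from kxy.api.core.mutual_information import least_mixed_mutual_information

n = 5000
x_c = np.random.randn(n, 3)                              # continuous features
x_d = np.random.randint(0, 2, size=(n, 1))               # one discrete feature
y = ((x_c[:, 0] > 0) & (x_d[:, 0] == 1)).astype(int)     # discrete label
mi = least_mixed_mutual_information(x_c, y, x_d=x_d)     # in nats
print(mi)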

kxy.api.core.mutual_information.least_mutual_information(x_c, x_d, y_c, y_d, space='dual', non_monotonic_extension=True, categorical_encoding='two-split')
kxy.api.core.mutual_information.least_total_correlation(x, space='dual')

Estimates the smallest total correlation of a continuous \(d\)-dimensional random vector that is consistent with the average pairwise Spearman rank correlation estimated from the input sample.

\[TC(x) := \sum_{i=1}^d h(x_i)-h(x) = -h(u).\]

This is the negative of kxy.api.core.entropy.least_structured_copula_entropy.

Parameters:
  • x ((n, d) np.array) – n i.i.d. draws from the data generating distribution.
  • space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson covariance constraints, leading to the Gaussian copula. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank correlation constraints.
Returns:

tc – The smallest total correlation consistent with observed empirical evidence, in nats.

Return type:

float

Raises:

AssertionError – If x is not a two-dimensional array.
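
Example

A minimal usage sketch with hypothetical data sharing a common factor (assumes numpy and the kxy package are installed):

import numpy as np
from kxy.api.core.mutual_information import least_total_correlation

n = 5000
z = np.random.randn(n, 1)               # common factor
x = z + 0.5 * np.random.randn(n, 3)     # 3 coordinates, all driven by z
tc = least_total_correlation(x)         # smallest consistent total correlation, in nats
print(tc)                               # 0 would mean the coordinates could be independent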

Mutual Information Rate

kxy.api.core.mutual_information_rate.least_continuous_mutual_information_rate(x, y, space='primal', robust=False, p=None, p_ic='hqic')

Estimates the maximum entropy mutual information rate between two scalar- or vector-valued time series.

Parameters:
  • x ((T, d) np.array) – Sample path from the first time series.
  • y ((T, q) np.array) – Sample path from the second time series.
  • space (str, 'primal' | 'dual') – The space in which the maximum entropy problem is solved. When space='primal', the maximum entropy problem is solved in the original observation space, under Pearson autocovariance constraints, leading to the Gaussian VAR. When space='dual', the maximum entropy problem is solved in the copula-uniform dual space, under Spearman rank autocorrelation constraints.
  • robust (bool) – If True and space='primal', then the autocovariance function is estimated by computing the Spearman rank autocorrelation function, inferring the equivalent autocorrelation function assuming Gaussianity, and then scaling back with the sample variances.
  • p (int or None) – The number of autocorrelation lags to use for the maximum entropy problem. If set to None (the default) and space='primal', then it is inferred by fitting a VAR model on the joint time series using the Hannan-Quinn information criterion.
  • p_ic (str) – The criterion used to learn the optimal value of p (by fitting a VAR(p) model) when p=None. Should be one of 'hqic' (Hannan-Quinn Information Criterion), 'aic' (Akaike Information Criterion), 'bic' (Bayes Information Criterion) and 't-stat' (based on the last lag). Same as the 'ic' parameter of statsmodels.tsa.api.VAR.

Warning

Maximum entropy optimization in the copula-uniform dual space is not yet supported for time series.

Returns:

i – The mutual information rate between x and y, in nats.

Return type:

float

Raises:

AssertionError – If space is not 'primal'.
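
Example

A minimal usage sketch with two hypothetical AR(1)-style sample paths (assumes numpy and the kxy package are installed; only space='primal' is currently supported):

import numpy as np
from kxy.api.core.mutual_information_rate import least_continuous_mutual_information_rate

T = 2000
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.8 * x[t - 1] + np.random.randn()                         # AR(1) driver
    y[t] = 0.5 * y[t - 1] + 0.5 * x[t - 1] + 0.1 * np.random.randn()  # driven by x
mi_rate = least_continuous_mutual_information_rate(x.reshape(-1, 1), y.reshape(-1, 1))
print(mi_rate)                                                        # in nats per period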