Risk Analysis

kxy.asset_management.risk_analysis.information_adjusted_correlation(x, y=None, p=0)
Calculates the information-adjusted correlation matrix between two arrays.
Note
Pearson’s correlation coefficient only quantifies the linear association between two random variables. In particular, two random variables can be statistically dependent despite being uncorrelated. Such dependence typically materializes during tail events, the worst possible timing from a risk management perspective.
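As a quick illustration of dependence without correlation, here is a minimal sketch using NumPy only. The choice of y = x^2 and all variable names are assumptions made for illustration and are not part of kxy.

```python
import numpy as np

# Illustrative assumption: x is standard normal and y is a deterministic
# function of x, so the two are fully dependent by construction.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x ** 2

# Pearson correlation between x and y is close to zero because the
# relationship is symmetric around 0 and therefore not linear.
pearson_xy = np.corrcoef(x, y)[0, 1]

# The dependence becomes visible once the nonlinearity is accounted for,
# for example by correlating |x| with y.
pearson_absx_y = np.corrcoef(np.abs(x), y)[0, 1]

print(f"Corr(x, y)   = {pearson_xy:.3f}")
print(f"Corr(|x|, y) = {pearson_absx_y:.3f}")
```

The first value is near zero even though y is entirely determined by x, which is the situation the note warns about.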
The mutual information rate between two scalar time series provides an alternative that fully captures linear and nonlinear, cross-sectional and temporal dependence:
\[I\left(\left\{x_t\right\}, \left\{y_t\right\}\right) = h\left( \left\{x_t\right\} \right) + h\left( \left\{y_t\right\} \right) - h\left(\left\{x_t, y_t\right\} \right)\]where \(h\left( \left\{ x_t \right\} \right)\) is the entropy rate of the process \(\left\{ x_t \right\}\).
Specifically, the mutual information rate is 0 if and only if the two processes are statistically independent, and in particular exhibit no cross-sectional or temporal dependence, linear or nonlinear.
When \(\left\{ x_t, y_t \right\}\) is Gaussian, stationary and memoryless, for instance when \(\left(x_i, y_i \right)\) are assumed i.i.d. Gaussian, the mutual information rate reads
\[I\left(\left\{x_t\right\}, \left\{y_t\right\}\right) = -\frac{1}{2} \log \left(1 - \text{Corr}\left(x_t, y_t\right)^2 \right).\]We generalize this formula and define the information-adjusted correlation as the quantity \(\text{IACorr}\left( \left\{x_t\right\}, \left\{y_t\right\} \right)\) so that the mutual information rate always reads
\[I\left(\left\{x_t\right\}, \left\{y_t\right\}\right) = -\frac{1}{2} \log \left(1 - \text{IACorr}\left( \left\{x_t\right\}, \left\{y_t\right\} \right)^2 \right),\]whether or not the time series are jointly Gaussian and memoryless. Solving for the information-adjusted correlation gives
\[\text{IACorr}\left(\left\{x_t\right\}, \left\{y_t\right\}\right) := \text{sign}\left( \text{Corr}\left(x_., y_.\right) \right)\sqrt{1 - e^{-2 I\left(\left\{x_t\right\}, \left\{y_t\right\}\right)}}\]where \(\text{sign}(x)=1\) if and only if \(x \geq 0\) and \(-1\) otherwise. Note that the information-adjusted correlation is 0 if and only if the two time series are statistically independent, and in particular exhibit no cross-sectional or temporal dependence.
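The mapping between a mutual information rate and the information-adjusted correlation can be sketched in a few lines. This is only an illustration of the formulas in the note, not kxy's implementation: the helper names `gaussian_mutual_information` and `iacorr_from_mutual_information` are hypothetical, and the mutual information value is assumed to be given (in nats) rather than estimated from data.

```python
import numpy as np

def gaussian_mutual_information(corr):
    """Mutual information (in nats) of an i.i.d. Gaussian pair with Pearson correlation `corr`."""
    return -0.5 * np.log(1.0 - corr ** 2)

def iacorr_from_mutual_information(mi, corr_sign=1.0):
    """Hypothetical helper: invert I = -0.5 * log(1 - IACorr^2).

    `corr_sign` stands for the Pearson correlation whose sign is reused;
    following the note's convention, sign(x) = 1 if x >= 0 and -1 otherwise.
    """
    sign = 1.0 if corr_sign >= 0 else -1.0
    return sign * np.sqrt(1.0 - np.exp(-2.0 * mi))

# In the Gaussian i.i.d. case the round trip recovers the Pearson correlation.
rho = -0.6
mi = gaussian_mutual_information(rho)
recovered = iacorr_from_mutual_information(mi, corr_sign=rho)
print(mi, recovered)
```

A mutual information of 0 maps to an information-adjusted correlation of 0, and the magnitude approaches 1 as the mutual information grows.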
Parameters:  x ((n,) or (n, d) np.array) – n i.i.d. draws from a scalar or vector random variable.
 y ((n,) or (n, q) np.array) – n i.i.d. draws from a scalar or vector random variable jointly sampled with x.
 p (int) – The number of lags to use when generating the Spearman rank autocorrelations used as empirical evidence in the maximum-entropy problem. The default value is 0, which corresponds to assuming rows are i.i.d. This is also the only supported value for now.
Returns: c – The information-adjusted correlation matrix between the two random variables.
Return type: np.array
Raises: AssertionError – If p is different from 0. Higher values will be supported later.

kxy.asset_management.risk_analysis.robust_pearson_corr(x, y=None, p=0, p_ic='hqic')
Computes a robust estimator of the Pearson correlation matrix between \(x\) and \(y\) (or \(x\) if \(y\) is None) as the Pearson correlation matrix that is equivalent to the sample Spearman correlation matrix, assuming \((x, y)\) is jointly Gaussian.
Parameters:  x ((n,) or (n, d) np.array) – n i.i.d. draws from a scalar or vector random variable.
 y ((n,) or (n, q) np.array) – n i.i.d. draws from a scalar or vector random variable jointly sampled with x.
 p (int) – The number of lags to use when generating Spearman rank autocorrelation. The default value is 0, which corresponds to assuming rows are i.i.d.
 p_ic (str) – The criterion used to learn the optimal value of p (by fitting a VAR(p) model) when p=None. Should be one of ‘hqic’ (Hannan-Quinn Information Criterion), ‘aic’ (Akaike Information Criterion), ‘bic’ (Bayes Information Criterion) and ‘tstat’ (based on last lag). Same as the ‘ic’ parameter of statsmodels.tsa.api.VAR.
Returns: c – The robust Pearson correlation matrix between the two random variables.
Return type: np.array
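The idea behind a Spearman-based Pearson estimator can be sketched with NumPy alone for the i.i.d. case (p=0). For jointly Gaussian variables, the standard relation between Pearson's \(\rho\) and Spearman's \(\rho_s\) is \(\rho = 2 \sin\left(\pi \rho_s / 6\right)\). The sketch below assumes this relation and continuous data without ties. The helper names are hypothetical, and this is not necessarily how robust_pearson_corr is implemented.

```python
import numpy as np

def spearman_matrix(z):
    """Sample Spearman correlation matrix of the columns of z (assumes no ties)."""
    # Double argsort turns each column into its 0-based ranks.
    ranks = np.argsort(np.argsort(z, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def gaussian_pearson_from_spearman(rs):
    """Pearson correlation implied by a Spearman correlation under joint Gaussianity."""
    return 2.0 * np.sin(np.pi * np.asarray(rs) / 6.0)

# Illustrative data: correlated Gaussian draws contaminated by one gross outlier.
rng = np.random.default_rng(0)
true_rho = 0.7
cov = np.array([[1.0, true_rho], [true_rho, 1.0]])
z = rng.multivariate_normal(np.zeros(2), cov, size=20_000)
z[0] = [50.0, -50.0]

naive = np.corrcoef(z, rowvar=False)[0, 1]
robust = gaussian_pearson_from_spearman(spearman_matrix(z))[0, 1]
print(f"sample Pearson: {naive:.3f}, Spearman-based: {robust:.3f}")
```

Because ranks bound the influence of any single observation, the Spearman-based estimate stays close to the true correlation while the sample Pearson estimate is pulled away by the outlier.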