gensbi.diagnostics.marginal_coverage
Functions
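| _compute_marginal_coverage_KDE | Computes the marginal coverage (credibility) for each observation and dimension using Kernel Density Estimation. |
| _compute_marginal_coverage_histogram | Computes marginal coverage using empirical histograms. |
| compute_marginal_coverage | Compute marginal coverage for each observation and dimension. |
| estimate_hat_z | Compute empirical coverage \(\hat{z}\) vs. nominal coverage \(z\) for one dimension. |
| plot_marginal_coverage | Plots the marginal coverage for all dimensions. |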
Module Contents
- gensbi.diagnostics.marginal_coverage._compute_marginal_coverage_KDE(theta, posterior_samples, grid_points=2048, bw='ISJ')[source]#
Computes the marginal coverage (credibility) for each observation and dimension using Kernel Density Estimation.
For each observation b and dimension d, estimates the 1D marginal posterior density p(x) via KDE, then computes the HPD (Highest Posterior Density) credibility:
\[\alpha = \int_{p(x) > p(\theta^*)} p(x) \, dx\]
This gives the probability mass in regions denser than the true value \(\theta^*\). If the posterior is well calibrated, \(\alpha\) should be distributed as Uniform(0, 1).
- Parameters:
theta (array_like of shape (N_batch, D_dim)) – The ground truth parameters.
posterior_samples (array_like of shape (N_samples, N_batch, D_dim)) – Samples from the posterior.
grid_points (int, optional) – Number of grid points for KDE evaluation. Default is 2048.
bw (str or float, optional) – Bandwidth method or value for KDE. Default is “ISJ” (Improved Sheather-Jones, a plug-in bandwidth selector).
- Returns:
alpha – The credibility values (0 to 1). If the posterior is well calibrated, these should be uniformly distributed.
- Return type:
np.ndarray of shape (D_dim, N_batch)
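The HPD credibility integral above can be illustrated for a single (observation, dimension) pair with plain NumPy. This is a minimal sketch, not the library's implementation: the helper name `hpd_credibility_kde_1d` is hypothetical, and it swaps in a Gaussian kernel with the normal-reference bandwidth rule instead of the ISJ selector mentioned above.

```python
import numpy as np

def hpd_credibility_kde_1d(samples, theta_true, grid_points=512):
    """HPD credibility of theta_true under a 1D Gaussian KDE (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    # Assumption: normal-reference rule of thumb, NOT the ISJ selector.
    bw = 1.06 * samples.std(ddof=1) * n ** (-1.0 / 5.0)

    def kde(x):
        # Average of Gaussian kernels centred on each sample.
        u = (np.atleast_1d(x)[:, None] - samples[None, :]) / bw
        return np.exp(-0.5 * u**2).sum(axis=1) / (n * bw * np.sqrt(2.0 * np.pi))

    # Grid wide enough to cover both the samples and the true value.
    lo = min(samples.min(), theta_true) - 3.0 * bw
    hi = max(samples.max(), theta_true) + 3.0 * bw
    grid = np.linspace(lo, hi, grid_points)

    density = kde(grid)
    p_true = kde(theta_true)[0]
    mass = density * (grid[1] - grid[0])  # Riemann-sum probability per cell
    # alpha = mass of the region where p(x) > p(theta*), normalised on the grid.
    return float(mass[density > p_true].sum() / mass.sum())
```

For samples from a standard normal, a true value near the mode gives a small alpha, a value one standard deviation out gives roughly 0.68, and a value far in the tail gives alpha close to 1.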
- gensbi.diagnostics.marginal_coverage._compute_marginal_coverage_histogram(theta, posterior_samples, bins='stone')[source]#
Computes marginal coverage using empirical histograms.
This is a discrete approximation of the HPD credibility: for each (observation, dimension) pair, it bins the posterior samples and sums the mass of all bins at least as dense as the one containing the true parameter. Faster than KDE and works well for multimodal distributions, but accuracy depends on bin count.
- Parameters:
theta (array_like of shape (N_batch, D_dim)) – The ground truth parameters.
posterior_samples (array_like of shape (N_samples, N_batch, D_dim)) – Samples from the posterior.
bins (int or str, optional) – Number of bins or binning strategy. See numpy.histogram_bin_edges for valid string values (‘stone’, ‘scott’, ‘fd’, ‘sturges’, etc.). Default is ‘stone’.
- Returns:
alpha_values – The credibility values (0 to 1).
- Return type:
np.ndarray of shape (D_dim, N_batch)
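The binning procedure described above can be sketched with `numpy.histogram`. This is an illustrative reimplementation under stated assumptions, not the library source: the function names are hypothetical, and the choice to return 1.0 when the true value falls outside the histogram range is an assumption about edge handling.

```python
import numpy as np

def hpd_credibility_histogram_1d(samples, theta_true, bins="stone"):
    """Discrete HPD credibility for one (observation, dimension) pair (hypothetical helper)."""
    counts, edges = np.histogram(samples, bins=bins)
    density = counts / np.diff(edges)  # per-bin density, valid for unequal widths too
    if theta_true < edges[0] or theta_true > edges[-1]:
        # Assumption: a true value outside the sampled range has zero density,
        # so every bin is at least as dense and the credibility is 1.
        return 1.0
    idx = np.searchsorted(edges, theta_true, side="right") - 1
    idx = min(idx, len(counts) - 1)  # the last bin is closed on the right
    # Sum the mass of all bins at least as dense as the bin holding the truth.
    return float(counts[density >= density[idx]].sum() / counts.sum())

def marginal_coverage_histogram(theta, posterior_samples, bins="stone"):
    """Loop over the batch and dimensions; returns an array of shape (D_dim, N_batch)."""
    theta = np.asarray(theta, dtype=float)
    posterior_samples = np.asarray(posterior_samples, dtype=float)
    _, n_batch, d_dim = posterior_samples.shape
    alpha = np.empty((d_dim, n_batch))
    for b in range(n_batch):
        for d in range(d_dim):
            alpha[d, b] = hpd_credibility_histogram_1d(
                posterior_samples[:, b, d], theta[b, d], bins=bins
            )
    return alpha
```

Because the credibility is a sum over whole bins, its resolution is limited by the bin count: a true value in the densest bin still receives at least that bin's mass.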
- gensbi.diagnostics.marginal_coverage.compute_marginal_coverage(theta, posterior_samples, method='histogram', **kwargs)[source]#
Compute marginal coverage for each observation and dimension.
- Parameters:
theta (array_like of shape (N_batch, D_dim)) – The ground truth parameters.
posterior_samples (array_like of shape (N_samples, N_batch, D_dim)) – Samples from the posterior.
method (str, optional) – Method to use for computing marginal coverage. Options are:
  - “histogram”: Use empirical histograms (default).
  - “KDE”: Use Kernel Density Estimation.
**kwargs (dict, optional) – Additional keyword arguments passed to the chosen method (for example, bins for “histogram”, or grid_points and bw for “KDE”).
- Returns:
alpha_values – The credibility values (0 to 1).
- Return type:
np.ndarray of shape (D_dim, N_batch)
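To make the expected input shapes concrete, the following sketch builds `theta` and `posterior_samples` from a toy conjugate Gaussian model whose exact posterior is known, so the resulting credibility values should be close to uniform. The helper `make_gaussian_toy` and the noise scale `sigma` are hypothetical and exist only for illustration. The library call is left as a comment so that the block runs without gensbi installed.

```python
import numpy as np

def make_gaussian_toy(n_samples=500, n_batch=200, d_dim=2, sigma=0.5, seed=0):
    """Toy inputs: prior theta ~ N(0, 1), observation x ~ N(theta, sigma^2) per dimension."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_batch, d_dim))                   # (N_batch, D_dim)
    x_obs = theta + sigma * rng.normal(size=(n_batch, d_dim))

    # Exact conjugate posterior: N(x / (1 + sigma^2), sigma^2 / (1 + sigma^2)).
    post_mean = x_obs / (1.0 + sigma**2)
    post_std = np.sqrt(sigma**2 / (1.0 + sigma**2))
    noise = rng.normal(size=(n_samples, n_batch, d_dim))
    posterior_samples = post_mean[None, :, :] + post_std * noise  # (N_samples, N_batch, D_dim)
    return theta, posterior_samples

theta, posterior_samples = make_gaussian_toy()

# With gensbi installed, the documented entry point would be called as:
# from gensbi.diagnostics.marginal_coverage import compute_marginal_coverage
# alpha_values = compute_marginal_coverage(theta, posterior_samples, method="histogram")
# alpha_values = compute_marginal_coverage(theta, posterior_samples, method="KDE", bw="ISJ")
```

Note that the sample axis comes first in `posterior_samples`, while the returned array is documented as dimension-major, with shape (D_dim, N_batch).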
- gensbi.diagnostics.marginal_coverage.estimate_hat_z(alpha_values, nbins=50, zmax=4, z_band=1)[source]#
Compute empirical coverage \(\hat{z}\) vs. nominal coverage \(z\) for one dimension.
Given a set of HPD credibility values (alpha), this function builds the empirical CDF of those values and converts it to z-score space. For a well-calibrated posterior, \(\hat{z}\) should equal \(z\), i.e. the curve should lie along the diagonal.
- Parameters:
alpha_values (array_like) – 1D array of credibility values for a single dimension.
nbins (int, optional) – Number of bins for z (default is 50).
zmax (float, optional) – Maximum z value to evaluate (default is 4).
z_band (float, optional) – Z-score that sets the width of the Jeffreys-interval uncertainty band (default is 1, ~68% CI).
- Returns:
stats – Dictionary containing:
  - ‘z’: Nominal coverage z-values.
  - ‘mean’: Empirical coverage \(\hat{z}\).
  - ‘upper’: Upper bound of the uncertainty band.
  - ‘lower’: Lower bound of the uncertainty band.
- Return type:
dict
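The conversion from credibility values to z-score space can be sketched as follows. This is an assumption-laden illustration rather than the library's code: it assumes the two-sided Gaussian convention in which a nominal level \(z\) corresponds to credibility \(\mathrm{erf}(z/\sqrt{2})\), the function names are hypothetical, and the Jeffreys uncertainty band is omitted.

```python
import math
from statistics import NormalDist

import numpy as np

def z_to_credibility(z):
    # Mass of a standard normal within +/- z (assumed two-sided convention).
    return math.erf(z / math.sqrt(2.0))

def credibility_to_z(c):
    # Inverse of z_to_credibility; clip so the inverse CDF stays finite.
    c = min(max(c, 0.0), 1.0 - 1e-12)
    return NormalDist().inv_cdf(0.5 * (1.0 + c))

def empirical_z_curve(alpha_values, nbins=50, zmax=4.0):
    """Return (z, z_hat) for a 1D array of credibility values (hypothetical helper)."""
    alpha_values = np.asarray(alpha_values, dtype=float)
    z = np.linspace(0.0, zmax, nbins)
    nominal = np.array([z_to_credibility(zi) for zi in z])
    # Empirical CDF of alpha at each nominal level: the fraction of
    # observations whose true value lies inside that HPD region.
    empirical = np.array([(alpha_values <= c).mean() for c in nominal])
    z_hat = np.array([credibility_to_z(c) for c in empirical])
    return z, z_hat
```

With uniformly distributed alpha values the curve follows the diagonal up to sampling noise. A curve below the diagonal means the truth falls inside the HPD regions less often than nominal, which indicates an overconfident posterior. At large \(z\) the estimate saturates because only a handful of observations populate the tail.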
- gensbi.diagnostics.marginal_coverage.plot_marginal_coverage(alpha_values, zmax=3.5, n_cols=3, figsize=None)[source]#
Plots the marginal coverage for all dimensions.
- Parameters:
alpha_values (np.ndarray of shape (D_dim, N_batch)) – Credibility values.
zmax (float, optional) – Maximum z-score for plotting limits (default is 3.5).
n_cols (int, optional) – Number of columns in the subplot grid (default is 3).
figsize (tuple, optional) – Figure size (width, height). If None, calculated automatically.
- Returns:
fig – The figure object.
- Return type:
matplotlib.figure.Figure