geomstats.information_geometry package#

Submodules#

geomstats.information_geometry.base module#

Mixin for manifolds of probability distributions.

class geomstats.information_geometry.base.InformationManifoldMixin(support_shape, **kwargs)[source]#

Bases: object

Mixin for manifolds of probability distributions.

point_to_cdf(point)[source]#

Compute cdf associated to point.

Compute the cumulative density function of the probability distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, dim]) – Point representing a probability distribution.

Returns:

cdf (function) – Cumulative density function of the probability distribution with parameters provided by point.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Compute the probability density function of the probability distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, dim]) – Point representing a probability distribution.

Returns:

pdf (function) – Probability density function of the probability distribution with parameters provided by point.

sample(point, n_samples=1)[source]#

Sample from the probability distribution.

Sample from the probability distribution with parameters provided by point. This gives n_samples points.

Parameters:
  • point (array-like, shape=[…, dim]) – Point representing a probability distribution.

  • n_samples (int) – Number of points to sample for each set of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples]) – Sample from the probability distributions.

class geomstats.information_geometry.base.ScipyMultivariateRandomVariable(space, scp_rvs, scp_pdf=None)[source]#

Bases: ScipyRandomVariable

A multivariate random variable.

pdf(x, point)[source]#

Evaluate the probability density function at x.

Parameters:
  • x (array-like, shape=[n_samples, *support_shape]) – Sample points at which to compute the probability density function.

  • point (array-like, shape=[…, *space.shape]) – Point representing a distribution.

Returns:

pdf_at_x (array-like, shape=[…, n_samples, *support_shape]) – Values of pdf at x for each value of the parameters provided by point.

rvs(point, n_samples=1)[source]#

Sample from a multivariate distribution.

Parameters:
  • point (array-like, shape=[…, *space.shape]) – Point representing a distribution.

  • n_samples (int) – Number of points to sample with each pair of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples, *support_shape]) – Sample from distribution.

class geomstats.information_geometry.base.ScipyRandomVariable(space, scp_rvs, scp_pdf=None)[source]#

Bases: object

A random variable.

class geomstats.information_geometry.base.ScipyUnivariateRandomVariable(space, scp_rvs, scp_pdf=None)[source]#

Bases: ScipyRandomVariable

A univariate random variable.

pdf(sample, point)[source]#

Evaluate the probability density function at x.

Parameters:
  • x (array-like, shape=[n_samples, *support_shape]) – Sample points at which to compute the probability density function.

  • point (array-like, shape=[…, *space.shape]) – Point representing a distribution.

Returns:

pdf_at_x (array-like, shape=[…, n_samples]) – Values of pdf at x for each value of the parameters provided by point.

rvs(point, n_samples=1)[source]#

Sample from a univariate distribution.

Parameters:
  • point (array-like, shape=[…, *space.shape]) – Point representing a univariate distribution.

  • n_samples (int) – Number of points to sample with each pair of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples, *support_shape]) – Sample from distribution.

geomstats.information_geometry.beta module#

Statistical Manifold of beta distributions with the Fisher metric.

Lead author: Alice Le Brigant.

class geomstats.information_geometry.beta.BetaDistributions(equip=True)[source]#

Bases: DirichletDistributions

Class for the manifold of beta distributions.

This is Beta = \(R_+^* \times R_+^*\), the upper-right quadrant of the Euclidean plane.

dim#

Dimension of the manifold of beta distributions, equal to 2.

Type:

int

embedding_space#

Embedding manifold.

Type:

Manifold

static default_metric()[source]#

Metric to equip the space with if equip is True.

static maximum_likelihood_fit(data, loc=0, scale=1, epsilon=1e-06)[source]#

Estimate parameters from samples.

This a wrapper around scipy’s maximum likelihood estimator to estimate the parameters of a beta distribution from samples.

Parameters:
  • data (array-like, shape=[…, n_samples]) – Data to estimate parameters from. Arrays of different length may be passed.

  • loc (float) – Location parameter of the distribution to estimate parameters from. It is kept fixed during optimization. Optional, default: 0.

  • scale (float) – Scale parameter of the distribution to estimate parameters from. It is kept fixed during optimization. Optional, default: 1.

Returns:

parameter (array-like, shape=[…, 2]) – Estimate of parameter obtained by maximum likelihood.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Compute the probability density function of the beta distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, 2]) – Point representing a beta distribution.

Returns:

pdf (function) – Probability density function of the beta distribution with parameters provided by point.

sample(point, n_samples=1)[source]#

Sample from the beta distribution.

Sample from the beta distribution with parameters provided by point. This gives samples in the segment [0, 1].

Parameters:
  • point (array-like, shape=[…, 2]) – Point representing a beta distribution.

  • n_samples (int) – Number of points to sample with each pair of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples]) – Sample from beta distributions.)

class geomstats.information_geometry.beta.BetaDistributionsRandomVariable(space)[source]#

Bases: ScipyUnivariateRandomVariable

A beta random variable.

class geomstats.information_geometry.beta.BetaMetric(space)[source]#

Bases: DirichletMetric

Class for the Fisher information metric on beta distributions.

static metric_det(point)[source]#

Compute the determinant of the metric.

Parameters:

point (array-like, shape=[…, 2]) – Point representing a beta distribution.

Returns:

metric_det (array-like, shape=[…,]) – Determinant of the metric.

geomstats.information_geometry.binomial module#

Statistical Manifold of Binomial distributions with the Fisher metric.

Lead authors: Jules Deschamps, Tra My Nguyen.

class geomstats.information_geometry.binomial.BinomialDistributions(n_draws, equip=True)[source]#

Bases: InformationManifoldMixin, VectorSpaceOpenSet

Class for the manifold of binomial distributions.

This is the parameter space of binomial distributions i.e. the half-line of positive reals.

belongs(point, atol=1e-12)[source]#

Evaluate if a point belongs to the manifold of binomial distributions.

Parameters:
  • point (array-like, shape=[…, 1]) – Point to be checked.

  • atol (float) – Tolerance to evaluate if point belongs to (0,1). Optional, default: gs.atol

Returns:

belongs (array-like, shape=[…,]) – Boolean indicating whether point represents a binomial distribution.

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pmf associated to point.

Compute the probability mass function of the binomial distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, 1]) – Point representing a binomial distribution (probability of success).

Returns:

pmf (function) – Probability mass function of the binomial distribution with parameters provided by point.

projection(point, atol=1e-12)[source]#

Project a point in ambient space to the parameter set.

The parameter is floored to gs.atol if it is negative and to ‘1-gs.atol’ if it is greater than 1.

Parameters:
  • point (array-like, shape=[…, 1]) – Point in ambient space.

  • atol (float) – Tolerance to evaluate positivity.

Returns:

projected (array-like, shape=[…, 1]) – Projected point.

random_point(n_samples=1)[source]#

Sample parameters of binomial distributions.

The uniform distribution on [0, 1] is used.

Parameters:

n_samples (int) – Number of samples. Optional, default: 1.

Returns:

samples (array-like, shape=[…, dim]) – Sample of points representing binomial distributions.

sample(point, n_samples=1)[source]#

Sample from the binomial distribution.

Sample from the binomial distribution with parameter provided by point.

Parameters:
  • point (array-like, shape=[…, dim]) – Point representing a binomial distribution.

  • n_samples (int) – Number of points to sample with for each parameter in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples, dim]) – Sample from binomial distributions.

class geomstats.information_geometry.binomial.BinomialDistributionsRandomVariable(space)[source]#

Bases: ScipyUnivariateRandomVariable

A binomial random variable.

pdf(x, point)[source]#

Evaluate the probability density function at x.

Parameters:
  • x (array-like, shape=[n_samples, *support_shape]) – Sample points at which to compute the probability density function.

  • point (array-like, shape=[…, *space.shape]) – Point representing a distribution.

Returns:

pdf_at_x (array-like, shape=[…, n_samples, *support_shape]) – Values of pdf at x for each value of the parameters provided by point.

class geomstats.information_geometry.binomial.BinomialMetric(space, signature=None)[source]#

Bases: RiemannianMetric

Class for the Fisher information metric on binomial distributions.

References

[AM1981]

Atkinson, C., & Mitchell, A. F. (1981). Rao’s distance measure. Sankhyā: The Indian Journal of Statistics, Series A, 345-365.

exp(tangent_vec, base_point)[source]#

Compute exp map of a base point in tangent vector direction.

Parameters:
  • tangent_vec (array-like, shape=[…, 1]) – Tangent vector at base point.

  • base_point (array-like, shape=[…, 1]) – Base point.

Returns:

exp (array-like, shape=[…, 1]) – End point of the geodesic starting at base_point with initial velocity tangent_vec.

log(point, base_point)[source]#

Compute log map using a base point and an end point.

Parameters:
  • point (array-like, shape=[…, 1]) – End point.

  • base_point (array-like, shape=[…, 1]) – Base point.

Returns:

tangent_vec (array-like, shape=[…, 1]) – Initial velocity of the geodesic starting at base_point and reaching point at time 1.

metric_matrix(base_point)[source]#

Compute the metric matrix at the tangent space at base_point.

Parameters:

base_point (array-like, shape=[…, 1]) – Point representing a binomial distribution.

Returns:

mat (array-like, shape=[…, 1, 1]) – Metric matrix.

squared_dist(point_a, point_b)[source]#

Compute squared distance associated with the binomial Fisher Rao metric.

Parameters:
  • point_a (array-like, shape=[…, 1]) – Point representing a binomial distribution (probability of success).

  • point_b (array-like, shape=[…, 1]) – Point representing a binomial distribution (probability of success).

Returns:

squared_dist (array-like, shape=[…,]) – Squared distance between points point_a and point_b.

geomstats.information_geometry.categorical module#

Statistical Manifold of categorical distributions with the Fisher metric.

Lead author: Alice Le Brigant.

class geomstats.information_geometry.categorical.CategoricalDistributions(dim, equip=True)[source]#

Bases: MultinomialDistributions

Class for the manifold of categorical distributions.

This is the set of n+1-tuples of positive reals that sum up to one, i.e. the n-simplex. Each point is the parameter of a categorical distribution, i.e. gives the probabilities of $n$ different outcomes in a single experiment.

dim#

Dimension of the manifold of categorical distributions. The number of outcomes is dim + 1.

Type:

int

embedding_manifold#

Embedding manifold.

Type:

Manifold

static default_metric()[source]#

Metric to equip the space with if equip is True.

class geomstats.information_geometry.categorical.CategoricalMetric(space)[source]#

Bases: MultinomialMetric

Class for the Fisher information metric on categorical distributions.

The Fisher information metric on the $n$-simplex of categorical distributions parameters can be obtained as the pullback metric of the $n$-sphere using the componentwise square root.

References

[K2003]

R. E. Kass. The Geometry of Asymptotic Inference. Statistical Science, 4(3): 188 - 234, 1989.

geomstats.information_geometry.dirichlet module#

Statistical Manifold of Dirichlet distributions with the Fisher metric.

Lead author: Alice Le Brigant.

class geomstats.information_geometry.dirichlet.DirichletDistributions(dim, equip=True)[source]#

Bases: InformationManifoldMixin, VectorSpaceOpenSet

Class for the manifold of Dirichlet distributions.

This is Dirichlet = \((R_+^*)^dim\), the positive quadrant of the dim-dimensional Euclidean space.

dim#

Dimension of the manifold of Dirichlet distributions.

Type:

int

belongs(point, atol=1e-12)[source]#

Evaluate if a point belongs to the manifold of Dirichlet distributions.

Check that point defines parameters for a Dirichlet distributions, i.e. belongs to the positive quadrant of the Euclidean space.

Parameters:
  • point (array-like, shape=[…, dim]) – Point to be checked.

  • atol (float) – Tolerance to evaluate positivity. Optional, default: gs.atol

Returns:

belongs (array-like, shape=[…,]) – Boolean indicating whether point represents a Dirichlet distribution.

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Compute the probability density function of the Dirichlet distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, dim]) – Point representing a beta distribution.

Returns:

pdf (function) – Probability density function of the Dirichlet distribution with parameters provided by point.

projection(point, atol=1e-12)[source]#

Project a point in ambient space to the open set.

The last coordinate is floored to gs.atol if it is negative.

Parameters:
  • point (array-like, shape=[…, dim]) – Point in ambient space.

  • atol (float) – Tolerance to evaluate positivity.

Returns:

projected (array-like, shape=[…, dim]) – Projected point.

random_point(n_samples=1, bound=5.0)[source]#

Sample parameters of Dirichlet distributions.

The uniform distribution on [0, bound]^dim is used.

Parameters:
  • n_samples (int) – Number of samples. Optional, default: 1.

  • bound (float) – Side of the square where the Dirichlet parameters are sampled. Optional, default: 5.

Returns:

samples (array-like, shape=[…, dim]) – Sample of points representing Dirichlet distributions.

sample(point, n_samples=1)[source]#

Sample from the Dirichlet distribution.

Sample from the Dirichlet distribution with parameters provided by point. This gives n_samples points in the simplex.

Parameters:
  • point (array-like, shape=[…, dim]) – Point representing a Dirichlet distribution.

  • n_samples (int) – Number of points to sample for each set of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples, dim]) – Sample from the Dirichlet distributions.

class geomstats.information_geometry.dirichlet.DirichletMetric(space)[source]#

Bases: RiemannianMetric

Class for the Fisher information metric on Dirichlet distributions.

christoffels(base_point)[source]#

Compute the Christoffel symbols.

Compute the Christoffel symbols of the Fisher information metric.

Parameters:

base_point (array-like, shape=[…, dim]) – Base point.

Returns:

christoffels (array-like, shape=[…, dim, dim, dim]) – Christoffel symbols, with the contravariant index on the first dimension. \(christoffels[..., i, j, k] = Gamma^i_{jk}\)

References

[LPP2021]

A. Le Brigant, S. C. Preston, S. Puechmorel. Fisher-Rao geometry of Dirichlet Distributions. Differential Geometry and its Applications, 74, 101702, 2021.

injectivity_radius(base_point=None)[source]#

Compute the radius of the injectivity domain.

This is is the supremum of radii r for which the exponential map is a diffeomorphism from the open ball of radius r centered at the base point onto its image. In the case of the hyperbolic space, it does not depend on the base point and is infinite everywhere, because of the negative curvature.

Parameters:

base_point (array-like, shape=[…, dim]) – Point on the manifold.

Returns:

radius (array-like, shape=[…,]) – Injectivity radius.

jacobian_christoffels(base_point)[source]#

Compute the Jacobian of the Christoffel symbols.

Compute the Jacobian of the Christoffel symbols of the Fisher information metric.

Parameters:

base_point (array-like, shape=[…, dim]) – Base point.

Returns:

jac (array-like, shape=[…, dim, dim, dim, dim]) – Jacobian of the Christoffel symbols. \(jac[..., i, j, k, l] = dGamma^i_{jk} / dx_l\)

metric_matrix(base_point)[source]#

Compute the inner-product matrix.

Compute the inner-product matrix of the Fisher information metric at the tangent space at base point.

Parameters:

base_point (array-like, shape=[…, dim]) – Base point.

Returns:

mat (array-like, shape=[…, dim, dim]) – Inner-product matrix.

class geomstats.information_geometry.dirichlet.DirichletRandomVariable(space)[source]#

Bases: ScipyMultivariateRandomVariable

A Dirichlet random variable.

geomstats.information_geometry.exponential module#

Statistical Manifold of Binomial distributions with the Fisher metric.

Lead authors: Jules Deschamps, Tra My Nguyen.

class geomstats.information_geometry.exponential.ExponentialDistributions(equip=True)[source]#

Bases: InformationManifoldMixin, VectorSpaceOpenSet

Class for the manifold of exponential distributions.

This is the parameter space of exponential distributions i.e. the half-line of positive reals.

belongs(point, atol=1e-12)[source]#

Evaluate if a point belongs to the manifold of exponential distributions.

Parameters:
  • point (array-like, shape=[…, 1]) – Point to be checked.

  • atol (float) – Tolerance to evaluate positivity. Optional, default: gs.atol

Returns:

belongs (array-like, shape=[…,]) – Boolean indicating whether point represents an exponential distribution.

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Compute the probability density function of the exponential distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, 1]) – Point representing an exponential distribution (scale).

Returns:

pdf (function) – Probability density function of the exponential distribution with scale parameter provided by point.

projection(point, atol=1e-12)[source]#

Project a point in ambient space to the open set.

The last coordinate is floored to gs.atol if it is non-positive.

Parameters:
  • point (array-like, shape=[…, 1]) – Point in ambient space.

  • atol (float) – Tolerance to evaluate positivity.

Returns:

projected (array-like, shape=[…, 1]) – Projected point.

random_point(n_samples=1, lower_bound=0.1, upper_bound=1.0)[source]#

Sample parameters of exponential distributions.

The uniform distribution on [0, bound] is used.

Parameters:
  • n_samples (int) – Number of samples. Optional, default: 1.

  • bound (float) – Right-end ot the segment where exponential parameters are sampled. Optional, default: 1.

Returns:

samples (array-like, shape=[n_samples,]) – Sample of points representing exponential distributions.

sample(point, n_samples=1)[source]#

Sample from the exponential distribution.

Sample from the exponential distribution with parameter provided by point.

Parameters:
  • point (array-like, shape=[…,]) – Point representing an exponential distribution.

  • n_samples (int) – Number of points to sample with each parameter in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples]) – Sample from exponential distributions.

class geomstats.information_geometry.exponential.ExponentialDistributionsRandomVariable(space)[source]#

Bases: ScipyUnivariateRandomVariable

An exponential random variable.

class geomstats.information_geometry.exponential.ExponentialMetric(space, signature=None)[source]#

Bases: RiemannianMetric

Class for the Fisher information metric on exponential distributions.

References

[AM1981]

Atkinson, C., & Mitchell, A. F. (1981). Rao’s distance measure. Sankhyā: The Indian Journal of Statistics, Series A, 345-365.

exp(tangent_vec, base_point)[source]#

Compute exp map of a base point in tangent vector direction.

Parameters:
  • tangent_vec (array-like, shape=[…, 1]) – Tangent vector at base point.

  • base_point (array-like, shape=[…, 1]) – Base point.

Returns:

exp (array-like, shape=[…, 1]) – End point of the geodesic starting at base_point with initial velocity tangent_vec.

log(point, base_point)[source]#

Compute log map using a base point and an end point.

Parameters:
  • point (array-like, shape=[…, 1]) – End point.

  • base_point (array-like, shape=[…, 1]) – Base point.

Returns:

tangent_vec (array-like, shape=[…, 1]) – Initial velocity of the geodesic starting at base_point and reaching point at time 1.

metric_matrix(base_point)[source]#

Compute the metric matrix at the tangent space at base_point.

Parameters:

base_point (array-like, shape=[…, 1]) – Point representing a binomial distribution.

Returns:

mat (array-like, shape=[…, 1, 1]) – Metric matrix.

squared_dist(point_a, point_b)[source]#

Compute squared distance associated with the exponential Fisher Rao metric.

Parameters:
  • point_a (array-like, shape=[…, 1]) – Point representing an exponential distribution (scale parameter).

  • point_b (array-like, shape=[…, 1]) – Point representing a exponential distribution (scale parameter).

Returns:

squared_dist (array-like, shape=[…,]) – Squared distance between points point_a and point_b.

geomstats.information_geometry.fisher_rao_metric module#

Class to implement simply the Fisher-Rao metric on information manifolds.

class geomstats.information_geometry.fisher_rao_metric.FisherRaoMetric(space, support)[source]#

Bases: RiemannianMetric

Class to derive the information metric from the pdf in InformationManifoldMixin.

Given a statistical manifold with coordinates \(\theta\), the Fisher information metric is:

\[g_{j k}(\theta)=\int_X \frac{\partial \log p(x, \theta)} {\partial \theta_j}\frac{\partial \log p(x, \theta)} {\partial \theta_k} p(x, \theta) d x\]
space#

Riemannian Manifold for a parametric family of (real) distributions.

Type:

InformationManifold

support#

Left and right bounds for the support of the distribution. But this is just to help integration, bounds should be as large as needed.

Type:

list, shape = (2,)

inner_product_derivative_matrix(base_point)[source]#

Compute the derivative of the inner-product matrix.

Compute the derivative of the inner-product matrix of the Fisher information metric at the tangent space at base point.

\[\partial_{\theta} I(\theta) = \partial_{\theta} \mathbb{E}[YY^T]\]

where,

\[Y = \nabla_{\theta} \log f_{\theta}(x)\]

or, in indicial notation:

\[\partial_k I_{ij} = \int\ \partial_{ki}^2 f\partial_j f \frac{1}{f} + \ \partial_{kj}^2 f\partial_i f \frac{1}{f} - \ \partial_i f \partial_j f \partial_k f \frac{1}{f^2}\]

with \(f = f_{\theta}(x)\)

Parameters:

base_point (array-like, shape=[…, dim]) – Base point.

Returns:

mat (array-like, shape=[…, dim, dim, dim]) – Derivative of the inner-product matrix, where the index k of the derivation is last: math:mat_{ijk} = partial_k g_{ij}.

metric_matrix(base_point)[source]#

Compute the inner-product matrix.

The Fisher information matrix is noted I in the literature.

Compute the inner-product matrix of the Fisher information metric at the tangent space at base point.

\[I(\theta) = \mathbb{E}[YY^T]\]

where,

\[Y = \nabla_{\theta} \log f_{\theta}(x)\]

After manipulation and in indicial notation

\[I_{ij} = \int \ \partial_{i} f_{\theta}(x)\ \partial_{j} f_{\theta}(x)\ \frac{1}{f_{\theta}(x)}\]
Parameters:

base_point (array-like, shape=[…, dim]) – Base point.

Returns:

metric_mat (array-like, shape=[…, dim, dim]) – Inner-product matrix.

References

[AS1985]

Amari, S (1985) Differential Geometric Methods in Statistics, Berlin, Springer – Verlag.

geomstats.information_geometry.gamma module#

Statistical Manifold of Gamma distributions with the Fisher metric.

The natural coordinate system for a Gamma Distribution is: point = [kappa, nu], where kappa is the shape parameter, and nu the rate, or 1/scale.

However, information geometry most often works with standard coordinates, given by: point = [kappa, gamma] = [kappa, kappa/nu].

The standard coordinate system is the convention we use in this script. All points and all vectors input are assumed to be given in the standard coordinate system unless stated otherwise.

Some of the methods in GammaDistributions allow to easily make the associated change of variable, either for a point or a vector.

Lead author: Jules Deschamps.

class geomstats.information_geometry.gamma.GammaDistributions(equip=True)[source]#

Bases: InformationManifoldMixin, VectorSpaceOpenSet

Class for the manifold of Gamma distributions.

This is \(Gamma = (R_+^*)^2\), the positive quadrant of the 2-dimensional Euclidean space.

belongs(point, atol=1e-12)[source]#

Evaluate if a point belongs to the manifold of Gamma distributions.

Check that point defines parameters for a Gamma distribution, i.e. belongs to the positive quadrant of the Euclidean space.

Parameters:
  • point (array-like, shape=[…, 2]) – Point to be checked.

  • atol (float) – Tolerance to evaluate positivity. Optional, default: gs.atol

Returns:

belongs (array-like, shape=[…,]) – Boolean indicating whether point represents a Gamma distribution.

static default_metric()[source]#

Metric to equip the space with if equip is True.

static maximum_likelihood_fit(data)[source]#

Estimate parameters from samples.

This is a wrapper around scipy’s maximum likelihood estimator to estimate the parameters of a gamma distribution from samples.

Parameters:

data (list or list of lists/arrays) – Data to estimate parameters from. Lists of different length may be passed.

Returns:

parameter (array-like, shape=[…, 2]) – Estimate of parameter obtained by maximum likelihood.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Compute the probability density function of the Gamma distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, dim]) – Point representing a Gamma distribution.

Returns:

pdf (function) – Probability density function of the Gamma distribution with parameters provided by point.

projection(point, atol=1e-12)[source]#

Project a point in ambient space to the open set.

The last coordinate is floored to gs.atol if it is negative.

Parameters:
  • point (array-like, shape=[…, 2]) – Point in ambient space.

  • atol (float) – Tolerance to evaluate positivity.

Returns:

projected (array-like, shape=[…, 2]) – Projected point.

random_point(n_samples=1, upper_bound=5.0, lower_bound=0.0)[source]#

Sample parameters of Gamma distributions.

The uniform distribution on [0, bound]^2 is used.

Parameters:
  • n_samples (int) – Number of samples. Optional, default: 1.

  • bound (float) – Side of the square where the Gamma parameters are sampled. Optional, default: 5.

Returns:

samples (array-like, shape=[…, 2]) – Sample of points representing Gamma distributions.

sample(point, n_samples=1)[source]#

Sample from the Gamma distribution.

Sample from the Gamma distribution with parameters provided by point. This gives n_samples points.

Parameters:
  • point (array-like, shape=[…, dim]) – Point representing a Gamma distribution.

  • n_samples (int) – Number of points to sample for each set of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples]) – Sample from the Gamma distributions.

class geomstats.information_geometry.gamma.GammaDistributionsRandomVariable(space)[source]#

Bases: ScipyUnivariateRandomVariable

A gamma random variable.

class geomstats.information_geometry.gamma.GammaMetric(space)[source]#

Bases: RiemannianMetric

Class for the Fisher information metric on Gamma distributions.

References

[AD2008]

Arwini, K. A., & Dodson, C. T. (2008). Information geometry (pp. 31-54). Springer Berlin Heidelberg.

christoffels(base_point)[source]#

Compute the Christoffel symbols.

Compute the Christoffel symbols of the Fisher information metric. For computation purposes, we replace the value of (gs.polygamma(1, x) - 1/x) by an equivalent (close lower-bound) when it becomes too difficult to compute, as per in [GQ2015].

Parameters:

base_point (array-like, shape=[…, 2]) – Base point.

Returns:

christoffels (array-like, shape=[…, 2, 2, 2]) – Christoffel symbols, with the contravariant index on the first dimension. \(christoffels[..., i, j, k] = Gamma^i_{jk}\)

References

[AD2008]

Arwini, K. A., & Dodson, C. T. (2008). Information geometry (pp. 31-54). Springer Berlin Heidelberg.

[GQ2015]

Guo, B. N., Qi, F., Zhao, J. L., & Luo, Q. M. (2015). Sharp inequalities for polygamma functions. Mathematica Slovaca, 65(1), 103-120.

jacobian_christoffels(base_point)[source]#

Compute the Jacobian of the Christoffel symbols.

Compute the Jacobian of the Christoffel symbols of the Fisher information metric.

For computation purposes, we replace the value of (gs.polygamma(1, x) - 1/x) and (gs.polygamma(2,x) + 1/x**2) by an equivalent (close bounds) when they become too difficult to compute.

Parameters:

base_point (array-like, shape=[…, 2]) – Base point.

Returns:

jac (array-like, shape=[…, 2, 2, 2, 2]) – Jacobian of the Christoffel symbols. \(jac[..., i, j, k, l] = dGamma^i_{jk} / dx_l\)

References

[GQ2015]

Guo, B. N., Qi, F., Zhao, J. L., & Luo, Q. M. (2015). Sharp inequalities for polygamma functions. Mathematica Slovaca, 65(1), 103-120.

metric_matrix(base_point)[source]#

Compute the inner-product matrix.

Compute the inner-product matrix of the Fisher information metric at the tangent space at base point.

Parameters:

base_point (array-like, shape=[…, 2]) – Base point.

Returns:

mat (array-like, shape=[…, 2, 2]) – Inner-product matrix.

class geomstats.information_geometry.gamma.NaturalToStandardDiffeo[source]#

Bases: InvolutionDiffeomorphism

Diffeomorphism between natural and standard coordinates.

diffeomorphism(point)[source]#

Convert point from natural coordinates to standard coordinates.

The change of variable is symmetric.

Parameters:

point (array-like, shape=[…, 2]) – Point of the Gamma manifold, given in natural coordinates.

Returns:

point (array-like, shape=[…, 2]) – Point of the Gamma manifold, given in standard coordinates.

tangent_diffeomorphism(tangent_vec, base_point=None, image_point=None)[source]#

Convert tangent vector from natural coordinates to standard coordinates.

The change of variable is symmetric.

Parameters:
  • tangent_vec (array-like, shape=[…, 2]) – Tangent vector at base_point, given in natural coordinates.

  • base_point (array-like, shape=[…, 2]) – Point of the Gamma manifold, given in natural coordinates.

  • image_point (array-like, shape=[…, 2]) – Point of the Gamma manifold, given in standard coordinates.

Returns:

image_tangent_vec (array-like, shape=[…, 2]) – Tangent vector at base_point, given in standard coordinates.

geomstats.information_geometry.geometric module#

Statistical Manifold of geometric distributions with the Fisher metric.

Lead author: Tra My Nguyen.

class geomstats.information_geometry.geometric.GeometricDistributions(equip=True)[source]#

Bases: InformationManifoldMixin, VectorSpaceOpenSet

Class for the manifold of geometric distributions.

This is the parameter space of geometric distributions i.e. [0,1] segment.

belongs(point, atol=1e-12)[source]#

Evaluate if a point belongs to the manifold of geometric distributions.

Parameters:
  • point (array-like, shape=[…, 1]) – Point to be checked.

  • atol (float) – Tolerance to evaluate positivity. Optional, default: gs.atol

Returns:

belongs (array-like, shape=[…,]) – Boolean indicating whether point represents an geometric distribution.

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pmf associated to point.

Compute the probability mass function of the geometric distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, 1]) – Point representing an geometric distribution (scale).

Returns:

pdf (function) – Probability mass function of the geometric distribution with scale parameter provided by point.

projection(point, atol=1e-12)[source]#

Project a point in ambient space to the open set.

Return a new point belonging to [0,1] segment within the given tolerance.

Parameters:
  • point (array-like, shape=[…, 1]) – Point in ambient space.

  • atol (float) – Tolerance to evaluate the position with respect to [0,1] interval.

Returns:

projected (array-like, shape=[…, 1]) – Projected point.

random_point(n_samples=1)[source]#

Sample parameters of geometric distributions.

The uniform distribution on (0, 1) is used.

Parameters:
  • n_samples (int) – Number of samples. Optional, default: 1.

  • bound (float) – Right-end ot the segment where geometric parameters are sampled. Optional, default: 1.

Returns:

samples (array-like, shape=[n_samples,]) – Sample of points representing geometric distributions.

sample(point, n_samples=1)[source]#

Sample from the geometric distribution.

Sample from the geometric distribution with parameter provided by point.

Parameters:
  • point (array-like, shape=[…, 1]) – Point representing an geometric distribution.

  • n_samples (int) – Number of points to sample with each parameter in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples]) – Sample from geometric distributions.

class geomstats.information_geometry.geometric.GeometricDistributionsRandomVariable(space)[source]#

Bases: ScipyUnivariateRandomVariable

A geometric random variable.

pdf(x, point)[source]#

Evaluate the probability density function at x.

Parameters:
  • x (array-like, shape=[n_samples, *support_shape]) – Sample points at which to compute the probability density function.

  • point (array-like, shape=[…, *space.shape]) – Point representing a distribution.

Returns:

pdf_at_x (array-like, shape=[…, n_samples, *support_shape]) – Values of pdf at x for each value of the parameters provided by point.

class geomstats.information_geometry.geometric.GeometricMetric(space, signature=None)[source]#

Bases: RiemannianMetric

Class for the Fisher information metric on geometric distributions.

References

[AM1981]

Atkinson, C., & Mitchell, A. F. (1981). Rao’s distance measure. Sankhyā: The Indian Journal of Statistics, Series A, 345-365.

exp(tangent_vec, base_point)[source]#

Compute exp map of a base point in tangent vector direction.

Parameters:
  • tangent_vec (array-like, shape=[…, 1]) – Tangent vector at base point.

  • base_point (array-like, shape=[…, 1]) – Base point.

Returns:

exp (array-like, shape=[…, 1]) – End point of the geodesic starting at base_point with initial velocity tangent_vec.

log(point, base_point)[source]#

Compute log map using a base point and an end point.

Parameters:
  • point (array-like, shape=[…, 1]) – End point.

  • base_point (array-like, shape=[…, 1]) – Base point.

Returns:

tangent_vec (array-like, shape=[…, 1]) – Initial velocity of the geodesic starting at base_point and reaching point at time 1.

metric_matrix(base_point)[source]#

Compute the metric matrix at base_point.

Parameters:

base_point (array-like, shape=[…, 1]) – Point representing a geometric distribution.

Returns:

mat (array-like, shape=[…, 1, 1]) – Metric matrix.

squared_dist(point_a, point_b)[source]#

Compute squared distance associated with the geometric metric.

Parameters:
  • point_a (array-like, shape=[…, 1]) – Point representing an geometric distribution.

  • point_b (array-like, shape=[…, 1]) – Point representing a geometric distribution.

Returns:

squared_dist (array-like, shape=[…,]) – Squared distance between points point_a and point_b.

geomstats.information_geometry.multinomial module#

Statistical Manifold of multinomial distributions with the Fisher metric.

Lead author: Alice Le Brigant.

class geomstats.information_geometry.multinomial.MultinomialDistributions(dim, n_draws, equip=True)[source]#

Bases: InformationManifoldMixin, LevelSet

Class for the manifold of multinomial distributions.

This is the set of n+1-tuples of positive reals that sum up to one, i.e. the n-simplex. Each point is the parameter of a multinomial distribution, i.e. gives the probabilities of $n$ different outcomes in a single experiment.

dim#

Dimension of the parameter manifold of multinomial distributions. The number of outcomes is dim + 1.

Type:

int

embedding_manifold#

Embedding manifold.

Type:

Manifold

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Compute the probability density function of the Multinomial distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, dim]) – Point representing a beta distribution.

Returns:

pdf (function) – (Discrete) probability density function.

projection(point, atol=1e-12)[source]#

Project a point on the simplex.

Negative components are replaced by zero and the point is renormalized by its 1-norm.

Parameters:
  • point (array-like, shape=[…, dim + 1]) – Point in embedding Euclidean space.

  • atol (float) – Tolerance to evaluate positivity.

Returns:

projected_point (array-like, shape=[…, dim + 1]) – Point projected on the simplex.

random_point(n_samples=1)[source]#

Generate parameters of multinomial distributions.

The Dirichlet distribution on the simplex is used to generate parameters.

Parameters:

n_samples (int) – Number of samples. Optional, default: 1.

Returns:

samples (array-like, shape=[…, dim + 1]) – Sample of points representing multinomial distributions.

sample(point, n_samples=1)[source]#

Sample from the multinomial distribution.

Sample from the multinomial distribution with parameters provided by point. This gives samples in the simplex.

Parameters:
  • point (array-like, shape=[…, dim + 1]) – Parameters of a multinomial distribution, i.e. probabilities associated to dim + 1 outcomes.

  • n_samples (int) – Number of points to sample with each set of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples, dim + 1]) – Samples from multinomial distributions. Note that this can be of shape [n_points, n_samples, dim + 1] if several points and several samples are provided as inputs.

submersion(point)[source]#

Submersion that defines the manifold.

Parameters:

point (array-like, shape=[…, dim + 1])

Returns:

submersed_point (array-like, shape=[…])

tangent_submersion(vector, point)[source]#

Tangent submersion.

Parameters:
  • vector (array-like, shape=[…, dim + 1])

  • point (Ignored.)

Returns:

submersed_vector (array-like, shape=[…])

to_tangent(vector, base_point=None)[source]#

Project a vector to the tangent space.

Project a vector in Euclidean space on the tangent space of the simplex at a base point.

Parameters:
  • vector (array-like, shape=[…, dim + 1]) – Vector in Euclidean space.

  • base_point (array-like, shape=[…, dim + 1]) – Point on the simplex defining the tangent space, where the vector will be projected.

Returns:

vector (array-like, shape=[…, dim + 1]) – Tangent vector in the tangent space of the simplex at the base point.

class geomstats.information_geometry.multinomial.MultinomialMetric(space)[source]#

Bases: RiemannianMetric

Class for the Fisher information metric on multinomial distributions.

The Fisher information metric on the $n$-simplex of multinomial distributions parameters can be obtained as the pullback metric of the $n$-sphere using the componentwise square root.

References

[K2003]

R. E. Kass. The Geometry of Asymptotic Inference. Statistical Science, 4(3): 188 - 234, 1989.

exp(tangent_vec, base_point)[source]#

Compute the exponential map.

Compute the exponential map associated to the Fisher information metric by pulling back the exponential map on the sphere by the simplex_to_sphere map.

Parameters:
  • tangent_vec (array-like, shape=[…, dim + 1]) – Tangent vector at base point.

  • base_point (array-like, shape=[…, dim + 1]) – Base point.

Returns:

exp (array-like, shape=[…, dim + 1]) – End point of the geodesic starting at base_point with initial velocity tangent_vec and stopping at time 1.

geodesic(initial_point, end_point=None, initial_tangent_vec=None)[source]#

Generate parameterized function for the geodesic curve.

Geodesic curve defined by either:

  • an initial point and an initial tangent vector,

  • an initial point and an end point.

Parameters:
  • initial_point (array-like, shape=[…, dim + 1]) – Point on the manifold, initial point of the geodesic.

  • end_point (array-like, shape=[…, dim + 1]) – Point on the manifold, end point of the geodesic. Optional, default: None. If None, an initial tangent vector must be given.

  • initial_tangent_vec (array-like, shape=[…, dim + 1]) – Tangent vector at base point, the initial speed of the geodesics. Optional, default: None. If None, an end point must be given and a logarithm is computed.

Returns:

path (callable) – Time parameterized geodesic curve. If a batch of initial conditions is passed, the output array’s first dimension represents time, and the second corresponds to the different initial conditions.

log(point, base_point)[source]#

Compute the logarithm map.

Compute logarithm map associated to the Fisher information metric by pulling back the exponential map on the sphere by the simplex_to_sphere map.

Parameters:
  • point (array-like, shape=[…, dim + 1]) – Point.

  • base_point (array-like, shape=[…, dim + 1]) – Base po int.

Returns:

tangent_vec (array-like, shape=[…, dim + 1]) – Initial velocity of the geodesic starting at base_point and reaching point at time 1.

metric_matrix(base_point)[source]#

Compute the inner-product matrix.

Compute the inner-product matrix of the Fisher information metric at the tangent space at base point.

Parameters:

base_point (array-like, shape=[…, dim + 1]) – Base point.

Returns:

mat (array-like, shape=[…, dim, dim]) – Inner-product matrix.

sectional_curvature(tangent_vec_a, tangent_vec_b, base_point=None)[source]#

Compute the sectional curvature.

In the literature sectional curvature is noted K.

For two orthonormal tangent vectors \(x,y\) at a base point, the sectional curvature is defined by \(K(x,y) = <R(x, y)x, y>\).

For non-orthonormal vectors, it is \(K(x,y) = <R(x, y)y, x> / (<x, x><y, y> - <x, y>^2)\).

sectional_curvature(X, Y, P) = K(X,Y) where X, Y are tangent vectors at base point P.

The information manifold of multinomial distributions has constant sectional curvature given by \(K = 2 \sqrt{n}\).

Parameters:
  • tangent_vec_a (array-like, shape=[…, dim + 1]) – Tangent vector at base_point.

  • tangent_vec_b (array-like, shape=[…, dim + 1]) – Tangent vector at base_point.

  • base_point (array-like, shape=[…, dim + 1]) – Point in the manifold.

Returns:

sectional_curvature (array-like, shape=[…,]) – Sectional curvature at base_point.

static simplex_to_sphere(point)[source]#

Send point of the simplex to the sphere.

The map takes the square root of each component.

Parameters:

point (array-like, shape=[…, dim + 1]) – Point on the simplex.

Returns:

point_sphere (array-like, shape=[…, dim + 1]) – Point on the sphere.

static sphere_to_simplex(point)[source]#

Send point of the sphere to the simplex.

The map squares each component.

Parameters:

point (array-like, shape=[…, dim + 1]) – Point on the sphere.

Returns:

point_simplex (array-like, shape=[…, dim + 1]) – Point on the simplex.

tangent_simplex_to_sphere(tangent_vec, base_point)[source]#

Send tangent vector of the simplex to tangent space of sphere.

This is the differential of the simplex_to_sphere map.

Parameters:
  • tangent_vec (array-like, shape=[…, dim + 1]) – Tangent vec to the simplex at base point.

  • base_point (array-like, shape=[…, dim + 1]) – Point of the simplex.

Returns:

tangent_vec_sphere (array-like, shape=[…, dim + 1]) – Tangent vec to the sphere at the image of base point by simplex_to_sphere.

static tangent_sphere_to_simplex(tangent_vec, base_point)[source]#

Send tangent vector of the sphere to tangent space of simplex.

This is the differential of the sphere_to_simplex map.

Parameters:
  • tangent_vec (array-like, shape=[…, dim + 1]) – Tangent vec to the sphere at base point.

  • base_point (array-like, shape=[…, dim + 1]) – Point of the sphere.

Returns:

tangent_vec_simplex (array-like, shape=[…, dim + 1]) – Tangent vec to the simplex at the image of base point by sphere_to_simplex.

class geomstats.information_geometry.multinomial.MultinomialRandomVariable(space)[source]#

Bases: ScipyMultivariateRandomVariable

A multinomial random variable.

geomstats.information_geometry.normal module#

Information Manifold of multivariate normal distributions with the Fisher metric.

Lead authors: Antoine Collas, Alice Le Brigant.

class geomstats.information_geometry.normal.CenteredNormalDistributions(sample_dim, equip=True)[source]#

Bases: InformationManifoldMixin, SPDMatrices

Class for the manifold of centered multivariate normal distributions.

This is the class for multivariate normal distributions with zero mean. Each distribution is represented by its covariance matrix, i.e. a symmetric positive-definite matrix of size \(sample\_dim\).

Parameters:

sample_dim (int) – Dimension of the sample space of the multivariate normal distribution.

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Parameters:

point (array-like, shape=[…, sample_dim, sample_dim]) – Symmetric positive definite matrix representing the covariance matrix of a multivariate normal distribution with zero mean.

Returns:

pdf (function) – Probability density function of the centered multivariate normal distributions with covariance matrices provided by point.

sample(point, n_samples=1)[source]#

Sample from a centered multivariate normal distribution.

Parameters:
  • point (array-like, shape=[…, sample_dim, sample_dim]) – Symmetric positive definite matrix representing the covariance matrix of a multivariate normal distribution with zero mean.

  • n_samples (int) – Number of points to sample with each covariance matrix in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples, sample_dim]) – Sample from centered multivariate normal distributions.

class geomstats.information_geometry.normal.CenteredNormalMetric(space)[source]#

Bases: object

Class for the Fisher information metric of centered normal distributions.

class geomstats.information_geometry.normal.DiagonalNormalDistributions(sample_dim, equip=True)[source]#

Bases: InformationManifoldMixin, VectorSpaceOpenSet

Class for the manifold of diagonal multivariate normal distributions.

This is the class for multivariate normal distributions with diagonal covariance matrices. Each distribution is represented by a vector of size \(2 * sample\_dim\) where the first \(sample\_dim\) elements contain the mean vector and the \(sample\_dim\) last elements contain the diagonal of the covariance matrix.

Parameters:

sample_dim (int) – Dimension of the sample space of the multivariate normal distribution.

belongs(point, atol=1e-12)[source]#

Evaluate if the point belongs to the manifold.

Parameters:

point (array-like, shape=[…, 2 * sample_dim]) – Point to test. First \(sample\_dim\) elements contain the mean vector and the last \(sample\_dim\) elements contain the diagonal of the covariance matrix.

Returns:

belongs (array-like, shape=[…,]) – Boolean evaluating if point belongs to the space.

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Parameters:

point (array-like, shape=[…, 2 * sample_dim]) – Point representing a probability distribution. First \(sample\_dim\) elements contain the mean vector and the last \(sample\_dim\) elements contain the diagonal of the covariance matrix.

Returns:

pdf (function) – Probability density function of the normal distribution with parameters provided by point.

projection(point)[source]#

Project a point on the manifold.

Floor the eigenvalues of the diagonal covariance matrix to gs.atol.

Parameters:

point (array-like, shape=[…, 2 * sample_dim]) – Point to project. First \(sample\_dim\) elements contain the mean vector and the last \(sample\_dim\) elements contain the diagonal of the covariance matrix.

Returns:

projected (array-like, shape=[…, 2 * sample_dim]) – Point containing means and diagonals of covariance matrices.

random_point(n_samples=1)[source]#

Generate random parameters of multivariate diagonal normal distributions.

Parameters:

n_samples (int) – Number of samples. Optional, default: 1.

Returns:

samples (array-like, shape=[…, 2 * sample_dim]) – Sample of points representing multivariate diagonal normal distributions. First \(sample\_dim\) elements contain the mean vector and the last \(sample\_dim\) elements contain the diagonal of the covariance matrix.

sample(point, n_samples=1)[source]#

Sample from the diagonal multivariate normal distribution.

Parameters:
  • point (array-like, shape=[…, 2 * sample_dim]) – Point on the manifold. First \(sample\_dim\) elements contain the mean vector and the last \(sample\_dim\) elements contain the diagonal of the covariance matrix.

  • n_samples (int) – Number of points to sample with each pair of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples, sample_dim]) – Sample from multivariate normal distributions.

class geomstats.information_geometry.normal.DiagonalNormalDistributionsRandomVariable(space)[source]#

Bases: ScipyMultivariateRandomVariable

A diagonal multivariate normal random variable.

class geomstats.information_geometry.normal.DiagonalNormalMetric(space)[source]#

Bases: RiemannianMetric

Class for the Fisher information metric of diagonal normal distributions.

exp(tangent_vec, base_point)[source]#

Compute the Riemannian exponential.

Parameters:
  • tangent_vec (array-like, shape=[…, 2 * sample_dim]) – Tangent vector at the base point.

  • base_point (array-like, shape=[…, 2 * sample_dim]) – Point.

Returns:

end_point (array-like, shape=[…, 2 * sample_dim]) – Point reached by the geodesic starting from base_point with initial velocity tangent_vec.

injectivity_radius(base_point=None)[source]#

Compute the radius of the injectivity domain.

Parameters:

base_point (array-like, shape=[…, 2 * sample_dim]) – Point on the manifold.

Returns:

radius (array-like, shape=[…,]) – Injectivity radius.

inner_product(tangent_vec_a, tangent_vec_b, base_point)[source]#

Inner product between two tangent vectors at a base point.

Parameters:
  • tangent_vec_a (array-like, shape=[…, 2 * sample_dim]) – Tangent vector at base point.

  • tangent_vec_b (array-like, shape=[…, 2 * sample_dim]) – Tangent vector at base point.

  • base_point (array-like, shape=[…, 2 * sample_dim]) – Base point. Optional, default: None.

Returns:

inner_product (array-like, shape=[…,]) – Inner-product.

log(point, base_point)[source]#

Compute Riemannian logarithm of a point wrt a base point.

Parameters:
  • point (array-like, shape=[…, 2 * sample_dim]) – Point.

  • base_point (array-like, shape=[…, 2 * sample_dim]) – Point.

Returns:

log (array-like, shape=[…, 2 * sample_dim]) – Tangent vector at the base point equal to the Riemannian logarithm of point at the base point.

class geomstats.information_geometry.normal.GeneralNormalDistributions(sample_dim, equip=True)[source]#

Bases: InformationManifoldMixin, ProductManifold

Class for the manifold of multivariate normal distributions.

This is the class for multivariate normal distributions on the Euclidean space. Each distribution is represented by the concatenation of its mean vector and its covariance matrix reshaped in a vector of size \(sample\_dim ** 2\).

Parameters:

sample_dim (int) – Dimension of the sample space of the multivariate normal distribution.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Parameters:

point (array-like, shape=[…, sample_dim + sample_dim * 2]*) – Point representing a multivariate normal distribution. First \(sample\_dim\) elements contain the mean vector and the last \(sample\_dim ** 2\) elements contain the covariance matrix row by row.

Returns:

pdf (function) – Probability density function of the multivariate normal distributions with parameters provided by point.

sample(point, n_samples=1)[source]#

Sample from a multivariate normal distribution.

Parameters:
  • point (array-like, shape=[…, sample_dim + sample_dim * 2]*) – Point representing a multivariate normal distribution. First \(sample\_dim\) elements contain the mean vector and the last \(sample\_dim ** 2\) elements contain the covariance matrix row by row.

  • n_samples (int) – Number of points to sample with each parameter in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples, sample_dim]) – Sample from multivariate normal distributions.

class geomstats.information_geometry.normal.MultivariateNormalDistributionsRandomVariable(space)[source]#

Bases: ScipyMultivariateRandomVariable

A multivariate normal random variable.

class geomstats.information_geometry.normal.NormalDistributions(sample_dim, distribution_type='general', equip=True)[source]#

Bases: object

Class for the normal distributions.

This class is a common interface to the following different situations:

  • univariate normal distributions

  • centered multivariate normal distributions

  • multivariate normal distributions with diagonal covariance matrix

  • general multivariate normal distributions

Parameters:
  • sample_dim (int) – Dimension of the sample space of the normal distribution.

  • distribution_type (str, {‘centered’, ‘diagonal’, ‘general’}) – Type of distributions. Optional, default: ‘general’.

class geomstats.information_geometry.normal.SharedMeanNormalDistributionsRandomVariable(space)[source]#

Bases: ScipyMultivariateRandomVariable

A multivariate normal random variable with zero mean.

class geomstats.information_geometry.normal.UnivariateNormalDistributions(equip=True)[source]#

Bases: InformationManifoldMixin, PoincareHalfSpace

Class for the manifold of univariate normal distributions.

This is upper half-plane.

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pdf associated to point.

Compute the probability density function of the normal distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, 2]) – Point representing a normal distribution (mean and scale).

Returns:

pdf (function) – Probability density function of the normal distribution with parameters provided by point.

static random_point(n_samples=1, bound=1.0)[source]#

Sample parameters of normal distributions.

The uniform distribution on [-bound/2, bound/2]x[0, bound] is used.

Parameters:
  • n_samples (int) – Number of samples. Optional, default: 1.

  • bound (float) – Side of the square where the normal parameters are sampled. Optional, default: 5.

Returns:

samples (array-like, shape=[…, 2]) – Sample of points representing normal distributions.

sample(point, n_samples=1)[source]#

Sample from the normal distribution.

Sample from the normal distribution with parameters provided by point.

Parameters:
  • point (array-like, shape=[…, 2]) – Point representing a normal distribution (mean and scale).

  • n_samples (int) – Number of points to sample with each pair of parameters in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples]) – Sample from normal distributions.

class geomstats.information_geometry.normal.UnivariateNormalDistributionsRandomVariable(space)[source]#

Bases: ScipyUnivariateRandomVariable

A univariate normal random variable.

class geomstats.information_geometry.normal.UnivariateNormalMetric(space)[source]#

Bases: PullbackDiffeoMetric

Class for the Fisher information metric on univariate normal distributions.

This is the pullback of the metric of the Poincare upper half-plane by the diffeomorphism \((mean, std) -> (mean, \sqrt{2} std)\).

static metric_matrix(base_point)[source]#

Compute the metric matrix at the tangent space at base_point.

Parameters:

base_point (array-like, shape=[…, 2]) – Point representing a normal distribution (mean and scale).

Returns:

mat (array-like, shape=[…, 2, 2]) – Metric matrix.

sectional_curvature(tangent_vec_a, tangent_vec_b, base_point=None)[source]#

Compute the sectional curvature.

In the literature sectional curvature is noted K.

For two orthonormal tangent vectors \(x,y\) at a base point, the sectional curvature is defined by \(K(x,y) = <R(x, y)x, y>\).

For non-orthonormal vectors, it is \(K(x,y) = <R(x, y)y, x> / (<x, x><y, y> - <x, y>^2)\).

sectional_curvature(X, Y, P) = K(X,Y) where X, Y are tangent vectors at base point P.

The information manifold of univariate normal distributions has constant sectional curvature given by \(K = - 1/2\).

Parameters:
  • tangent_vec_a (array-like, shape=[…, 2]) – Tangent vector at base_point.

  • tangent_vec_b (array-like, shape=[…, 2]) – Tangent vector at base_point.

  • base_point (array-like, shape=[…, 2]) – Point in the manifold.

Returns:

sectional_curvature (array-like, shape=[…,]) – Sectional curvature at base_point.

class geomstats.information_geometry.normal.UnivariateNormalToPoincareHalfSpaceDiffeo[source]#

Bases: Diffeo

Diffeomorphism from univariate normal to Poincare half space.

diffeomorphism(base_point)[source]#

Image of base point in the Poincare upper half-plane.

This is the image by the diffeomorphism \((mean, std) -> (mean, \sqrt{2} std)\).

Parameters:

base_point (array-like, shape=[…, 2]) – Point representing a normal distribution. Coordinates are mean and standard deviation.

Returns:

image_point (array-like, shape=[…, 2]) – Image of base_point in the Poincare upper half-plane.

inverse_diffeomorphism(image_point)[source]#

Inverse image of a point in the Poincare upper half-plane.

This is the inverse image by the diffeomorphism \((mean, std) -> (mean, \sqrt{2} std)\).

Parameters:

image_point (array-like, shape=[…, 2]) – Point in the upper half-plane.

Returns:

base_point (array-like, shape=[…, 2]) – Inverse image of the image point, representing a normal distribution. Coordinates are mean and standard deviation.

inverse_tangent_diffeomorphism(image_tangent_vec, image_point=None, base_point=None)[source]#

Inverse image of tangent vector.

This is the inverse image by the tangent map of the diffeomorphism \((mean, std) -> (mean, \sqrt{2} std)\).

Parameters:
  • image_tangent_vec (array-like, shape=[…, 2]) – Image of a tangent vector at image_point.

  • image_point (array-like, shape=[…, 2]) – Image of a point representing a normal distribution.

  • base_point (array-like, shape=[…, *shape]) – Base point.

Returns:

tangent_vec (array-like, shape=[…, 2]) – Inverse image of image_tangent_vec.

tangent_diffeomorphism(tangent_vec, base_point=None, image_point=None)[source]#

Image of tangent vector.

This is the image by the tangent map of the diffeomorphism \((mean, std) -> (mean, \sqrt{2} std)\).

Parameters:
  • tangent_vec (array-like, shape=[…, 2]) – Tangent vector at base point.

  • base_point (array-like, shape=[…, 2]) – Base point representing a normal distribution.

  • image_point (array-like, shape=[…, *shape]) – Image point.

Returns:

image_tangent_vec (array-like, shape=[…, 2]) – Image tangent vector at image of the base point.

geomstats.information_geometry.poisson module#

Statistical Manifold of Poisson distributions with the Fisher metric.

Lead author: Tra My Nguyen.

class geomstats.information_geometry.poisson.PoissonDistributions(equip=True)[source]#

Bases: InformationManifoldMixin, VectorSpaceOpenSet

Class for the manifold of Poisson distributions.

This is the parameter space of Poisson distributions i.e. [the half-line of positive reals].

belongs(point, atol=1e-12)[source]#

Evaluate if a point belongs to the manifold of Poisson distributions.

Parameters:
  • point (array-like, shape=[…, 1]) – Point to be checked.

  • atol (float) – Tolerance to evaluate positivity. Optional, default: gs.atol

Returns:

belongs (array-like, shape=[…,]) – Boolean indicating whether point represents an Poisson distribution.

static default_metric()[source]#

Metric to equip the space with if equip is True.

point_to_pdf(point)[source]#

Compute pmf associated to point.

Compute the probability mass function of the Poisson distribution with parameters provided by point.

Parameters:

point (array-like, shape=[…, 1]) – Point representing an Poisson distribution (scale).

Returns:

pdf (function) – Probability mass function of the Poisson distribution with scale parameter provided by point.

projection(point, atol=1e-12)[source]#

Project a point in ambient space to the open set.

Return a point belonging to the half-line of positive reals within the given tolerance.

Parameters:
  • point (array-like, shape=[…, 1]) – Point in ambient space.

  • atol (float) – Tolerance to evaluate positivity.

Returns:

projected (array-like, shape=[…, 1]) – Projected point.

random_point(n_samples=1, bound=1.0)[source]#

Sample parameters of Poisson distributions.

The uniform distribution on (0, bound) is used.

Parameters:
  • n_samples (int) – Number of samples. Optional, default: 1.

  • bound (float) – Right-end ot the segment where Poisson parameters are sampled. Optional, default: 1.

Returns:

samples (array-like, shape=[n_samples,]) – Sample of points representing Poisson distributions.

sample(point, n_samples=1)[source]#

Sample from the Poisson distribution.

Sample from the Poisson distribution with parameter provided by point.

Parameters:
  • point (array-like, shape=[…, 1]) – Point representing an Poisson distribution.

  • n_samples (int) – Number of points to sample with each parameter in point. Optional, default: 1.

Returns:

samples (array-like, shape=[…, n_samples]) – Sample from Poisson distributions.

class geomstats.information_geometry.poisson.PoissonDistributionsRandomVariable(space)[source]#

Bases: ScipyUnivariateRandomVariable

A Poisson random variable.

pdf(x, point)[source]#

Evaluate the probability density function at x.

Parameters:
  • x (array-like, shape=[n_samples, *support_shape]) – Sample points at which to compute the probability density function.

  • point (array-like, shape=[…, *space.shape]) – Point representing a distribution.

Returns:

pdf_at_x (array-like, shape=[…, n_samples, *support_shape]) – Values of pdf at x for each value of the parameters provided by point.

class geomstats.information_geometry.poisson.PoissonMetric(space, signature=None)[source]#

Bases: RiemannianMetric

Class for the Fisher information metric on Poisson distributions.

References

[AM1981]

Atkinson, C., & Mitchell, A. F. (1981). Rao’s distance measure. Sankhyā: The Indian Journal of Statistics, Series A, 345-365.

exp(tangent_vec, base_point)[source]#

Compute exp map of a base point in tangent vector direction.

Parameters:
  • tangent_vec (array-like, shape=[…, 1]) – Tangent vector at base point.

  • base_point (array-like, shape=[…, 1]) – Base point.

Returns:

exp (array-like, shape=[…, 1]) – End point of the geodesic starting at base_point with initial velocity tangent_vec.

log(point, base_point)[source]#

Compute log map using a base point and an end point.

Parameters:
  • point (array-like, shape=[…, 1]) – End point.

  • base_point (array-like, shape=[…, 1]) – Base point.

Returns:

tangent_vec (array-like, shape=[…, 1]) – Initial velocity of the geodesic starting at base_point and reaching point at time 1.

metric_matrix(base_point)[source]#

Compute the metric matrix at the tangent space at base_point.

Parameters:

base_point (array-like, shape=[…, 1]) – Point representing a Poisson distribution.

Returns:

mat (array-like, shape=[…, 1, 1]) – Metric matrix.

squared_dist(point_a, point_b)[source]#

Compute squared distance associated with the Poisson metric.

Parameters:
  • point_a (array-like, shape=[…, 1]) – Point representing an Poisson distribution (lambda parameter).

  • point_b (array-like, shape=[…, 1]) – Point representing a Poisson distribution (lambda parameter).

Returns:

squared_dist (array-like, shape=[…,]) – Squared distance between points point_a and point_b.

Module contents#