geomstats.geometry.stratified package#

Submodules#

geomstats.geometry.stratified.graph_space module#

Graph Space.

Lead author: Anna Calissano.

class geomstats.geometry.stratified.graph_space.ExhaustiveAligner[source]#

Bases: _BaseAligner

Brute force exact alignment.

Exact Alignment obtained by exploring the whole permutation group.

Notes

Not recommended for large n_nodes.

align(space, base_graph, graph_to_permute)[source]#

Align graphs.

Parameters:
  • space (GraphSpace)

  • base_graph (array-like, shape=[…, n_nodes, n_nodes]) – Base graph.

  • graph_to_permute (array-like, shape=[…, n_nodes, n_nodes]) – Graph to align.

Returns:

permuted_graph (array-like, shape=[…, n_nodes, n_nodes]) – Permuted graph as to be aligned with respect to the geodesic.

class geomstats.geometry.stratified.graph_space.FAQAligner[source]#

Bases: _BaseAligner

Fast Quadratic Assignment for graph matching (or network alignment).

References

[Vogelstein2015]

Vogelstein JT, Conroy JM, Lyzinski V, Podrazik LJ, Kratzer SG, Harley ET, Fishkind DE, Vogelstein RJ, Priebe CE. “Fast approximate quadratic programming for graph matching.“ PLoS One. 2015 Apr 17; doi: 10.1371/journal.pone.0121002.

align(space, base_graph, graph_to_permute)[source]#

Align graphs.

Parameters:
  • space (GraphSpace)

  • base_graph (array-like, shape=[…, n_nodes, n_nodes]) – Base graph.

  • graph_to_permute (array-like, shape=[…, n_nodes, n_nodes]) – Graph to align.

Returns:

permuted_graph (array-like, shape=[…, n_nodes, n_nodes]) – Permuted graph as to be aligned with respect to the geodesic.

class geomstats.geometry.stratified.graph_space.GraphPoint(adj)[source]#

Bases: Point

Class for the GraphPoint.

Points are represented by \(nodes \times nodes\) adjacency matrices.

Parameters:

adj (array-like, shape=[n_nodes, n_nodes]) – Adjacency matrix.

References

[Calissano2020]

Calissano, A., Feragen, A., Vantini, S. “Graph Space: Geodesic Principal Components for a Population of Network-valued Data.” Mox report 14, 2020. https://mox.polimi.it/reports-and-theses/publication-results/?id=855.

property n_nodes#

Retrieve the number of nodes.

to_array()[source]#

Return a copy of the adjacency matrix.

to_networkx()[source]#

Turn the graph into a networkx format.

class geomstats.geometry.stratified.graph_space.GraphSpace(n_nodes, total_space=None, equip=True)[source]#

Bases: PointSet

Class for the Graph Space.

Graph Space to analyse populations of labelled and unlabelled graphs. The space focuses on graphs with scalar euclidean attributes on nodes and edges, with a finite number of nodes and both directed and undirected edges. For undirected graphs, use symmetric adjacency matrices. The space is a quotient space obtained by applying the permutation action of nodes to the space of adjacency matrices. Notice that for computation reasons the module works with both the gs.array representation of graph and the GraphPoint representation.

Points are represented by \(nodes \times nodes\) adjacency matrices. Both the array input and the Graph Point type input work.

Parameters:
  • n_nodes (int) – Number of graph nodes

  • total_space (space) – Total Space before applying the permutation action. Default: Euclidean Space.

References

[Calissano2020]

Calissano, A., Feragen, A., Vantini, S. “Graph Space: Geodesic Principal Components for a Population of Network-valued Data.” Mox report 14, 2020. https://mox.polimi.it/reports-and-theses/publication-results/?id=855.

[Jain2009]

Jain, B., Obermayer, K. “Structure Spaces.” Journal of Machine Learning Research, 10(11), 2009. https://www.jmlr.org/papers/volume10/jain09a/jain09a.pdf

belongs(graphs, atol=1e-12)[source]#

Check if the point belongs to the space.

Parameters:
  • graphs (list of GraphPoint or array-like, shape=[…, n_nodes, n_nodes]) – Points to be checked.

  • atol (float) – Tolerance. Optional, default: backend atol.

Returns:

belongs (array-like, shape=[…, n_nodes]) – Boolean denoting if graph belongs to the space.

static default_metric()[source]#

Metric to equip the space with if equip is True.

pad_with_zeros(points)[source]#

Pad points with zeros to match space dimension.

Parameters:

points (list of GraphPoint or array-like, shape=[…, n_nodes, n_nodes])

permute(graph_to_permute, permutation)[source]#

Permutation action applied to graph observation.

Parameters:
  • graph_to_permute (list of GraphPoint or array-like,) – shape=[…, n_nodes, n_nodes] Input graphs to be permuted.

  • permutation (array-like, shape=[…, n_nodes]) – Node permutations where in position i we have the value j meaning the node i should be permuted with node j.

Returns:

graphs_permuted (array-like, shape=[…, n_nodes, n_nodes]) – Graphs permuted.

random_point(n_samples=1, bound=1.0)[source]#

Sample in Graph Space.

Parameters:
  • n_samples (int) – Number of samples. Optional, default: 1.

  • bound (float) – Bound of the interval in which to sample in the tangent space. Optional, default: 1.

Returns:

graph_samples (array-like, shape=[…, n_nodes, n_nodes]) – Points sampled in GraphSpace(n_nodes).

set_to_array(points)[source]#

Return a copy of the adjacency matrices.

Parameters:

points (list of GraphPoint or array-like, shape=[…, n_nodes, n_nodes]) – Points to be turned into an array

Returns:

graph_array (array-like, shape=[…, n_nodes, n_nodes]) – An array containing all the Graphs.

set_to_networkx(points)[source]#

Turn points into a networkx object.

Parameters:

points (list of GraphPoint or array-like, shape=[…, n_nodes, n_nodes])

Returns:

nx_list (list of networkx object) – An array containing all the Graphs.

class geomstats.geometry.stratified.graph_space.GraphSpaceMetric(space)[source]#

Bases: PointSetMetric

Class for the Graph Space Metric.

Every metric \(d: X \times X \rightarrow \mathbb{R}\) on the total space of adjacency matrices can descend onto the quotient space as a pseudo-metric: \(d([x_1],[x_2]) = min_{t\in T} d_X(x_1, t^Tx_2t)\). The metric relies on the total space metric and an alignment procedure, i.e., Graph Matching or Networks alignment procedure. Metric, alignment, geodesics, and alignment with respect to a geodesic are defined. By default, the alignment is the identity and the total space metric is the Frobenious norm.

Parameters:

space (GraphSpace) – GraphSpace object.

References

[Calissano2020]

Calissano, A., Feragen, A., Vantini, S. “Graph Space: Geodesic Principal Components for a Population of Network-valued Data.” Mox report 14, 2020. https://mox.polimi.it/reports-and-theses/publication-results/?id=855.

MAP_ALIGNER = {'FAQ': <class 'geomstats.geometry.stratified.graph_space.FAQAligner'>, 'ID': <class 'geomstats.geometry.stratified.graph_space.IDAligner'>, 'exhaustive': <class 'geomstats.geometry.stratified.graph_space.ExhaustiveAligner'>}#
align_point_to_geodesic(geodesic, graph_to_permute)[source]#

Align graph to a geodesic.

Using the selected alignment technique, it returns the permuted graph_to_permute as optimally aligned to the geodesic using [Huckemann2010].

Parameters:
  • geodesic (function) – Geodesic in Graph Space function.

  • graph_to_permute (list of Graph or array-like, shape=[…, n_nodes, n_nodes]) – Graph to align.

Returns:

permuted_graph (list, shape = […, n_nodes, n_nodes])

References

[Huckemann2010]

Huckemann, S., Hotz, T., Munk, A. “Intrinsic shape analysis: Geodesic PCA for Riemannian manifolds modulo isometric Lie group actions.” Statistica Sinica, 1-58, 2010.

align_point_to_point(base_graph, graph_to_permute)[source]#

Align graphs.

Using the selected alignment technique, it returns the permuted graph_to_permute as optimally aligned to the base_graph.

Parameters:
  • base_graph (list of Graph or array-like, shape=[…, n_nodes, n_nodes]) – Base graph.

  • graph_to_permute (list of Graph or array-like, shape=[…, n_nodes, n_nodes]) – Graph to align.

Returns:

permuted_graph (list, shape = […, n_nodes, n_nodes])

dist(graph_a, graph_b)[source]#

Compute distance between two equivalence classes.

Compute the distance between two equivalence classes of adjacency matrices [Jain2009].

Parameters:
  • graph_a (list of GraphPoint or array-like, shape=[…, n_nodes, n_nodes])

  • graph_b (list of GraphPoint or array-like, shape=[…, n_nodes, n_nodes])

Returns:

distance (array-like, shape=[…]) – distance between equivalence classes.

References

[Jain2009]

Jain, B., Obermayer, K. “Structure Spaces.” Journal of Machine Learning Research 10.11 (2009). https://www.jmlr.org/papers/v10/jain09a.html.

geodesic(base_point, end_point)[source]#

Compute geodesic between two equivalence classes.

Compute the geodesic between two equivalence classes of adjacency matrices [Calissano2020].

Parameters:
  • base_point (list of GraphPoint or array-like, shape=[…, n_nodes, n_nodes]) – Start .

  • end_point (list of GraphPoint or array-like, shape=[…, n_nodes, n_nodes]) – Second graph to align to the first graph.

Returns:

geodesic (callable) – Geodesic function.

References

[Calissano2020]

Calissano, A., Feragen, A., Vantini, S.

“Graph Space: Geodesic Principal Components for a Population of Network-valued Data.” Mox report 14, 2020. https://mox.polimi.it/reports-and-theses/publication-results/?id=855.

property n_nodes#

Retrieve the number of nodes.

property perm_#

Permutation of nodes after alignment.

Node permutations where in position i we have the value j meaning the node i should be permuted with node j.

set_aligner(aligner, **kwargs)[source]#

Set the aligning strategy.

Graph Space metric relies on alignment. In this module we propose the identity matching, the FAQ graph matching by [Vogelstein2015], and exhaustive aligner which explores the whole permutation group.

Parameters:

aligner (str) – ‘ID’ Identity ‘FAQ’ Fast Quadratic Assignment - only compatible with Frobenious norm ‘exhaustive’ all group exhaustive search

References

[Vogelstein2015]

Vogelstein JT, Conroy JM, Lyzinski V, Podrazik LJ, Kratzer SG, Harley ET, Fishkind DE, Vogelstein RJ, Priebe CE. “Fast approximate quadratic programming for graph matching.“ PLoS One. 2015 Apr 17; doi: 10.1371/journal.pone.0121002.

set_point_to_geodesic_aligner(aligner, **kwargs)[source]#

Set the alignment between a point and a geodesic.

Following the geodesic to point alignment in [Calissano2020] and [Huckemann2010], this function defines the parameters [s_min, s_max] and the number of points to sample in the domain.

Parameters:
  • s_min (float) – Minimum value of the domain to sample along the geodesics.

  • s_max (float) – Minimum value of the domain to sample along the geodesics.

  • n_points (int) – Number of points to sample between s_min and s_max.

References

[Calissano2020]

Calissano, A., Feragen, A., Vantini, S. “Graph Space: Geodesic Principal Components for a Population of Network-valued Data.” Mox report 14, 2020. https://mox.polimi.it/reports-and-theses/publication-results/?id=855.

[Huckemann2010]

Huckemann, S., Hotz, T., Munk, A. “Intrinsic shape analysis: Geodesic PCA for Riemannian manifolds modulo isometric Lie group actions.” Statistica Sinica, 1-58, 2010.

property total_space_metric#

Retrieve the total space metric.

class geomstats.geometry.stratified.graph_space.IDAligner[source]#

Bases: _BaseAligner

Identity alignment.

The identity alignment is not performing any matching but returning the nodes in their original position. This alignment can be selected when working with labelled graphs.

align(space, base_graph, graph_to_permute)[source]#

Align graphs.

Parameters:
  • space (GraphSpace)

  • base_graph (array-like, shape=[…, n_nodes, n_nodes]) – Base graph.

  • graph_to_permute (array-like, shape=[…, n_nodes, n_nodes]) – Graph to align.

Returns:

permuted_graph (array-like, shape=[…, n_nodes, n_nodes]) – Permuted graph as to be aligned with respect to the geodesic.

class geomstats.geometry.stratified.graph_space.PointToGeodesicAligner(s_min, s_max, n_points=10)[source]#

Bases: _BasePointToGeodesicAligner

Class for the Alignment of the points with respect to a geodesic.

Implementing the algorithm in [Huckemann2010] to select an optimal alignment to a point with respect to a geodesic. The algorithm sample discrete set of n_points along the geodesic between [s_min, s_max] and find the permutation that gets closer to the datapoints along the geodesic.

Parameters:
  • space (GraphSpace)

  • s_min (float) – Minimum value of the domain to sample along the geodesics.

  • s_max (float) – Minimum value of the domain to sample along the geodesics.

  • n_points (int) – Number of points to sample between s_min and s_max.

References

[Calissano2020]

Calissano, A., Feragen, A., Vantini, S. “Graph Space: Geodesic Principal Components for a Population of Network-valued Data.” Mox report 14, 2020. https://mox.polimi.it/reports-and-theses/publication-results/?id=855.

[Huckemann2010]

Huckemann, S., Hotz, T., Munk, A. “Intrinsic shape analysis: Geodesic PCA for Riemannian manifolds modulo isometric Lie group actions.” Statistica Sinica, 1-58, 2010.

align(space, geodesic, graph_to_permute)[source]#

Align the graph to the geodesic.

Parameters:
  • geodesic (function) – Geodesic function in GraphSpace.

  • graph_to_permute (array-like, shape=[…, n_nodes, n_nodes]) – Graph to align.

Returns:

permuted_graph (array-like, shape=[…, n_nodes]) – Permuted graph as to be aligned with respect to the geodesic.

dist(space, geodesic, graph_to_permute)[source]#

Compute the distance between the geodesic and the point.

Parameters:
  • space (GraphSpace)

  • geodesic (function) – Geodesic function in GraphSpace.

  • graph_to_permute (array-like, shape=[…, n_nodes, n_nodes]) – Graph to align.

Returns:

dist (array-like, shape=[…, n_nodes]) – Distance between the graph_to_permute and the geodesic.

property s#

Save the domain discretization.

geomstats.geometry.stratified.point_set module#

Class for Stratified Spaces.

Lead authors: Anna Calissano & Jonas Lueg

class geomstats.geometry.stratified.point_set.Point[source]#

Bases: ABC

Class for points of a set.

abstract to_array()[source]#

Turn the point into a numpy array.

Returns:

array_point (array-like, shape=[…]) – An array representation of the Point type.

class geomstats.geometry.stratified.point_set.PointSet(equip=True)[source]#

Bases: ABC

Class for a set of points of type Point.

abstract belongs(point, atol)[source]#

Evaluate if a point belongs to the set.

Parameters:
  • point (Point-like, shape=[…]) – Point to evaluate.

  • atol (float) – Absolute tolerance. Optional, default: backend atol.

Returns:

belongs (array-like, shape=[…]) – Boolean evaluating if point belongs to the set.

equip_with_metric(Metric=None, **metric_kwargs)[source]#

Equip manifold with Metric.

Parameters:

Metric (PointSetMetric object) – If None, default metric will be used.

abstract random_point(n_samples=1)[source]#

Sample random points on the PointSet.

Parameters:

n_samples (int) – Number of samples. Optional, default: 1.

Returns:

samples (List of Point) – Points sampled on the PointSet.

abstract set_to_array(points)[source]#

Convert a set of points into an array.

Parameters:

points (list of Point, shape=[…]) – Number of samples of point type to turn into an array.

Returns:

points_array (array-like, shape=[…]) – Points sampled on the PointSet.

class geomstats.geometry.stratified.point_set.PointSetMetric(space)[source]#

Bases: ABC

Class for the lenght spaces.

abstract dist(point_a, point_b, **kwargs)[source]#

Distance between two points in the PointSet.

Parameters:
  • point_a (Point or List of Point, shape=[…]) – Point in the PointSet.

  • point_b (Point or List of Point, shape=[…]) – Point in the PointSet.

Returns:

distance (array-like, shape=[…]) – Distance.

abstract geodesic(initial_point, end_point, **kwargs)[source]#

Compute the geodesic in the PointSet.

Parameters:
  • initial_point (Point or List of Points, shape=[…]) – Point in the PointSet.

  • end_point (Point or List of Points, shape=[…]) – Point in the PointSet.

Returns:

path (callable) – Time parameterized geodesic curve.

geomstats.geometry.stratified.point_set.broadcast_lists(list_a, list_b)[source]#

Broadcast two lists.

Similar behavior as gs.broadcast_arrays, but for lists.

geomstats.geometry.stratified.spider module#

Class for the spider.

Lead authors: Anna Calissano & Jonas Lueg

class geomstats.geometry.stratified.spider.Spider(n_rays, equip=True)[source]#

Bases: PointSet

Spider: a set of rays attached to the origin.

The k-spider consists of k copies of the positive real line \(\mathbb{R}_{\geq 0}\) glued together at the origin [Feragen2020].

Parameters:

n_rays (int) – Number of rays to attach to the origin. Note that zero counts as the origin not as a ray.

References

[Feragen2020]

Feragen, Aasa, and Tom Nye. “Statistics on stratified spaces.” Riemannian Geometric Statistics in Medical Image Analysis. Academic Press, 2020. 299-342.

belongs(point)[source]#

Check if a random point belongs to the spider set.

Parameters:

point (SpiderPoint or list of SpiderPoint, shape=[…]) – Point to be checked.

Returns:

belongs (array-like, shape=[…]) – Boolean denoting if the SpiderPoint belongs to the Spider Set.

static default_metric()[source]#

Metric to equip the space with if equip is True.

random_point(n_samples=1)[source]#

Compute a random point of the spider set.

Parameters:

n_samples (int) – Number of samples. Optional, default: 1.

Returns:

samples (list of SpiderPoint, shape=[…]) – List of SpiderPoints randomly sampled from the Spider.

set_to_array(point)[source]#

Turn a point into an array compatible with the dimension of the space.

Parameters:

point (SpiderPoint or list of SpiderPoint, shape=[…]) – Points to be checked.

Returns:

point_array (array-like, shape=[…,n_rays]) – An array with the stratum_coord parameter in the stratum position.

class geomstats.geometry.stratified.spider.SpiderMetric(space, ray_metric=None)[source]#

Bases: PointSetMetric

Geometry on the Spider, induced by the rays Geometry.

dist(point_a, point_b)[source]#

Compute the distance between two points on the Spider using the ray geometry.

The spider metric is the metric in each ray extended to the Spider: given two points x, y on different rays, d(x, y) = d(x, 0) + d(0, y).

Parameters:
  • point_a (SpiderPoint or list of SpiderPoint, shape=[…]) – Point in the Spider.

  • point_b (SpiderPoint or list of SpiderPoint, shape=[…]) – Point in the Spider.

Returns:

point_array (array-like, shape=[…]) – An array with the distance.

geodesic(initial_point, end_point)[source]#

Return the geodesic between two lists of Spider points.

Parameters:
  • initial_point (SpiderPoint or list of SpiderPoint, shape=[…]) – Point in the Spider.

  • end_point (SpiderPoint or list of SpiderPoint, shape=[…]) – Point in the Spider.

Returns:

geo (function) – Return a vectorized geodesic function.

property n_rays#

Get number of rays.

class geomstats.geometry.stratified.spider.SpiderPoint(stratum, stratum_coord)[source]#

Bases: Point

Class for points of the Spider.

A point in the Spider is \((s,c) \in \mathbb{N} \times \mathbb{R}\).

Parameters:
  • stratum (int) – The stratum, an integer indicating the stratum the point lies in. If zero, then the point is on the origin.

  • stratum_coord (float) – A positive number, the coordinate of the point. It must be zero if and only if the stratum is zero, i.e. the origin.

to_array()[source]#

Return the hash of the instance.

geomstats.geometry.stratified.wald_space module#

Classes for the Wald Space and elements therein of class Wald and helper classes.

Class Split. Essentially, a Split is a two-set partition of a subset of \(\{0,\dots,n-1\}\). This class is designed such that one part of both parts of the partition can be empty. Splits are corresponding uniquely to edges in a phylogenetic forest, where, if one cuts the edge in the forest, the resulting two-set partition of the labels of the respective component of the forest is the corresponding split.

Class Structure. A structure is a partition into non-empty sets of the set \(\{0,\dots,n-1\}\), together with a set of splits for each element of the partition, where every split is a two-set partition of the respective element. A structure basically describes a phylogenetic forest, where each set of splits gives the structure of the tree with the labels of the corresponding element of the partition.

Class Wald. A wald is essentially a phylogenetic forest with weights between zero and one on the edges. The forest structure is stored as a Structure and the edge weights are an array of length that is equal to the total number of splits in the structure. These elements are the points in Wald space and other phylogenetic forest spaces, like BHV space, although the partition is just the whole set of labels in this case.

Class WaldSpace. A topological space. Points in Wald space are instances of the class Wald: phylogenetic forests with edge weights between 0 and 1. In particular, Wald space is a stratified space, each stratum is called grove. The highest dimensional groves correspond to fully resolved or binary trees. The topology is obtained from embedding wälder into the ambient space of strictly positive \(n\times n\) symmetric matrices, implemented in the class spd.SPDMatrices.

Lead author: Jonas Lueg

References

[Garba21] Garba, M. K., T. M. W. Nye, J. Lueg and S. F. Huckemann.

“Information geometry for phylogenetic trees” Journal of Mathematical Biology, 82(3):19, February 2021a. https://doi.org/10.1007/s00285-021-01553-x.

[Lueg21] Lueg, J., M. K. Garba, T. M. W. Nye, S. F. Huckemann.

“Wald Space for Phylogenetic Trees.” Geometric Science of Information, Lecture Notes in Computer Science, pages 710–717, Cham, 2021. https://doi.org/10.1007/978-3-030-80209-7_76.

class geomstats.geometry.stratified.wald_space.Split(part1, part2)[source]#

Bases: object

Class for two-set partitions of sets.

Two-set partitions of a smaller subset of \(\{0,...,n-1\}\) are also allowed, where \(n\) is a positive integer, which is not passed as an argument as it is nowhere needed.

The parameters part1 and part2 are assigned to the attributes self.part1 and self.part2 in a unique sorted way: the one that contains the smallest minimal value is assigned to self.part1 for consistency.

Parameters:
  • part1 (iterable) – The first part of the split, an iterable that is a subset of \(\{0,\dots,n-1\}\). It may be empty, but must have empty intersection with part2.

  • part2 (iterable) – The second part of the split, an iterable that is a subset of \(\{0,\dots,n-1\}\). It may be empty, but must have empty intersection with part1.

get_part_away_from(other)[source]#

Return the part of this split that is directed away from other split.

Parameters:

other (Split) – The other split.

Returns:

part_that_does_not_point (iterable) – Return the part of the split self that does not point toward other. See self.get_part_towards for further explanation.

get_part_towards(other)[source]#

Return the part of this split that is directed toward the other split.

Each split contains part1 and part2, the parts that the corresponding edge in the graph splits the set of labels into. Thus, one can think of the split as an edge, where part1 points in the direction of the part of the tree where the labels of part1 are contained, and part2 points in the other direction. So, part1 points in the direction of other, if it corresponds to an edge that is contained in the tree that part1 points to, else part2 points in the direction of split.

Parameters:

other (Split) – The other split.

Returns:

part_towards (iterable) – Return the part of the split self that points toward other_split.

is_compatible(other)[source]#

Check whether this split is compatible with another split.

Two splits are compatible, if at least one intersection of the respective parts of the splits is empty.

Parameters:

other (Split) – The other split.

Returns:

is_compatible_with (bool) – Return True if the splits are compatible, else False.

part_contains(subset: set)[source]#

Determine if a subset is contained in either part of a split.

Parameters:

subset (set) – The subset containing labels.

Returns:

is_contained (bool) – A boolean that is true if the subset is contained in self.part1 or self.part2.

restrict_to(subset: set)[source]#

Return the restriction of a split to a subset.

Parameters:

subset (set) – The subset that the split is restricted to.

Returns:

restr_split (Split) – The restricted split, if the split is \(A\vert B\), then the split restricted to the subset \(C\) is \(A\cap C\vert B\cap C\).

separates(u, v)[source]#

Determine whether the labels (or label sets) are separated by the split.

Parameters:
  • u (list of int, int) – Either an integer or a set of labels.

  • v (list of int, int) – Either an integer or a set of labels.

Returns:

is_separated (bool) –

A boolean determining whether u and v are separated by the split (i.e. if

they are not in the same part).

class geomstats.geometry.stratified.wald_space.Topology(n_labels, partition, split_sets)[source]#

Bases: object

The topology of a forest, using a split-based graph-structure representation.

Parameters:
  • n_labels (int) – Number of labels, the set of labels is then \(\{0,\dots,n-1\}\).

  • partition (tuple) – A tuple of tuples that is a partition of the set \(\{0,\dots,n-1\}\), representing the label sets of each connected component of the forest topology.

  • split_sets (tuple) – A tuple of tuples containing splits, where each set of splits contains only splits of the respective label set in the partition, so their order is related. The splits are the edges of the connected components of the forest, respectively, and thus the union of all sets of splits yields all edges of the forest topology.

where#

Give the index of a split in the flattened list of all splits.

Type:

dict

sep#

An increasing list of numbers between 0 and m, where m is the total number of splits in self.split_sets, starting with 0, where each number indicates that a new connected component starts at that index. Useful for example for unraveling the tuple of all splits into self.split_sets.

Type:

list of int

paths#

A list of dictionaries, each dictionary is for the respective connected component of the forest, and the items of each dictionary are for each pair of labels u, v, u < v in the respective component, a list of the splits on the unique path between the labels u and v.

Type:

list of dict

support#

For each split, give an \(n\times n\) dimensional matrix, where the uv-th entry is True if the split separates the labels u and v, else False.

Type:

list of array-like

corr(x)[source]#

Compute the correlation matrix of the topology with edge weights weights.

Parameters:

x (array-like) – Takes a vector of length ‘number of total splits’ of the structure.

Returns:

corr (array-like, shape=[n, n]) – Returns the corresponding correlation matrix.

corr_gradient(x)[source]#

Compute the gradient of the correlation matrix, differentiated by weights.

Parameters:

x (array-like) – The vector weights at which the gradient is computed.

Returns:

gradient (array-like, shape=[n_splits, n, n]) – The gradient of the correlation matrix, differentiated by weights.

static flatten(x)[source]#

Flatten a list of lists into a single list by concatenation.

Parameters:

x (nested list) – The nested list to flatten.

Returns:

x_flat (list, tuple) – The flatted list.

unflatten(x)[source]#

Transform list into list of lists according to separators, self.sep.

The separators are a list of integers, increasing. Then, all elements between to indices in separators will be put into a list, and together, all lists give a nested list.

Parameters:

x (iterable) – The flat list that will be nested.

Returns:

x_nested (list[list]) – The nested list of lists.

class geomstats.geometry.stratified.wald_space.Wald(topology: Topology, weights)[source]#

Bases: Point

A class for wälder, that are phylogenetic forests, elements of the Wald Space.

Parameters:
  • topology (Topology) – The structure of the forest.

  • weights (array-like) – The edge weights, array of floats between 0 and 1, with m entries, where m is the total number of splits/edges in the structure top.

static generate_wald(n_labels, p_keep, p_new, btol=1e-08, check=True)[source]#

Generate a random instance of class Wald.

Parameters:
  • n_labels (int) – The number of labels the wald is generated with respect to.

  • p_keep (float) – The probability will be inserted into the generation of a partition as well as for the generation of a split set for the topology of the wald.

  • p_new (float) – A float between 0 and 1, the probability that no new component is added, and probability of 1 - p_new_ that a new component is added.

  • btol (float) – Tolerance for the boundary of the coordinates in each grove. Defaults to 1e-08.

  • check (bool) – If True, checks if splits still separate all labels. In this case, the split will not be deleted. If False, any split can be randomly deleted.

Returns:

random_wald (Wald) – The randomly generated wald.

property n_labels#

Get number of labels.

to_array()[source]#

Turn the wald into a numpy array, namely its correlation matrix.

Returns:

array_of_wald (array-like, shape=[n, n]) – The correlation matrix corresponding to the wald.

class geomstats.geometry.stratified.wald_space.WaldSpace(n_labels)[source]#

Bases: PointSet

Class for the Wald space, a metric space for phylogenetic forests.

Parameters:

n_labels (int) – Integer determining the number of labels in the forests, and thus the shape of the correlation matrices: n_labels x n_labels.

ambient#

The ambient space, the positive definite n_labels x n_labels matrices that the WaldSpace is embedded into.

belongs(point)[source]#

Check if a point wald belongs to Wald space.

From FUTURE PUBLICATION we know that the corresponding matrix of a wald is strictly positive definite if and only if the labels are separated by at least one edge, which is exactly when the wald is an element of the Wald space.

Parameters:

point (Wald or list of Wald) – The point to be checked.

Returns:

belongs (bool) – Boolean denoting if point belongs to Wald space.

random_point(n_samples=1, p_tree=0.9, p_keep=0.9, btol=1e-08)[source]#

Sample a random point in Wald space.

Parameters:
  • n_samples (int) – Number of samples. Defaults to 1.

  • p_tree (float between 0 and 1) – The probability that the sampled point is a tree, and not a forest. If the probability is equal to 1, then the sampled point will be a tree. Defaults to 0.9.

  • p_keep (float between 0 and 1) – The probability that a sampled edge is kept and not deleted randomly. To be precise, it is not exactly the probability, as some edges cannot be deleted since the requirement that two labels are separated by a split might be violated otherwise. Defaults to 0.9

  • btol (float) – Tolerance for the boundary of the coordinates in each grove. Defaults to 1e-08.

Returns:

samples (Wald or list of Wald, shape=[n_samples]) – Points sampled in Wald space.

set_to_array(points)[source]#

Convert a set of points into an array.

Parameters:

points (list of Wald, shape=[…]) – Number of samples of wälder to turn into an array.

Returns:

points_array (array-like, shape=[…]) – Array of the wälder that are turned into arrays.

Module contents#

The Stratified Space Geometry Package.