Datasets#

Utils#

Utilities for loading datasets.

toponetx.datasets.utils.load_ppi()[source]#

Load the protein-protein-interaction graph.

In the graph, a high interaction score is represented as a low edge weight.

Returns:

networkx.Graph: Graph with nodes that are proteins and edges corresponding to their chemical interactions.

References

https://towardsdatascience.com/visualizing-protein-networks-in-python-58a9b51be9d5
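The weight convention matters for path-based algorithms, which treat an edge weight as a cost. The following is a minimal sketch of that idea, assuming scores in [0, 1] and the mapping weight = 1 - score. The actual transformation used by load_ppi is not stated here, and the protein names and scores are made up for illustration.

```python
import networkx as nx

# Hypothetical interaction scores in [0, 1]; not taken from the real dataset.
scores = {
    ("A", "B"): 0.9,
    ("B", "C"): 0.8,
    ("A", "C"): 0.2,
}

G = nx.Graph()
for (u, v), score in scores.items():
    # Assumed convention: strong interaction -> small weight (low cost).
    G.add_edge(u, v, weight=1.0 - score)

# Weighted shortest paths then prefer chains of strong interactions
# over a single weak direct interaction.
path = nx.shortest_path(G, "A", "C", weight="weight")
```

Under this assumed mapping, the route through B costs less than the weak direct edge between A and C, so the shortest path follows the two strong interactions.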

Graph#

Various examples of named graphs represented as complexes.

toponetx.datasets.graph.coauthorship() → SimplicialComplex[source]#

Load the coauthorship network as a simplicial complex.

The coauthorship network is a simplicial complex in which a paper with k authors is represented by a (k-1)-simplex. The dataset is pre-processed as in [1]: 80 papers with between 5 and 10 citations were sampled from the Semantic Scholar Open Research Corpus.

The papers constitute simplices in the complex, which is completed with sub-simplices (seen as collaborations between subsets of authors) to form a simplicial complex. An attribute named citations is added to each simplex, corresponding to the sum of citations of all papers on which the authors represented by the simplex collaborated. The resulting simplicial complex is of dimension 10 and contains 24552 simplices in total. See [1] for a more detailed description of the dataset.

Returns:

SimplicialComplex: The simplicial complex comes with the attribute citations, the number of citations attributed to the given collaborations of k authors.

References

[1] Stefania Ebli, Michael Defferrard and Gard Spreemann. Simplicial Neural Networks. Topological Data Analysis and Beyond workshop at NeurIPS. https://arxiv.org/abs/2010.03633
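The construction described above (papers as simplices, completion with sub-simplices, summed citations) can be sketched in plain Python. The papers, author names, and citation counts below are hypothetical, and this is only an illustration of the idea, not the pre-processing code from [1].

```python
from itertools import combinations

# Hypothetical papers: (authors, citation count). Not from the real corpus.
papers = [
    (("ann", "bob", "cat"), 7),
    (("ann", "bob"), 5),
]

citations = {}
for authors, n_citations in papers:
    authors = tuple(sorted(authors))
    # A paper with k authors is a (k-1)-simplex; adding every non-empty
    # subset of its authors closes the complex under taking faces.
    for size in range(1, len(authors) + 1):
        for simplex in combinations(authors, size):
            # Each collaboration accumulates the citations of all papers
            # its authors wrote together.
            citations[simplex] = citations.get(simplex, 0) + n_citations

# The dimension of the complex is one less than the largest author set.
dimension = max(len(simplex) for simplex in citations) - 1
```

Here the pair ("ann", "bob") appears in both papers, so its citations value is the sum of both counts, while ("cat",) only inherits the count of the three-author paper.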

toponetx.datasets.graph.karate_club(complex_type: Literal['cell'], feat_dim: int = 2) → CellComplex[source]#

toponetx.datasets.graph.karate_club(complex_type: Literal['simplicial'], feat_dim: int = 2) → SimplicialComplex

Load the karate club as featured cell/simplicial complex.

Parameters:

complex_type {‘simplicial’, ‘cell’}: The type of complex to load.
feat_dim int, default=2: The number of eigenvectors to be attached to the simplices/cells of the output complex.

Returns:

SimplicialComplex or CellComplex

When input is “simplicial”: A SimplicialComplex obtained from karate club graph by lifting the graph to its clique complex. The simplicial complex comes with the following features:

“node_feat”: its value is the first feat_dim Hodge Laplacian eigenvectors attached to nodes.
“edge_feat”: its value is the first feat_dim Hodge Laplacian eigenvectors attached to edges.
“face_feat”: its value is the first feat_dim Hodge Laplacian eigenvectors attached to faces.
“tetrahedron_feat”: its value is the first feat_dim Hodge Laplacian eigenvectors attached to tetrahedra.

When input is “cell”: A CellComplex obtained from the karate club graph by lifting it to a cell complex, adding the independent homology cycles of the graph as 2-cells. The cell complex comes with the following features:

“node_feat”: its value is the first feat_dim Hodge Laplacian eigenvectors attached to nodes.
“edge_feat”: its value is the first feat_dim Hodge Laplacian eigenvectors attached to edges.
“cell_feat”: its value is the first feat_dim Hodge Laplacian eigenvectors attached to cells.

Raises:

ValueError: If complex_type is not one of the supported values.

Notes

A featured simplicial complex is returned as the clique complex of the graph. A featured cell complex is returned as the cell complex obtained by adding the independent cycles of the graph.
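To make the feature description concrete, here is a minimal NumPy sketch of Hodge Laplacian eigenvector features on a toy complex (a filled triangle with one extra edge, not the karate club). It assumes that "first" means the eigenvectors with the smallest eigenvalues; the ordering and normalization actually used by karate_club are not stated here, and first_eigenvectors is a hypothetical helper.

```python
import numpy as np

# Toy complex: a filled triangle {0, 1, 2} plus the edge (2, 3).
# Signed boundary matrices; an edge (i, j) with i < j is oriented from i to j.
B1 = np.array(  # nodes x edges, columns: (0,1), (0,2), (1,2), (2,3)
    [
        [-1, -1, 0, 0],
        [1, 0, -1, 0],
        [0, 1, 1, -1],
        [0, 0, 0, 1],
    ],
    dtype=float,
)
B2 = np.array([[1], [-1], [1], [0]], dtype=float)  # edges x faces, face (0,1,2)

# Hodge Laplacians of rank 0 (nodes) and rank 1 (edges).
L0 = B1 @ B1.T
L1 = B1.T @ B1 + B2 @ B2.T


def first_eigenvectors(laplacian, feat_dim):
    """Return the eigenvectors of the feat_dim smallest eigenvalues as columns."""
    _, vectors = np.linalg.eigh(laplacian)  # eigenvalues come back ascending
    return vectors[:, :feat_dim]


feat_dim = 2
node_feat = first_eigenvectors(L0, feat_dim)  # one row per node
edge_feat = first_eigenvectors(L1, feat_dim)  # one row per edge
```

Each row of node_feat or edge_feat is then a feat_dim-dimensional feature vector for one node or edge, which mirrors the shape of the attributes described above.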

Mesh#

Various examples of named meshes represented as complexes.

toponetx.datasets.mesh.coseg(data: Literal['alien', 'vase', 'chair'] = 'alien')[source]#

Load coseg mesh segmentation datasets.

The coseg dataset was downloaded and processed from the repo: Ideas-Laboratory/shape-coseg-dataset

Parameters:

data {‘alien’, ‘vase’, ‘chair’}, default=‘alien’: The name of the coseg dataset to be loaded. Options are ‘alien’, ‘vase’, or ‘chair’.

Returns:

npz file: The npz file stores the complexes of the coseg segmentation dataset along with their node, edge, and face features.

Raises:

RuntimeError: If the dataset is not found in DIR.

Notes

Each npz file stores 5 keys: “complexes”, “node_feat”, “edge_feat”, “face_feat”, and “face_label”.

complexes : stores the simplicial complex of each mesh
node_feat : stores a 6-dimensional node feature vector: position and normal of each node in the mesh
edge_feat : stores a 10-dimensional edge feature vector: dihedral angle, edge span, 2 edge angles in the triangle, 6 edge ratios
face_feat : stores a 7-dimensional face feature vector: face area (1 feat), face normal (3 feat), face angles (3 feat)
face_label : stores the mesh segmentation label of each face

Examples

>>> coseg_data = coseg("alien")
>>> complexes = coseg_data["complexes"]
>>> node_feat = coseg_data["node_feat"]
>>> edge_feat = coseg_data["edge_feat"]
>>> face_feat = coseg_data["face_feat"]
>>> face_label = coseg_data["face_label"]
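As an illustration of what a 7-dimensional face feature vector (area, normal, angles) contains, the sketch below computes these quantities for a single triangle with NumPy. The function face_features is hypothetical, and the ordering and normalization used in the actual dataset are assumptions.

```python
import numpy as np


def face_features(p0, p1, p2):
    """Return [area, unit normal (3), interior angles (3)] for one triangle."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    cross = np.cross(p1 - p0, p2 - p0)
    norm = np.linalg.norm(cross)
    area = 0.5 * norm
    normal = cross / norm  # unit normal; assumes a non-degenerate triangle

    def angle_at(a, b, c):
        # Interior angle at vertex a between the edges a->b and a->c.
        u, v = b - a, c - a
        cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        return np.arccos(np.clip(cos, -1.0, 1.0))

    angles = [angle_at(p0, p1, p2), angle_at(p1, p2, p0), angle_at(p2, p0, p1)]
    return np.concatenate(([area], normal, angles))


# Right triangle in the xy-plane.
feat = face_features([0, 0, 0], [1, 0, 0], [0, 1, 0])
```

For this right triangle the vector holds an area of 0.5, the normal (0, 0, 1), and three angles that sum to pi.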

toponetx.datasets.mesh.shrec_16(size: Literal['full', 'small'] = 'full')[source]#

Load the training/testing SHREC 16 datasets.

Parameters:

size {‘full’, ‘small’}, default=‘full’: Dataset size. Options are “full” or “small”.

Returns:

tuple of two npz files: The npz files store the training/testing complexes of the SHREC 16 dataset along with their node, edge, and face features.

Raises:

RuntimeError: If the dataset is not found in DIR.

Notes

Each npz file stores 5 keys: “complexes”, “label”, “node_feat”, “edge_feat”, and “face_feat”.

complexes : stores the simplicial complex of each mesh
node_feat : stores a 6-dimensional node feature vector: position and normal of each node in the mesh
edge_feat : stores a 10-dimensional edge feature vector: dihedral angle, edge span, 2 edge angles in the triangle, 6 edge ratios
face_feat : stores a 7-dimensional face feature vector: face area (1 feat), face normal (3 feat), face angles (3 feat)
label : stores the label of the mesh

Examples

>>> shrec_training, shrec_testing = shrec_16()
>>> # training dataset
>>> training_complexes = shrec_training["complexes"]
>>> training_labels = shrec_training["label"]
>>> training_node_feat = shrec_training["node_feat"]
>>> training_edge_feat = shrec_training["edge_feat"]
>>> training_face_feat = shrec_training["face_feat"]
>>> # testing dataset
>>> testing_complexes = shrec_testing["complexes"]
>>> testing_labels = shrec_testing["label"]
>>> testing_node_feat = shrec_testing["node_feat"]
>>> testing_edge_feat = shrec_testing["edge_feat"]
>>> testing_face_feat = shrec_testing["face_feat"]
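The following sketch shows one way an archive with this key layout could be written and read back with NumPy, using placeholder data. It assumes that per-mesh features of different sizes are stored as object arrays, which is why allow_pickle=True is needed on load; the real files may be organized differently, the helper ragged is hypothetical, and the “complexes” key is left out to keep the example self-contained.

```python
import io

import numpy as np

# Two hypothetical meshes with different sizes; random values as placeholders.
rng = np.random.default_rng(0)
n_nodes, n_edges, n_faces = [4, 5], [6, 8], [4, 5]


def ragged(counts, dim):
    """Pack one (count, dim) feature array per mesh into an object array."""
    out = np.empty(len(counts), dtype=object)
    for i, count in enumerate(counts):
        out[i] = rng.normal(size=(count, dim))
    return out


buffer = io.BytesIO()
np.savez(
    buffer,
    label=np.array([0, 1]),
    node_feat=ragged(n_nodes, 6),
    edge_feat=ragged(n_edges, 10),
    face_feat=ragged(n_faces, 7),
)
buffer.seek(0)

# Object arrays are pickled inside the archive, so loading needs allow_pickle.
data = np.load(buffer, allow_pickle=True)
shapes = [data[key][1].shape for key in ("node_feat", "edge_feat", "face_feat")]
```

Indexing a key and then a mesh position, as in data["node_feat"][1], yields the feature matrix of that mesh with one row per node, edge, or face.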

toponetx.datasets.mesh.stanford_bunny(complex_type: Literal['cell']) → CellComplex[source]#

toponetx.datasets.mesh.stanford_bunny(complex_type: Literal['simplicial']) → SimplicialComplex

Load the Stanford Bunny mesh as a complex.

Parameters:

complex_type {‘cell’, ‘simplicial’}: The type of complex to load. Supported values are “simplicial” and “cell”. The default is “simplicial”.

Returns:

SimplicialComplex or CellComplex: The loaded complex of the specified type.

Raises:

ValueError: If complex_type is not one of the supported values.
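The two signatures above follow the usual typing.overload pattern, where the literal value of complex_type determines the return type. A minimal sketch of that pattern, using stand-in classes and a hypothetical load_mesh function rather than the library's actual implementation:

```python
from typing import Literal, overload


class SimplicialComplex:  # stand-in for the toponetx class
    pass


class CellComplex:  # stand-in for the toponetx class
    pass


@overload
def load_mesh(complex_type: Literal["cell"]) -> CellComplex: ...
@overload
def load_mesh(complex_type: Literal["simplicial"]) -> SimplicialComplex: ...


def load_mesh(complex_type):
    """Hypothetical loader that dispatches on complex_type."""
    if complex_type == "simplicial":
        return SimplicialComplex()
    if complex_type == "cell":
        return CellComplex()
    # Anything else is rejected, matching the documented ValueError.
    raise ValueError(
        f"complex_type must be 'simplicial' or 'cell', got {complex_type!r}"
    )
```

With this pattern a type checker can infer CellComplex for load_mesh("cell") and SimplicialComplex for load_mesh("simplicial"), while unsupported values fail at runtime with a ValueError.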