Base#

Cell Diff2Vec#

Class CellDiff2Vec.

class topoembedx.classes.cell_diff2vec.CellDiff2Vec(diffusion_number: int = 10, diffusion_cover: int = 80, dimensions: int = 128, workers: int = 4, window_size: int = 5, epochs: int = 1, use_hierarchical_softmax: bool = True, number_of_negative_samples: int = 5, learning_rate: float = 0.05, min_count: int = 1, seed: int = 42)[source]#

Class for CellDiff2Vec.

Parameters:
diffusion_numberint, default=10

Number of diffusions.

diffusion_coverint, default=80

Number of nodes in diffusion.

dimensionsint, default=128

Dimensionality of embedding.

workersint, default=4

Number of cores.

window_sizeint, default=5

Size of the sliding context window used when training on the diffusion sequences.

epochsint, default=1

Number of epochs.

use_hierarchical_softmaxbool, default=True

Whether to use hierarchical softmax or negative sampling to train the model.

number_of_negative_samplesint, default=5

Number of negative nodes to sample (usually between 5-20). If set to 0, no negative sampling is used.

learning_ratefloat, default=0.05

HogWild! learning rate.

min_countint, default=1

Minimal count of node occurrences.

seedint, default=42

Random seed value.

Methods

fit(complex[, neighborhood_type, ...])

Fit a CellDiff2Vec model.

get_cluster_centers()

Getting the cluster centers.

get_embedding([get_dict])

Get embedding.

get_memberships()

Getting the membership dictionary.

get_params()

Get parameter dictionary for this estimator.

set_params(**parameters)

Set the parameters of this estimator.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) None[source]#

Fit a CellDiff2Vec model.

Parameters:
complextoponetx.classes.Complex

A complex object of one of the following types: CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.

neighborhood_type{“adj”, “coadj”}, default=”adj”

The type of neighborhood to compute. “adj” for adjacency matrix, “coadj” for coadjacency matrix.

neighborhood_dimdict

The integer parameters needed to specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters. For cell, simplicial, and path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim[“rank”]. For combinatorial complexes and colored hypergraphs, it is specified by two parameters, neighborhood_dim[“rank”] and neighborhood_dim[“via_rank”].

Notes

Here neighborhood_dim={“rank”: 1, “via_rank”: -1} specifies the rank of the cells for which embeddings are computed: “rank”: 1 means that embeddings are computed for the cells of rank 1. The value of “via_rank” (here -1) is ignored when the input is a cell or simplicial complex; it must be specified when the input is a combinatorial complex or a colored hypergraph.
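
The difference between “adj” and “coadj” at rank 1 can be illustrated on a toy complex. The sketch below is not TopoNetX or TopoEmbedX code: it uses plain tuples, the helper is_on_boundary is hypothetical, and it assumes the usual convention that two rank-1 cells are adjacent when they lie on a common higher-rank cell and coadjacent when they share a lower-rank cell.

```python
from itertools import combinations

# Toy complex written as plain Python data (not a TopoNetX object).
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]  # rank-1 cells
faces = [(1, 2, 3)]  # rank-2 cells


def is_on_boundary(edge, face):
    """True if both endpoints of `edge` belong to `face` (enough for this toy triangle)."""
    return set(edge) <= set(face)


# "adj" at rank 1 (assumed convention): edges lying on a common rank-2 cell.
adjacent = {
    (e, f)
    for e, f in combinations(edges, 2)
    if any(is_on_boundary(e, face) and is_on_boundary(f, face) for face in faces)
}

# "coadj" at rank 1 (assumed convention): edges sharing a rank-0 cell (a node).
coadjacent = {(e, f) for e, f in combinations(edges, 2) if set(e) & set(f)}
```

In this toy example the edge (3, 4) is coadjacent to three other edges but adjacent to none, so the choice of neighborhood_type changes which cells count as neighbors.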

get_embedding(get_dict: bool = False) dict | ndarray[source]#

Get embedding.

Parameters:
get_dictbool, optional

Whether to return a dictionary. Defaults to False.

Returns:
dict or numpy.ndarray

Embedding.

Cell2Vec#

Cell2Vec: a class that extends the Node2Vec class.

class topoembedx.classes.cell2vec.Cell2Vec(walk_number: int = 10, walk_length: int = 80, p: float = 1.0, q: float = 1.0, dimensions: int = 128, workers: int = 4, window_size: int = 5, epochs: int = 1, use_hierarchical_softmax: bool = True, number_of_negative_samples: int = 5, learning_rate: float = 0.05, min_count: int = 1, seed: int = 42)[source]#

Class Cell2Vec.

Cell2Vec extends the Node2Vec class to generate embeddings for simplicial, cell, combinatorial, or dynamic combinatorial complexes. It takes such a complex as input, uses its adjacency or coadjacency matrix to build a graph object with the networkx library, and then runs the Node2Vec algorithm on that graph to generate node embeddings. Users can choose the type of matrix used to build the graph (“adj” for the adjacency matrix or “coadj” for the coadjacency matrix), as well as the dimensions of the neighborhood used for the matrix (e.g. the “rank” and “via_rank” values).

Parameters:
walk_numberint, default=10

Number of random walks to start at each node.

walk_lengthint, default=80

Length of random walks.

pfloat, default=1.0

Return parameter: 1/p is the unnormalized transition probability of moving back to the previous node.

qfloat, default=1.0

In-out parameter: 1/q is the unnormalized transition probability of moving away from the previous node.

dimensionsint, default=128

Dimensionality of embedding.

workersint, default=4

Number of cores.

window_sizeint, default=5

Size of the sliding context window used when training on the random walks.

epochsint, default=1

Number of epochs.

use_hierarchical_softmaxbool, default=True

Whether to use hierarchical softmax or negative sampling to train the model.

number_of_negative_samplesint, default=5

Number of negative nodes to sample (usually between 5-20). If set to 0, no negative sampling is used.

learning_ratefloat, default=0.05

HogWild! learning rate.

min_countint, default=1

Minimal count of node occurrences.

seedint, default=42

Random seed value.
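
The roles of p and q described above can be sketched as follows. This is a minimal illustration of the second-order bias from the node2vec definition for unweighted graphs, not the library's implementation; the helper name biased_weights and the toy graph are hypothetical.

```python
def biased_weights(graph, previous, current, p=1.0, q=1.0):
    """Unnormalized weights for the next step of a biased walk (hypothetical helper).

    `graph` maps each node to the set of its neighbors.
    """
    weights = {}
    for candidate in graph[current]:
        if candidate == previous:
            weights[candidate] = 1.0 / p  # step back to the previous node
        elif candidate in graph[previous]:
            weights[candidate] = 1.0  # stay at distance 1 from the previous node
        else:
            weights[candidate] = 1.0 / q  # move away from the previous node
    return weights


graph = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b"}, "d": {"b"}}
weights = biased_weights(graph, previous="a", current="b", p=2.0, q=0.5)
```

With p=2.0 and q=0.5, returning to "a" gets weight 0.5, the shared neighbor "c" gets 1.0, and the outward node "d" gets 2.0, so the walk is pushed to explore. Normalizing the weights gives the actual step probabilities.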

Methods

fit(complex[, neighborhood_type, ...])

Fit a Cell2Vec model.

get_cluster_centers()

Getting the cluster centers.

get_embedding([get_dict])

Get embedding.

get_memberships()

Getting the membership dictionary.

get_params()

Get parameter dictionary for this estimator.

set_params(**parameters)

Set the parameters of this estimator.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) None[source]#

Fit a Cell2Vec model.

Parameters:
complextoponetx.classes.Complex

A complex object of one of the following types: CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.

neighborhood_type{“adj”, “coadj”}, default=”adj”

The type of neighborhood to compute. “adj” for adjacency matrix, “coadj” for coadjacency matrix.

neighborhood_dimdict

The integer parameters needed to specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters. For cell, simplicial, and path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim[“rank”]. For combinatorial complexes and colored hypergraphs, it is specified by two parameters, neighborhood_dim[“rank”] and neighborhood_dim[“via_rank”].

Notes

Here neighborhood_dim={“rank”: 1, “via_rank”: -1} specifies the rank of the cells for which embeddings are computed: “rank”: 1 means that embeddings are computed for the cells of rank 1. The value of “via_rank” (here -1) is ignored when the input is a cell or simplicial complex; it must be specified when the input is a combinatorial complex or a colored hypergraph.

get_embedding(get_dict: bool = False) dict | ndarray[source]#

Get embedding.

Parameters:
get_dictbool, optional

Whether to return a dictionary. Defaults to False.

Returns:
dict or numpy.ndarray

Embedding.

Deep Cell#

DeepCell class for embedding complex networks using DeepWalk.

class topoembedx.classes.deepcell.DeepCell(walk_number: int = 10, walk_length: int = 80, dimensions: int = 128, workers: int = 4, window_size: int = 5, epochs: int = 1, use_hierarchical_softmax: bool = True, number_of_negative_samples: int = 5, learning_rate: float = 0.05, min_count: int = 1, seed: int = 42)[source]#

Class for DeepCell.

Parameters:
walk_numberint, default=10

Number of random walks to generate for each node.

walk_lengthint, default=80

Length of each random walk.

dimensionsint, default=128

Dimensionality of embedding.

workersint, default=4

Number of parallel workers to use for training.

window_sizeint, default=5

Size of the sliding window.

epochsint, default=1

Number of iterations (epochs).

use_hierarchical_softmaxbool, default=True

Whether to use hierarchical softmax or negative sampling for training.

number_of_negative_samplesint, default=5

Number of negative samples to use for negative sampling.

learning_ratefloat, default=0.05

Learning rate for the model.

min_countint, default=1

Minimum count of words to consider when training the model.

seedint, default=42

Random seed to use for reproducibility.
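
How walk_length and window_size interact can be sketched as follows: each walk is treated like a sentence, and a Word2Vec-style model only relates cells that appear within window_size steps of each other. The helper context_pairs is hypothetical and only illustrates the windowing idea; actual Word2Vec implementations may additionally shrink the window at random.

```python
def context_pairs(walk, window_size):
    """List (center, context) pairs within `window_size` steps (hypothetical helper)."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window_size)
        hi = min(len(walk), i + window_size + 1)
        # Pair the center with every other position inside the window.
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


pairs = context_pairs(["A", "B", "C", "D"], window_size=1)
```

With window_size=1 only consecutive cells of the walk are paired; a larger window also relates cells that are several steps apart.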

Methods

fit(complex[, neighborhood_type, ...])

Fit the model.

get_cluster_centers()

Getting the cluster centers.

get_embedding([get_dict])

Get embeddings.

get_memberships()

Getting the membership dictionary.

get_params()

Get parameter dictionary for this estimator.

set_params(**parameters)

Set the parameters of this estimator.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) None[source]#

Fit the model.

Parameters:
complextoponetx.classes.Complex

A complex object of one of the following types: CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.

neighborhood_type{“adj”, “coadj”}, default=”adj”

The type of neighborhood to compute. “adj” for adjacency matrix, “coadj” for coadjacency matrix.

neighborhood_dimdict

The integer parameters needed to specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters. For cell, simplicial, and path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim[“rank”]. For combinatorial complexes and colored hypergraphs, it is specified by two parameters, neighborhood_dim[“rank”] and neighborhood_dim[“via_rank”].

Notes

Here neighborhood_dim={“rank”: 1, “via_rank”: -1} specifies the rank of the cells for which embeddings are computed: “rank”: 1 means that embeddings are computed for the cells of rank 1. The value of “via_rank” (here -1) is ignored when the input is a cell or simplicial complex; it must be specified when the input is a combinatorial complex or a colored hypergraph.

get_embedding(get_dict: bool = False) dict | ndarray[source]#

Get embeddings.

Parameters:
get_dictbool, default=False

Whether to return a dictionary of the embedding.

Returns:
dict or np.ndarray

The embedding of the complex.

Higher Order Laplacian Eigenmaps#

Higher Order Laplacian Eigenmaps.

class topoembedx.classes.higher_order_laplacian_eigenmaps.HigherOrderLaplacianEigenmaps(dimensions: int = 3, maximum_number_of_iterations: int = 100, seed: int = 42)[source]#

Class for Higher Order Laplacian Eigenmaps.

Parameters:
dimensionsint, default=3

Dimensionality of embedding.

maximum_number_of_iterationsint, default=100

Maximum number of iterations.

seedint, default=42

Random seed value.

Methods

fit(complex[, neighborhood_type, ...])

Fit a Higher Order Laplacian Eigenmaps model.

get_cluster_centers()

Getting the cluster centers.

get_embedding([get_dict])

Get embeddings.

get_memberships()

Getting the membership dictionary.

get_params()

Get parameter dictionary for this estimator.

set_params(**parameters)

Set the parameters of this estimator.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) None[source]#

Fit a Higher Order Laplacian Eigenmaps model.

Parameters:
complextoponetx.classes.Complex

A complex object of one of the following types: CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.

neighborhood_type{“adj”, “coadj”}, default=”adj”

The type of neighborhood to compute. “adj” for adjacency matrix, “coadj” for coadjacency matrix.

neighborhood_dimdict

The integer parameters needed to specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters. For cell, simplicial, and path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim[“rank”]. For combinatorial complexes and colored hypergraphs, it is specified by two parameters, neighborhood_dim[“rank”] and neighborhood_dim[“via_rank”].

Notes

Here neighborhood_dim={“rank”: 1, “via_rank”: -1} specifies the rank of the cells for which embeddings are computed: “rank”: 1 means that embeddings are computed for the cells of rank 1. The value of “via_rank” (here -1) is ignored when the input is a cell or simplicial complex; it must be specified when the input is a combinatorial complex or a colored hypergraph.

get_embedding(get_dict: bool = False) dict | ndarray[source]#

Get embeddings.

Parameters:
get_dictbool, default=False

Whether to return a dictionary of the embedding.

Returns:
dict or np.ndarray

The embedding of the complex.

Hoglee#

Higher Order Geometric Laplacian EigenMaps (HOGLEE) class.

class topoembedx.classes.hoglee.HOGLEE(dimensions: int = 128, seed: int = 42)[source]#

Class for Higher Order Geometric Laplacian EigenMaps (HOGLEE).

Parameters:
dimensionsint, default=128

Dimensionality of embedding.

seedint, default=42

Random seed value.

Methods

fit(complex[, neighborhood_type, ...])

Fit a Higher Order Geometric Laplacian EigenMaps model.

get_cluster_centers()

Getting the cluster centers.

get_embedding([get_dict])

Get embedding.

get_memberships()

Getting the membership dictionary.

get_params()

Get parameter dictionary for this estimator.

set_params(**parameters)

Set the parameters of this estimator.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) None[source]#

Fit a Higher Order Geometric Laplacian EigenMaps model.

Parameters:
complextoponetx.classes.Complex

A complex object of one of the following types: CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.

neighborhood_type{“adj”, “coadj”}, default=”adj”

The type of neighborhood to compute. “adj” for adjacency matrix, “coadj” for coadjacency matrix.

neighborhood_dimdict

The integer parameters needed to specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters. For cell, simplicial, and path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim[“rank”]. For combinatorial complexes and colored hypergraphs, it is specified by two parameters, neighborhood_dim[“rank”] and neighborhood_dim[“via_rank”].

Notes

Here neighborhood_dim={“rank”: 1, “via_rank”: -1} specifies the rank of the cells for which embeddings are computed: “rank”: 1 means that embeddings are computed for the cells of rank 1. The value of “via_rank” (here -1) is ignored when the input is a cell or simplicial complex; it must be specified when the input is a combinatorial complex or a colored hypergraph.

get_embedding(get_dict: bool = False) dict | ndarray[source]#

Get embedding.

Parameters:
get_dictbool, optional

Whether to return a dictionary. Defaults to False.

Returns:
dict or numpy.ndarray

Embedding.

Hope#

Higher Order Laplacian Positional Encoder (HOPE) class.

class topoembedx.classes.hope.HOPE(dimensions: int = 3)[source]#

Higher Order Laplacian Positional Encoder (HOPE) class.

Parameters:
dimensionsint, default=3

Dimensionality of embedding.

Methods

fit(complex[, neighborhood_type, ...])

Fit a Higher Order Laplacian Positional Encoder (HOPE) model.

get_embedding([get_dict])

Get embedding.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim: dict | None = None) None[source]#

Fit a Higher Order Laplacian Positional Encoder (HOPE) model.

Parameters:
complextoponetx.classes.Complex

A complex object of one of the following types: CellComplex, CombinatorialComplex, ColoredHyperGraph, SimplicialComplex, or PathComplex.

neighborhood_type{“adj”, “coadj”}, default=”adj”

The type of neighborhood to compute. “adj” for adjacency matrix, “coadj” for coadjacency matrix.

neighborhood_dimdict

The dimensions of the neighborhood to use. If neighborhood_type is “adj”, the dimension is neighborhood_dim[‘rank’]. If neighborhood_type is “coadj”, the dimension is neighborhood_dim[‘rank’], and neighborhood_dim[‘via_rank’] specifies the dimension of the ambient space.

Notes

Here, neighborhood_dim={“rank”: 1, “via_rank”: -1} specifies the dimension for which the cell embeddings are going to be computed. rank=1 means that the embeddings will be computed for the first dimension. The integer ‘via_rank’ is only considered when the input complex is a combinatorial complex; otherwise it is ignored.

Examples

>>> import toponetx as tnx
>>> from topoembedx import HOPE
>>> ccc = tnx.classes.CombinatorialComplex()
>>> ccc.add_cell([2, 5], rank=1)
>>> ccc.add_cell([2, 4], rank=1)
>>> ccc.add_cell([7, 8], rank=1)
>>> ccc.add_cell([6, 8], rank=1)
>>> ccc.add_cell([2, 4, 5], rank=3)
>>> ccc.add_cell([6, 7, 8], rank=3)
>>> model = HOPE()
>>> model.fit(
...     ccc,
...     neighborhood_type="adj",
...     neighborhood_dim={"rank": 0, "via_rank": 3},
... )
>>> em = model.get_embedding(get_dict=True)

get_embedding(get_dict: bool = False) dict | ndarray[source]#

Get embedding.

Parameters:
get_dictbool, optional

Whether to return a dictionary. Defaults to False.

Returns:
dict or numpy.ndarray

Embedding.

Random Walks#

Generate random walks on a graph or network.

To generate node embeddings using Word2Vec, you can first use the random_walk function to generate random walks on your complex. Then, you can use the generated random walks as input to the Word2Vec algorithm to learn node embeddings.

Examples#

Here is an example of how you could use the random_walk function and Word2Vec to generate cell embeddings:

>>> # Import the necessary modules
>>> from gensim.models import Word2Vec
>>> from topoembedx.classes.random_walks import random_walk
>>> # Generate random walks on your graph using the random_walk function
>>> random_walks = random_walk(
...     length=10, num_walks=100, states=nodes, transition_matrix=transition_matrix
... )
>>> # Train a skip-gram Word2Vec model on the walks (gensim >= 4.0 uses vector_size, formerly size)
>>> model = Word2Vec(random_walks, vector_size=128, window=5, min_count=0, sg=1)
>>> # Use the trained model to generate node embeddings
>>> cell_embeddings = model.wv

In the example above, nodes is a list of the nodes in your complex, transition_matrix is the transition matrix of your complex, and cell_embeddings is the trained model's keyed-vector lookup (model.wv), which maps each node in your complex to its corresponding embedding.

topoembedx.classes.random_walks.random_walk(length: int, num_walks: int, states: list[T], transition_matrix: ndarray) list[list[T]][source]#

Generate random walks on a graph or network.

This function generates random walks of a given length on a given complex. The length and number of walks can be specified, as well as the cells (states) and transition matrix.

Parameters:
lengthint

The length of each random walk.

num_walksint

The number of random walks to generate.

stateslist

The nodes in the complex.

transition_matrixnumpy.ndarray

The transition matrix of the graph or network.

Returns:
list of list

The generated random walks.
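
The sampling described above can be sketched with the standard library alone. This is an illustrative re-implementation under stated assumptions, not the function's actual code: the name random_walk_sketch and its seed argument are hypothetical, and the starting state of each walk is assumed to be drawn uniformly at random.

```python
import random


def random_walk_sketch(length, num_walks, states, transition_matrix, seed=None):
    """Sample walks from a row-stochastic matrix (hypothetical helper)."""
    rng = random.Random(seed)
    indices = range(len(states))
    walks = []
    for _ in range(num_walks):
        # Assumption: the starting state is drawn uniformly at random.
        current = rng.choice(indices)
        walk = [states[current]]
        while len(walk) < length:
            # Row `current` holds the probabilities of moving to each state.
            current = rng.choices(indices, weights=transition_matrix[current])[0]
            walk.append(states[current])
        walks.append(walk)
    return walks


transition_matrix = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
states = ["A", "B", "C", "D"]
walks = random_walk_sketch(3, 2, states, transition_matrix, seed=0)
```

Each walk contains exactly length states, and every consecutive pair of states has a nonzero entry in the transition matrix.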

Examples

>>> import numpy as np
>>> from topoembedx.classes.random_walks import random_walk
>>> transition_matrix = np.array(
...     [
...         [0.0, 1.0, 0.0, 0.0],
...         [0.5, 0.0, 0.5, 0.0],
...         [0.0, 0.5, 0.0, 0.5],
...         [0.0, 0.0, 1.0, 0.0],
...     ]
... )
>>> states = ["A", "B", "C", "D"]
>>> walks = random_walk(
...     length=3, num_walks=2, states=states, transition_matrix=transition_matrix
... )
>>> print(walks)
[['B', 'C', 'D'], ['B', 'C', 'B']]

topoembedx.classes.random_walks.transition_from_adjacency(A: ndarray, sub_sampling: float = 0.1, self_loop: bool = True) ndarray[source]#

Generate transition matrix from an adjacency matrix.

This function generates a transition matrix from an adjacency matrix using the following steps:

  1. Add self-loops to the adjacency matrix if self_loop is set to True

  2. Compute the degree matrix

  3. Compute the transition matrix by taking the dot product of the inverse of the degree matrix and the adjacency matrix
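
The three steps above can be sketched in plain Python as follows. This is an illustrative re-implementation, not the library's code: the helper name transition_sketch is hypothetical, it works on nested lists instead of NumPy arrays, it ignores sub_sampling, and leaving zero-degree rows as zeros is an assumption of the sketch.

```python
def transition_sketch(adjacency, self_loop=True):
    """Row-normalize an adjacency matrix given as a list of lists (hypothetical helper)."""
    n = len(adjacency)
    # Step 1: optionally add a self-loop to every node (A + I).
    a = [
        [adjacency[i][j] + (1 if self_loop and i == j else 0) for j in range(n)]
        for i in range(n)
    ]
    # Step 2: the degree of node i is the sum of row i.
    degrees = [sum(row) for row in a]
    # Step 3: D^{-1} A divides each row by its degree.
    # Rows with degree 0 are left as zeros (an assumption of this sketch).
    return [
        [value / degrees[i] if degrees[i] else 0.0 for value in row]
        for i, row in enumerate(a)
    ]


A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
P = transition_sketch(A)
```

For this 4-node matrix every row of P sums to 1, as a transition matrix should, and the last row is [0.0, 0.0, 0.5, 0.5].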

Parameters:
Anumpy.ndarray

The adjacency matrix.

sub_samplingfloat, default=0.1

The rate of subsampling.

self_loopbool, default=True

A flag indicating whether to add self-loops to the adjacency matrix.

Returns:
numpy.ndarray

The transition matrix.

Examples

>>> import numpy as np
>>> from topoembedx.classes.random_walks import transition_from_adjacency
>>> A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]])
>>> transition_from_adjacency(A)
array([[0.33333333, 0.33333333, 0.33333333, 0.        ],
       [0.33333333, 0.33333333, 0.33333333, 0.        ],
       [0.25      , 0.25      , 0.25      , 0.25      ],
       [0.        , 0.        , 0.5       , 0.5       ]])