Base#

Cell Diff2Vec#

Class CellDiff2Vec.

class topoembedx.classes.cell_diff2vec.CellDiff2Vec(diffusion_number: int = 10, diffusion_cover: int = 80, dimensions: int = 128, workers: int = 4, window_size: int = 5, epochs: int = 1, use_hierarchical_softmax: bool = True, number_of_negative_samples: int = 5, learning_rate: float = 0.05, min_count: int = 1, seed: int = 42)

Class for CellDiff2Vec.

Parameters:

diffusion_number (int, default=10): Number of diffusions.
diffusion_cover (int, default=80): Number of nodes in diffusion.
dimensions (int, default=128): Dimensionality of embedding.
workers (int, default=4): Number of cores.
window_size (int, default=5): Matrix power order.
epochs (int, default=1): Number of epochs.
learning_rate (float, default=0.05): HogWild! learning rate.
min_count (int, default=1): Minimal count of node occurrences.
seed (int, default=42): Random seed value.
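To make diffusion_number and diffusion_cover more concrete, the following is a minimal sketch of a Diff2Vec-style diffusion on a toy graph. It is an illustration under assumptions, not the TopoEmbedX implementation: the helper name sample_diffusion, the toy adjacency dictionary, and the choice to run one diffusion per start node are all hypothetical. How a sampled set is turned into a training sequence is not covered here.

```python
import random
from collections import deque


def sample_diffusion(adjacency, start, cover, rng):
    """Grow a connected set of at most `cover` nodes by random infection.

    Hypothetical helper for illustration only.
    """
    # Cap the target by the size of the start node's connected component,
    # otherwise the infection loop below could never terminate.
    reachable = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in reachable:
                reachable.add(neighbor)
                queue.append(neighbor)
    target = min(cover, len(reachable))

    infected = [start]
    infected_set = {start}
    tree_edges = []
    while len(infected) < target:
        source = rng.choice(infected)  # pick an already infected node
        candidate = rng.choice(adjacency[source])  # try to infect a neighbor
        if candidate not in infected_set:
            infected.append(candidate)
            infected_set.add(candidate)
            tree_edges.append((source, candidate))
    return infected, tree_edges


# Toy graph (assumed for the example): path 0-1-2-3 with a branch 1-4.
adjacency = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
rng = random.Random(42)

# One diffusion per start node, each covering 4 nodes.
diffusions = [
    sample_diffusion(adjacency, node, cover=4, rng=rng) for node in adjacency
]
```

In this sketch, the cover argument plays the role of diffusion_cover (how many nodes each diffusion reaches), and repeating the call several times per start node would correspond to diffusion_number.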

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) → None

Fit a CellDiff2Vec model.

Parameters:

complex (toponetx.classes.Complex): A complex object, one of CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.
neighborhood_type ({"adj", "coadj"}, default="adj"): The type of neighborhood to compute: "adj" for the adjacency matrix, "coadj" for the coadjacency matrix.
neighborhood_dim (dict): The integer parameters that specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters.
- For Cell/Simplicial/Path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim["rank"].
- For Combinatorial/ColoredHyperGraph complexes, the (co)adjacency matrix is specified by two parameters, neighborhood_dim["rank"] and neighborhood_dim["via_rank"].

Notes

Here neighborhood_dim={"rank": 1, "via_rank": -1} specifies the dimension for which the cell embeddings are computed. "rank": 1 means that the embeddings are computed for the cells of rank 1. The integer "via_rank": -1 is ignored when the input is a cell or simplicial complex, and must be specified when the input is a combinatorial complex or a colored hypergraph.
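As an illustration of what a rank-1 neighborhood looks like, the sketch below builds both neighborhood types for the edges of a tiny simplicial complex using plain Python. It assumes the common convention that two rank-1 cells are adjacent when they lie in a common rank-2 cell, and coadjacent when they share a rank-0 cell. TopoNetX computes these neighborhoods as matrices; the toy data and the helper names here are hypothetical.

```python
from itertools import combinations

# Toy complex (assumed): two triangles glued along edge (1, 2),
# plus a dangling edge (3, 4) that belongs to no triangle.
triangles = [(0, 1, 2), (1, 2, 3)]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]


def is_adjacent(edge_a, edge_b):
    """Adjacency: both edges are faces of a common triangle."""
    return any(
        set(edge_a) <= set(tri) and set(edge_b) <= set(tri) for tri in triangles
    )


def is_coadjacent(edge_a, edge_b):
    """Coadjacency: the two edges share a vertex."""
    return bool(set(edge_a) & set(edge_b))


adj = {edge: [] for edge in edges}
coadj = {edge: [] for edge in edges}
for edge_a, edge_b in combinations(edges, 2):
    if is_adjacent(edge_a, edge_b):
        adj[edge_a].append(edge_b)
        adj[edge_b].append(edge_a)
    if is_coadjacent(edge_a, edge_b):
        coadj[edge_a].append(edge_b)
        coadj[edge_b].append(edge_a)

print(adj[(3, 4)])    # [] : the dangling edge lies in no triangle
print(coadj[(3, 4)])  # [(1, 3), (2, 3)] : edges sharing vertex 3
```

The two neighborhood types can give very different graphs for the same rank, which is why neighborhood_type and neighborhood_dim are chosen together.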

get_embedding(get_dict: bool = False) → dict | ndarray

Get embedding.

Parameters:

get_dict (bool, optional): Whether to return a dictionary. Defaults to False.

Returns:

dict or numpy.ndarray: Embedding.
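The two return shapes can be related as in the sketch below. It assumes that the dictionary form keys each embedding vector by its cell, in the same order as the rows of the array form; the cell labels and vectors are made-up values for illustration, and plain lists stand in for the array.

```python
# Hypothetical cells of the chosen rank and their embedding rows.
cells = [(0, 1), (0, 2), (1, 2)]
embedding_rows = [[0.1, 0.3], [0.2, -0.1], [0.0, 0.5]]

# Array-like form: one row per cell, looked up by position.
row_for_second_cell = embedding_rows[1]

# Dictionary form: the same rows, looked up by cell.
as_dict = dict(zip(cells, embedding_rows))
vector_for_cell = as_dict[(0, 2)]
```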

Cell2Vec#

Cell2Vec: a class that extends the Node2Vec class.

class topoembedx.classes.cell2vec.Cell2Vec(walk_number: int = 10, walk_length: int = 80, p: float = 1.0, q: float = 1.0, dimensions: int = 128, workers: int = 4, window_size: int = 5, epochs: int = 1, use_hierarchical_softmax: bool = True, number_of_negative_samples: int = 5, learning_rate: float = 0.05, min_count: int = 1, seed: int = 42)

Class Cell2Vec.

Cell2Vec extends the Node2Vec class to generate embeddings for simplicial, cell, combinatorial, or dynamic combinatorial complexes. It takes such a complex as input, builds a networkx graph from the adjacency or coadjacency matrix of the complex, and runs the Node2Vec algorithm on that graph to produce the embeddings. Users choose the type of matrix ("adj" for the adjacency matrix or "coadj" for the coadjacency matrix) and the dimensions of the neighborhood used to build it (e.g. the "rank" and "via_rank" values). They can also set the dimensionality of the embeddings, the length and number of random walks used by the node2vec algorithm, and the number of workers used for parallelization.

Parameters:

walk_number (int, default=10): Number of random walks to start at each node.
walk_length (int, default=80): Length of random walks.
p (float, default=1.0): Return parameter (1/p transition probability) to move back towards the previous node.
q (float, default=1.0): In-out parameter (1/q transition probability) to move away from the previous node.
dimensions (int, default=128): Dimensionality of embedding.
workers (int, default=4): Number of cores.
window_size (int, default=5): Matrix power order.
epochs (int, default=1): Number of epochs.
learning_rate (float, default=0.05): HogWild! learning rate.
min_count (int, default=1): Minimal count of node occurrences.
seed (int, default=42): Random seed value.
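The effect of p and q is easiest to see in a single biased walk. The sketch below follows the standard node2vec weighting (1/p to step back, 1 to stay near the previous node, 1/q to move away) on a toy graph. It is an illustration only: biased_walk and the adjacency dictionary are hypothetical names, and this is not the code that Cell2Vec runs internally.

```python
import random


def biased_walk(adjacency, start, walk_length, p, q, rng):
    """Sample one second-order random walk with node2vec-style weights."""
    walk = [start]
    while len(walk) < walk_length:
        current = walk[-1]
        neighbors = adjacency[current]
        if not neighbors:
            break  # dead end: stop early
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # return to the previous node
            elif candidate in adjacency[previous]:
                weights.append(1.0)  # stay at distance 1 from the previous node
            else:
                weights.append(1.0 / q)  # move away from the previous node
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Toy graph (assumed): a triangle a-b-c with a pendant node d attached to b.
adjacency = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}
rng = random.Random(0)

walk_number = 2  # walks started at each node
walks = [
    biased_walk(adjacency, node, walk_length=5, p=0.5, q=2.0, rng=rng)
    for node in adjacency
    for _ in range(walk_number)
]
```

With p below 1 the walk tends to backtrack and stay local, while q below 1 pushes it outward; p = q = 1 reduces to an unbiased walk.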

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) → None

Fit a Cell2Vec model.

Parameters:

complex (toponetx.classes.Complex): A complex object, one of CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.
neighborhood_type ({"adj", "coadj"}, default="adj"): The type of neighborhood to compute: "adj" for the adjacency matrix, "coadj" for the coadjacency matrix.
neighborhood_dim (dict): The integer parameters that specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters.
- For Cell/Simplicial/Path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim["rank"].
- For Combinatorial/ColoredHyperGraph complexes, the (co)adjacency matrix is specified by two parameters, neighborhood_dim["rank"] and neighborhood_dim["via_rank"].

get_embedding(get_dict: bool = False) → dict | ndarray

Get embedding.

Parameters:

get_dict (bool, optional): Whether to return a dictionary. Defaults to False.

Returns:

dict or numpy.ndarray: Embedding.

Deep Cell#

DeepCell class for embedding complex networks using DeepWalk.

class topoembedx.classes.deepcell.DeepCell(walk_number: int = 10, walk_length: int = 80, dimensions: int = 128, workers: int = 4, window_size: int = 5, epochs: int = 1, use_hierarchical_softmax: bool = True, number_of_negative_samples: int = 5, learning_rate: float = 0.05, min_count: int = 1, seed: int = 42)

Class for DeepCell.

Parameters:

walk_number (int, default=10): Number of random walks to generate for each node.
walk_length (int, default=80): Length of each random walk.
dimensions (int, default=128): Dimensionality of embedding.
workers (int, default=4): Number of parallel workers to use for training.
window_size (int, default=5): Size of the sliding window.
epochs (int, default=1): Number of iterations (epochs).
learning_rate (float, default=0.05): Learning rate for the model.
min_count (int, default=1): Minimum count of words to consider when training the model.
seed (int, default=42): Random seed to use for reproducibility.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) → None

Fit the model.

Parameters:

complex (toponetx.classes.Complex): A complex object, one of CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.
neighborhood_type ({"adj", "coadj"}, default="adj"): The type of neighborhood to compute: "adj" for the adjacency matrix, "coadj" for the coadjacency matrix.
neighborhood_dim (dict): The integer parameters that specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters.
- For Cell/Simplicial/Path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim["rank"].
- For Combinatorial/ColoredHyperGraph complexes, the (co)adjacency matrix is specified by two parameters, neighborhood_dim["rank"] and neighborhood_dim["via_rank"].

get_embedding(get_dict: bool = False) → dict | ndarray

Get embeddings.

Parameters:

get_dict (bool, default=False): Whether to return a dictionary of the embedding.

Returns:

dict or np.ndarray: The embedding of the complex.

Higher Order Laplacian Eigenmaps#

Higher Order Laplacian Eigenmaps.

class topoembedx.classes.higher_order_laplacian_eigenmaps.HigherOrderLaplacianEigenmaps(dimensions: int = 3, maximum_number_of_iterations: int = 100, seed: int = 42)

Class for Higher Order Laplacian Eigenmaps.

Parameters:

dimensions (int, default=3): Dimensionality of embedding.
maximum_number_of_iterations (int, default=100): Maximum number of iterations.
seed (int, default=42): Random seed value.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) → None

Fit a Higher Order Laplacian Eigenmaps model.

Parameters:

complex (toponetx.classes.Complex): A complex object, one of CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.
neighborhood_type ({"adj", "coadj"}, default="adj"): The type of neighborhood to compute: "adj" for the adjacency matrix, "coadj" for the coadjacency matrix.
neighborhood_dim (dict): The integer parameters that specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters.
- For Cell/Simplicial/Path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim["rank"].
- For Combinatorial/ColoredHyperGraph complexes, the (co)adjacency matrix is specified by two parameters, neighborhood_dim["rank"] and neighborhood_dim["via_rank"].

get_embedding(get_dict: bool = False) → dict | ndarray

Get embeddings.

Parameters:

get_dict (bool, default=False): Whether to return a dictionary of the embedding.

Returns:

dict or np.ndarray: The embedding of the complex.

Hoglee#

Higher Order Geometric Laplacian EigenMaps (HOGLEE) class.

class topoembedx.classes.hoglee.HOGLEE(dimensions: int = 128, seed: int = 42)

Class for Higher Order Geometric Laplacian EigenMaps (HOGLEE).

Parameters:

dimensions (int, optional): Dimensionality of embedding. Defaults to 128.
seed (int, optional): Random seed value. Defaults to 42.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim=None) → None

Fit a Higher Order Geometric Laplacian EigenMaps model.

Parameters:

complex (toponetx.classes.Complex): A complex object, one of CellComplex, CombinatorialComplex, PathComplex, SimplicialComplex, or ColoredHyperGraph.
neighborhood_type ({"adj", "coadj"}, default="adj"): The type of neighborhood to compute: "adj" for the adjacency matrix, "coadj" for the coadjacency matrix.
neighborhood_dim (dict): The integer parameters that specify the neighborhood of the cells for which the embedding is generated. In TopoNetX, (co)adjacency neighborhood matrices are specified via one or two parameters.
- For Cell/Simplicial/Path complexes, the (co)adjacency matrix is specified by a single parameter, neighborhood_dim["rank"].
- For Combinatorial/ColoredHyperGraph complexes, the (co)adjacency matrix is specified by two parameters, neighborhood_dim["rank"] and neighborhood_dim["via_rank"].

get_embedding(get_dict: bool = False) → dict | ndarray

Get embedding.

Parameters:

get_dict (bool, optional): Whether to return a dictionary. Defaults to False.

Returns:

dict or numpy.ndarray: Embedding.

Hope#

Higher Order Laplacian Positional Encoder (HOPE) class.

class topoembedx.classes.hope.HOPE(dimensions: int = 3)

Higher Order Laplacian Positional Encoder (HOPE) class.

Parameters:

dimensions (int, default=3): Dimensionality of embedding.

fit(complex: Complex, neighborhood_type: Literal['adj', 'coadj'] = 'adj', neighborhood_dim: dict | None = None) → None

Fit a Higher Order Laplacian Positional Encoder model.

Parameters:

complex (toponetx.classes.Complex): A complex object, one of CellComplex, CombinatorialComplex, ColoredHyperGraph, SimplicialComplex, or PathComplex.
neighborhood_type ({"adj", "coadj"}, default="adj"): The type of neighborhood to compute: "adj" for the adjacency matrix, "coadj" for the coadjacency matrix.
neighborhood_dim (dict): The dimensions of the neighborhood to use. If neighborhood_type is "adj", the dimension is neighborhood_dim["rank"]. If neighborhood_type is "coadj", the dimension is neighborhood_dim["rank"], and neighborhood_dim["via_rank"] specifies the dimension of the ambient space.

Notes

Here, neighborhood_dim={"rank": 1, "via_rank": -1} specifies the dimension for which the cell embeddings are computed. "rank": 1 means that the embeddings are computed for the cells of rank 1. The integer "via_rank" is only considered when the input complex is a combinatorial complex and is ignored otherwise.

Examples

>>> import toponetx as tnx
>>> from topoembedx import HOPE
>>> ccc = tnx.classes.CombinatorialComplex()
>>> ccc.add_cell([2, 5], rank=1)
>>> ccc.add_cell([2, 4], rank=1)
>>> ccc.add_cell([7, 8], rank=1)
>>> ccc.add_cell([6, 8], rank=1)
>>> ccc.add_cell([2, 4, 5], rank=3)
>>> ccc.add_cell([6, 7, 8], rank=3)

>>> model = HOPE()
>>> model.fit(
...     ccc,
...     neighborhood_type="adj",
...     neighborhood_dim={"rank": 0, "via_rank": 3},
... )
>>> em = model.get_embedding(get_dict=True)

get_embedding(get_dict: bool = False) → dict | ndarray

Get embedding.

Parameters:

get_dict (bool, optional): Whether to return a dictionary. Defaults to False.

Returns:

dict or numpy.ndarray: Embedding.

Random Walks#

Generate random walks on a graph or network.

To generate node embeddings using Word2Vec, you can first use the random_walk function to generate random walks on your complex. Then, you can use the generated random walks as input to the Word2Vec algorithm to learn node embeddings.

Examples#

Here is an example of how you could use the random_walk function and Word2Vec to generate cell embeddings:

>>> # Import the necessary modules
>>> from gensim.models import Word2Vec
>>> from topoembedx.classes.random_walks import random_walk
>>> # Generate random walks on your complex using the random_walk function
>>> random_walks = random_walk(length=10, num_walks=100, states=nodes, transition_matrix=transition_matrix)
>>> # Train a Word2Vec model on the generated random walks
>>> # (gensim >= 4.0 names this parameter vector_size; older releases called it size)
>>> model = Word2Vec(random_walks, vector_size=128, window=5, min_count=0, sg=1)
>>> # Use the trained model to look up node embeddings
>>> cell_embeddings = model.wv

In the example above, nodes is a list of the nodes in your complex, transition_matrix is the transition matrix of your complex, and cell_embeddings is the keyed-vector store of the trained model, which maps each node in your complex to its corresponding embedding (for example, cell_embeddings[node]).

topoembedx.classes.random_walks.random_walk(length: int, num_walks: int, states: list[T], transition_matrix: ndarray) → list[list[T]]

Generate random walks on a graph or network.

This function generates random walks of a given length on a given complex. The length and number of walks can be specified, as well as the cells (states) and transition matrix.

Parameters:

length (int): The length of each random walk.
num_walks (int): The number of random walks to generate.
states (list): The nodes in the complex.
transition_matrix (numpy.ndarray): The transition matrix of the graph or network.

Returns:

list of list: The generated random walks.

Examples

>>> import numpy as np
>>> from topoembedx.classes.random_walks import random_walk
>>> transition_matrix = np.array(
...     [
...         [0.0, 1.0, 0.0, 0.0],
...         [0.5, 0.0, 0.5, 0.0],
...         [0.0, 0.5, 0.0, 0.5],
...         [0.0, 0.0, 1.0, 0.0],
...     ]
... )
>>> states = ["A", "B", "C", "D"]
>>> walks = random_walk(
...     length=3, num_walks=2, states=states, transition_matrix=transition_matrix
... )
>>> print(walks)
[['B', 'C', 'D'], ['B', 'C', 'B']]
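To show how a transition matrix drives a walk, the sketch below samples walks with only the standard library. It is a conceptual illustration, not the library's implementation: sample_walks is a hypothetical name, and starting each walk at a uniformly random state is an assumption of this sketch.

```python
import random


def sample_walks(length, num_walks, states, transition_matrix, rng):
    """Sample `num_walks` walks of `length` states each (hypothetical helper)."""
    indices = range(len(states))
    walks = []
    for _ in range(num_walks):
        # Assumption: the starting state is chosen uniformly at random.
        index = rng.randrange(len(states))
        walk = [states[index]]
        for _ in range(length - 1):
            # Row `index` of the matrix holds the probabilities of the next state.
            index = rng.choices(indices, weights=transition_matrix[index], k=1)[0]
            walk.append(states[index])
        walks.append(walk)
    return walks


transition_matrix = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
states = ["A", "B", "C", "D"]
walks = sample_walks(
    length=3,
    num_walks=2,
    states=states,
    transition_matrix=transition_matrix,
    rng=random.Random(7),
)
```

Every consecutive pair of states in a sampled walk corresponds to a nonzero entry of the transition matrix, so a walk can never jump between states that are not connected.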

topoembedx.classes.random_walks.transition_from_adjacency(A: ndarray, sub_sampling: float = 0.1, self_loop: bool = True) → ndarray

Generate transition matrix from an adjacency matrix.

This function generates a transition matrix from an adjacency matrix using the following steps:

1. Add self-loops to the adjacency matrix if self_loop is set to True.
2. Compute the degree matrix.
3. Compute the transition matrix by taking the dot product of the inverse of the degree matrix and the adjacency matrix.
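The steps above can be sketched in plain Python as follows. This is an illustration of the arithmetic only, using nested lists instead of NumPy arrays; transition_sketch is a hypothetical name, and the sub_sampling parameter is deliberately left out because the steps above do not describe how it is applied.

```python
def transition_sketch(adjacency, self_loop=True):
    """Row-normalize an adjacency matrix given as nested lists."""
    rows = [[float(value) for value in row] for row in adjacency]

    # Step 1: add self-loops on the diagonal.
    if self_loop:
        for i in range(len(rows)):
            rows[i][i] += 1.0

    transition = []
    for row in rows:
        # Step 2: the degree of a node is the sum of its row.
        degree = sum(row)
        # Step 3: multiplying by the inverse degree matrix divides each row
        # by its degree (rows with zero degree are left as zeros).
        transition.append([value / degree if degree else 0.0 for value in row])
    return transition


A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
T = transition_sketch(A)
print(T[2])  # [0.25, 0.25, 0.25, 0.25]
print(T[3])  # [0.0, 0.0, 0.5, 0.5]
```

Because the degree matrix is diagonal, its inverse simply rescales each row, so every row of the result sums to 1 and can be read as a probability distribution over the next state.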

Parameters:

A (numpy.ndarray): The adjacency matrix.
sub_sampling (float, default=0.1): The rate of subsampling.
self_loop (bool, default=True): A flag indicating whether to add self-loops to the adjacency matrix.

Returns:

numpy.ndarray: The transition matrix.

Examples

>>> import numpy as np
>>> from topoembedx.classes.random_walks import transition_from_adjacency
>>> A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]])
>>> transition_from_adjacency(A)
array([[0.33333333, 0.33333333, 0.33333333, 0.        ],
       [0.33333333, 0.33333333, 0.33333333, 0.        ],
       [0.25      , 0.25      , 0.25      , 0.25      ],
       [0.        , 0.        , 0.5       , 0.5       ]])