🌐 TopoBenchmarkX (TBX) 🍩#
TopoBenchmarkX (TBX) is a modular Python library designed to standardize benchmarking and accelerate research in Topological Deep Learning (TDL). In particular, TBX lets users train Topological Neural Networks (TNNs) and compare their performance across topological domains, where a topological domain is a graph, a simplicial complex, a cellular complex, or a hypergraph.
📌 Overview#
The main pipeline trains and evaluates a wide range of state-of-the-art TNNs and Graph Neural Networks (GNNs) (see Neural Networks) on numerous and varied datasets and benchmark tasks (see Datasets).
Additionally, the library offers the ability to transform, i.e., lift, each dataset from one topological domain to another (see Liftings), enabling for the first time an exhaustive inter-domain comparison of TNNs.
⚙ Neural Networks#
We list the neural networks trained and evaluated by TopoBenchmarkX, organized by the topological domain over which they operate: graph, simplicial complex, cellular complex or hypergraph. Many of these neural networks were originally implemented in TopoModelX.
Graphs#
Simplicial complexes#
Cellular complexes#
Hypergraphs#
🚀 Liftings#
We list the liftings used in TopoBenchmarkX to transform datasets. Here, a lifting refers to a function that transforms a dataset defined on a topological domain (e.g., on a graph) into the same dataset but supported on a different topological domain (e.g., on a simplicial complex).
Graph2Simplicial#
| Name | Description | Reference |
|---|---|---|
| CliqueLifting | The algorithm finds the cliques in the graph and creates simplices from them. Given a clique, the first simplex added is the one containing all the nodes of the clique, then the simplices formed by all possible combinations with one node missing, then with two nodes missing, and so on, until all possible pairs are added. The method then moves to the next clique. | |
| KHopLifting | For each node in the graph, take the node itself together with the set of its neighbors within distance k. These sets are then treated as simplices. The dimension of each simplex depends on the degree of the nodes. For example, a node with d neighbors forms a d-simplex. | |
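The CliqueLifting description above can be sketched in plain Python. This is an illustrative sketch only, not the TopoBenchmarkX implementation: the function names `maximal_cliques` and `clique_lifting`, the adjacency-dictionary input, and the use of a basic Bron-Kerbosch recursion to find maximal cliques are all assumptions made for the example.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion.

    `adj` maps each node to the set of its neighbors (hypothetical input format).
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored.
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_lifting(adj):
    """Return the simplices of dimension >= 1, clique by clique."""
    simplices = []
    seen = set()
    for clique in maximal_cliques(adj):
        # Full clique first, then faces with one node missing, then two,
        # and so on down to the pairs (edges).
        for size in range(len(clique), 1, -1):
            for face in combinations(clique, size):
                if face not in seen:
                    seen.add(face)
                    simplices.append(face)
    return simplices


# Toy graph: a triangle 0-1-2 with a pendant edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
simplices = clique_lifting(adj)
```

On the toy graph, the triangle contributes one 2-simplex and its three edges, and the pendant edge contributes one 1-simplex. The nodes themselves (0-simplices) are inherited from the graph and are not listed by this sketch.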
Graph2Cell#
| Name | Description | Reference |
|---|---|---|
| CellCycleLifting | To lift a graph to a cell complex (CC), we proceed as follows. First, we identify a finite set of cycles (closed loops) within the graph. Second, each identified cycle is associated with a 2-cell whose boundary is that cycle. The nodes and edges of the cell complex are inherited from the graph. | |
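A minimal sketch of the two steps described for CellCycleLifting follows. The description does not say which finite set of cycles is used, so this example assumes the fundamental cycles of a BFS spanning forest (one cycle per non-tree edge) on a simple graph. The function name `cycle_lifting` and the returned dictionary layout are hypothetical and are not part of the library.

```python
from collections import deque


def cycle_lifting(nodes, edges):
    """Attach one 2-cell to each fundamental cycle of a BFS spanning forest."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Build a spanning forest and remember each node's parent.
    parent = {}
    tree_edges = set()
    for root in nodes:
        if root in parent:
            continue
        parent[root] = None
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in sorted(adj[u]):
                if v not in parent:
                    parent[v] = u
                    tree_edges.add(frozenset((u, v)))
                    queue.append(v)

    def path_to_root(v):
        path = [v]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    two_cells = []
    for u, v in edges:
        if frozenset((u, v)) in tree_edges:
            continue
        # A non-tree edge closes exactly one cycle with the two tree paths.
        pu, pv = path_to_root(u), path_to_root(v)
        # Trim the shared part of both paths above the lowest common ancestor.
        while len(pu) > 1 and len(pv) > 1 and pu[-2] == pv[-2]:
            pu.pop()
            pv.pop()
        # Walk u -> ancestor, then ancestor -> v; the edge (v, u) closes the loop.
        two_cells.append(pu + pv[-2::-1])

    # Nodes and edges are inherited from the graph unchanged.
    return {"0-cells": list(nodes), "1-cells": list(edges), "2-cells": two_cells}


# Toy graph: a square 0-1-2-3 with a tail 3-4.
lifted = cycle_lifting(
    [0, 1, 2, 3, 4],
    [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)],
)
```

The toy graph has a single independent cycle (the square), so the sketch yields one 2-cell whose boundary visits nodes 0, 1, 2, and 3. Other cycle choices, such as all induced cycles up to a maximum length, would produce a different complex.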
Graph2Hypergraph#
| Name | Description | Reference |
|---|---|---|
| KHopLifting | For each node in the graph, the algorithm finds the set of nodes that are at most k connections away from the initial node. This set is then used to create a hyperedge. The process is repeated for all nodes in the graph. | |
| KNearestNeighborsLifting | For each node in the graph, the method finds the k nearest nodes by using the Euclidean distance between the vectors of features. The set of k nodes found is considered as a hyperedge. The process is repeated for all nodes in the graph. | |
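Both hypergraph liftings above build one hyperedge per node, which the following sketch illustrates with the standard library only. The function names `khop_hyperedges` and `knn_hyperedges` and the input formats are hypothetical. The sketch also assumes that the k-hop hyperedge contains the starting node (distance 0) and that the k-nearest-neighbors hyperedge does not count the query node among its own neighbors; the library may make different choices.

```python
import math
from collections import deque


def khop_hyperedges(adj, k):
    """One hyperedge per node: all nodes at most k hops away (BFS)."""
    hyperedges = []
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue  # do not expand beyond k hops
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        hyperedges.append(frozenset(dist))
    return hyperedges


def knn_hyperedges(features, k):
    """One hyperedge per node: its k nearest nodes in Euclidean feature space."""
    hyperedges = []
    for i, x in enumerate(features):
        # Assumption: the query node is excluded from its own neighbor list.
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: math.dist(x, features[j]))
        hyperedges.append(frozenset(others[:k]))
    return hyperedges


# Toy path graph 0-1-2-3 with one scalar feature per node.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
features = [(0.0,), (1.0,), (3.0,), (7.0,)]

khop = khop_hyperedges(adj, k=1)
knn = knn_hyperedges(features, k=1)
```

Note that the two liftings use different information: the k-hop version depends only on graph connectivity, while the nearest-neighbor version depends only on node features and ignores the edges.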
📚 Datasets#
| Dataset | Task | Description | Reference |
|---|---|---|---|
| Cora | Classification | Cocitation dataset. | |
| Citeseer | Classification | Cocitation dataset. | |
| Pubmed | Classification | Cocitation dataset. | |
| MUTAG | Classification | Graph-level classification. | |
| PROTEINS | Classification | Graph-level classification. | |
| NCI1 | Classification | Graph-level classification. | |
| NCI109 | Classification | Graph-level classification. | |
| IMDB-BIN | Classification | Graph-level classification. | |
| IMDB-MUL | Classification | Graph-level classification. | |
| | Classification | Graph-level classification. | |
| Amazon | Classification | Heterophilic dataset. | |
| Minesweeper | Classification | Heterophilic dataset. | |
| Empire | Classification | Heterophilic dataset. | |
| Tolokers | Classification | Heterophilic dataset. | |
| US-county-demos | Regression | Each node attribute is used in turn as the target label. | |
| ZINC | Regression | Graph-level regression. | |
🔍 References#
To learn more about TopoBenchmarkX, we invite you to read the paper:
@misc{topobenchmarkx2024,
title={TopoBenchmarkX},
author={PyT-Team},
year={2024},
eprint={TBD},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
If you find TopoBenchmarkX useful, we would appreciate it if you cited us!
🦾 Getting Started#
Check out our tutorials to get started!