🌐 TopoBenchmarkX (TBX) 🍩

TopoBenchmarkX (TBX) is a modular Python library designed to standardize benchmarking and accelerate research in Topological Deep Learning (TDL). In particular, TBX lets users train and compare the performance of a wide range of Topological Neural Networks (TNNs) across different topological domains, where a topological domain is a graph, a simplicial complex, a cellular complex, or a hypergraph.

📌 Overview

The main pipeline trains and evaluates a wide range of state-of-the-art TNNs and Graph Neural Networks (GNNs) (see Neural Networks) on numerous and varied datasets and benchmark tasks (see Datasets).

Additionally, the library offers the ability to transform, i.e., lift, each dataset from one topological domain to another (see Liftings), enabling for the first time an exhaustive inter-domain comparison of TNNs.

⚙ Neural Networks

We list the neural networks trained and evaluated by TopoBenchmarkX, organized by the topological domain over which they operate: graph, simplicial complex, cellular complex or hypergraph. Many of these neural networks were originally implemented in TopoModelX.

Graphs

Model	Reference
GAT	Graph Attention Networks
GIN	How Powerful are Graph Neural Networks?
GCN	Semi-Supervised Classification with Graph Convolutional Networks

Simplicial complexes

Model	Reference
SAN	Simplicial Attention Neural Networks
SCCN	Efficient Representation Learning for Higher-Order Data with Simplicial Complexes
SCCNN	Convolutional Learning on Simplicial Complexes
SCN	Simplicial Complex Neural Networks

Cellular complexes

Model	Reference
CAN	Cell Attention Network
CCCN	A learning algorithm for computational connected cellular network
CXN	Cell Complex Neural Networks
CWN	Weisfeiler and Lehman Go Cellular: CW Networks

Hypergraphs

Model	Reference
AllDeepSet	You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks
AllSetTransformer	You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks
EDGNN	Equivariant Hypergraph Diffusion Neural Operators
UniGNN	UniGNN: a Unified Framework for Graph and Hypergraph Neural Networks
UniGNN2	UniGNN: a Unified Framework for Graph and Hypergraph Neural Networks

🚀 Liftings

We list the liftings used in TopoBenchmarkX to transform datasets. Here, a lifting refers to a function that transforms a dataset defined on a topological domain (e.g., on a graph) into the same dataset but supported on a different topological domain (e.g., on a simplicial complex).

Graph2Simplicial

Name	Description	Reference
CliqueLifting	The algorithm finds the cliques in the graph and creates simplices from them. For each clique, it first adds the simplex containing all of the clique's nodes, then the simplices formed by every combination with one node missing, then two nodes missing, and so on, until all possible pairs have been added. It then moves on to the next clique.	Simplicial Complexes
KHopLifting	For each node in the graph, take the node itself together with all of its neighbors within distance k. Each such set is treated as a simplex. The dimension of each simplex depends on the degree of the node: for example, a node with d neighbors forms a d-simplex.	Neighborhood Complexes
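
The CliqueLifting description can be illustrated with a small standard-library sketch. It is not the TopoBenchmarkX implementation: the function names `maximal_cliques` and `clique_lifting`, the `max_dim` cap, and the choice to also record single nodes as 0-simplices are assumptions made for illustration.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)  # nothing can extend it: maximal
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_lifting(edges, max_dim=2):
    """Return the simplices (sorted tuples) generated by the graph's cliques.

    `max_dim` is a hypothetical cap on simplex dimension, added here to keep
    the number of faces manageable on larger cliques.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    simplices = set()
    for clique in maximal_cliques(adj):
        nodes = sorted(clique)
        largest = min(len(nodes), max_dim + 1)
        # Largest face first, then faces with one node missing, and so on.
        for size in range(largest, 0, -1):
            simplices.update(combinations(nodes, size))
    return simplices


# A triangle (0, 1, 2) with a pendant edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
simplices = clique_lifting(edges)
```

On this toy graph the triangle becomes a 2-simplex, while the pendant edge contributes only a 1-simplex and its endpoints.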

Graph2Cell

Name	Description	Reference
CellCycleLifting	To lift a graph to a cell complex (CC), we proceed as follows. First, we identify a finite set of cycles (closed loops) within the graph. Second, each identified cycle is associated with a 2-cell whose boundary is that cycle. The nodes and edges of the cell complex are inherited from the graph.	Appendix B
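
The two steps of the cycle lifting can be sketched with the standard library alone. The description does not say which finite set of cycles is chosen, so this sketch assumes the fundamental cycles of a breadth-first spanning forest; the function names and the dictionary layout of the result are hypothetical, not the library's API.

```python
from collections import deque


def fundamental_cycles(edges):
    """Return one cycle (as a node list) per non-tree edge of a BFS forest."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Build a spanning forest, remembering each node's parent.
    parent, tree_edges = {}, set()
    for root in adj:
        if root in parent:
            continue
        parent[root] = None
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for nbr in sorted(adj[node]):
                if nbr not in parent:
                    parent[nbr] = node
                    tree_edges.add(frozenset((node, nbr)))
                    queue.append(nbr)

    def path_to_root(node):
        path = []
        while node is not None:
            path.append(node)
            node = parent[node]
        return path

    cycles, seen = [], set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in tree_edges or key in seen:
            continue
        seen.add(key)
        pu, pv = path_to_root(u), path_to_root(v)
        # Drop the shared part of both paths above the lowest common ancestor.
        while len(pu) > 1 and len(pv) > 1 and pu[-2] == pv[-2]:
            pu.pop()
            pv.pop()
        # u ... ancestor ... v; the non-tree edge (v, u) closes the loop.
        cycles.append(pu + pv[-2::-1])
    return cycles


def cycle_lifting(edges):
    """Nodes and edges are inherited; each cycle bounds one 2-cell."""
    nodes = sorted({n for edge in edges for n in edge})
    return {"0-cells": nodes, "1-cells": list(edges), "2-cells": fundamental_cycles(edges)}


# A square 0-1-2-3 with a tail edge (3, 4).
square_with_tail = [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)]
cell_complex = cycle_lifting(square_with_tail)
```

Here the square yields a single 2-cell, and the tail edge stays a plain 1-cell because it lies on no cycle.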

Graph2Hypergraph

Name	Description	Reference
KHopLifting	For each node in the graph, the algorithm finds the set of nodes that are at most k connections away from the initial node. This set is then used to create a hyperedge. The process is repeated for all nodes in the graph.	Section 3.4
KNearestNeighborsLifting	For each node in the graph, the method finds the k nearest nodes using the Euclidean distance between feature vectors. The resulting set of k nodes is treated as a hyperedge. The process is repeated for all nodes in the graph.	Section 3.1
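
Both hypergraph liftings build one hyperedge per node, which a short standard-library sketch can make concrete. The function names, the dictionary-based inputs, and the tie-breaking by node id are assumptions for illustration. The sketch also assumes that the k-hop hyperedge contains its source node and that the k-nearest-neighbor hyperedge does not; the descriptions above do not settle the latter point.

```python
from collections import deque
from math import dist


def khop_hyperedges(edges, k=1):
    """Map each node to the set of nodes at most k hops away (itself included)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    hyperedges = {}
    for source in adj:
        hops = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if hops[node] == k:
                continue  # do not expand beyond k hops
            for nbr in adj[node]:
                if nbr not in hops:
                    hops[nbr] = hops[node] + 1
                    queue.append(nbr)
        hyperedges[source] = frozenset(hops)
    return hyperedges


def knn_hyperedges(features, k=2):
    """Map each node to its k nearest other nodes in Euclidean feature space.

    `features` is a dict from node id to a sequence of floats.
    """
    hyperedges = {}
    for node, vec in features.items():
        others = sorted(
            (n for n in features if n != node),
            key=lambda n: (dist(vec, features[n]), n),  # ties broken by node id
        )
        hyperedges[node] = frozenset(others[:k])
    return hyperedges


# A path graph 0-1-2-3 and 2-D features forming two well-separated pairs.
path_edges = [(0, 1), (1, 2), (2, 3)]
features = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (5.0, 0.0), 3: (6.0, 0.0)}

one_hop = khop_hyperedges(path_edges, k=1)
two_hop = khop_hyperedges(path_edges, k=2)
nearest = knn_hyperedges(features, k=1)
```

Note that the k-hop lifting depends only on connectivity, whereas the nearest-neighbor lifting depends only on node features, so the two can produce very different hypergraphs from the same dataset.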

📚 Datasets

Dataset	Task	Description	Reference
Cora	Classification	Cocitation dataset.	Source
Citeseer	Classification	Cocitation dataset.	Source
Pubmed	Classification	Cocitation dataset.	Source
MUTAG	Classification	Graph-level classification.	Source
PROTEINS	Classification	Graph-level classification.	Source
NCI1	Classification	Graph-level classification.	Source
NCI109	Classification	Graph-level classification.	Source
IMDB-BIN	Classification	Graph-level classification.	Source
IMDB-MUL	Classification	Graph-level classification.	Source
REDDIT	Classification	Graph-level classification.	Source
Amazon	Classification	Heterophilic dataset.	Source
Minesweeper	Classification	Heterophilic dataset.	Source
Empire	Classification	Heterophilic dataset.	Source
Tolokers	Classification	Heterophilic dataset.	Source
US-county-demos	Regression	Each node attribute is used in turn as the target label.	Source
ZINC	Regression	Graph-level regression.	Source

🔍 References

To learn more about TopoBenchmarkX, we invite you to read the paper:

@misc{topobenchmarkx2024,
        title={TopoBenchmarkX},
        author={PyT-Team},
        year={2024},
        eprint={TBD},
        archivePrefix={arXiv},
        primaryClass={cs.LG}
}

If you find TopoBenchmarkX useful, we would appreciate it if you cited us!

🦾 Getting Started

Check out our tutorials to get started!