Friday, September 13, 2024

# How Google’s GraphWorld Solves Bottlenecks in Graph Neural Network Benchmarking

GraphWorld can operate on a single machine without any extra hardware, or it can easily be scaled out to run on any cluster or cloud framework.

Graph neural networks (GNNs) are a family of deep learning models that operate on graphs: data structures made up of vertices (also called nodes) and edges. Edges can be either directed or undirected, depending on whether the relationship between two vertices has a direction. GNNs can be seen as an extension of Convolutional Neural Networks (CNNs) to graph data, which CNNs cannot handle directly: the nodes of a graph have no inherent order, and the dependence between two nodes is encoded by an edge rather than by spatial position.
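The distinction between directed and undirected edges can be made concrete with a minimal adjacency-list sketch; the node names and edges below are purely illustrative.

```python
# A minimal sketch of a graph as an adjacency list (node -> set of neighbors).
# Node labels "A", "B", "C" are made up for illustration.

def add_edge(graph, u, v, directed=False):
    """Add an edge u -> v; for an undirected graph also add v -> u."""
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set())  # ensure v exists even with no out-edges
    if not directed:
        graph[v].add(u)

# Undirected graph: the edge A-B implies both A -> B and B -> A.
undirected = {}
add_edge(undirected, "A", "B")
add_edge(undirected, "B", "C")

# Directed graph: A -> B does not imply B -> A.
directed = {}
add_edge(directed, "A", "B", directed=True)
```

A GNN layer typically aggregates information for each node from exactly these neighbor sets, which is why edge direction matters to the model.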

Neural networks are critical to machine learning because they can capture patterns in data where conventional model training fails. However, this technology generally caters to Euclidean data, i.e., data that lives on a regular one- or two-dimensional grid, such as audio, images, and text.

Graph Neural Networks are distinct from traditional machine learning models in that they analyze graph-structured data rather than Euclidean data with a grid-like structure. Graph-structured data is a prime example of non-Euclidean data, ranging from complex molecular structures to traffic networks that require representation beyond a flat grid. If this form of data is forced into a Euclidean framework, valuable information can be lost: a traditional neural network does not usually take into account the features of each vertex or edge. Machine learning algorithms that expect input in grid-like or rectangular arrays are therefore limited when analyzing graphs in non-Euclidean space, where nodes and edges cannot be represented by coordinates. Moreover, when a graph is converted into an adjacency matrix, the same graph can yield matrices with a broad range of appearances, one for each ordering of its nodes.
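The last point, that one graph corresponds to many adjacency matrices, is easy to demonstrate. The sketch below (with made-up node labels) builds the same three-node path graph under two node orderings and gets two different matrices.

```python
import numpy as np

# The same undirected path graph a - b - c written under two different
# node orderings. Labels "a", "b", "c" are hypothetical; the point is
# that one graph yields many different-looking adjacency matrices.

edges = [("a", "b"), ("b", "c")]

def adjacency(edges, order):
    """Adjacency matrix of an undirected graph under a given node order."""
    idx = {node: i for i, node in enumerate(order)}
    A = np.zeros((len(order), len(order)), dtype=int)
    for u, v in edges:
        A[idx[u], idx[v]] = 1
        A[idx[v], idx[u]] = 1
    return A

A1 = adjacency(edges, ["a", "b", "c"])  # one node ordering
A2 = adjacency(edges, ["b", "a", "c"])  # another ordering, same graph
```

A grid-based model such as a CNN would treat `A1` and `A2` as different inputs, whereas a GNN's output should not depend on the arbitrary node ordering.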

Since most objects can be represented as graphs, Graph Neural Networks offer a wide range of possible applications for non-Euclidean data.

Thousands of GNN variants have been created annually due to a spike in interest in GNNs over the past few years. Methods and datasets for evaluating GNNs, on the other hand, have received significantly less attention. Many GNN papers employ the same 5–10 benchmark datasets, mostly made up of easily labeled academic citation networks and molecular datasets. This means that the empirical performance of new GNN variants can only be claimed for a small class of graphs. Recently published studies with rigorous experimental designs cast doubt on the performance rankings of prominent GNN models reported in formative publications, further complicating the situation.

For instance, as noted above, GNN task datasets are re-used across publications, as in many machine learning subfields, to quantify the incremental advances of new designs. However, as seen in NLP and computer vision, this can easily lead to novel architectures overfitting the benchmark datasets over time. The effect is amplified when the primary collection of benchmark graphs shares similar structural and statistical properties.

To address these bottlenecks, Stanford unveiled the Open Graph Benchmark (OGB), an open-source package for assessing GNNs on a handful of massive-scale graph datasets across a range of tasks, allowing for a more uniform GNN experimental design. However, because OGB was sourced from many of the same domains as existing datasets, it does not solve the dataset-variety issue described above.

The Open Graph Benchmark raised the number of nodes in experiment-friendly benchmark citation graphs by more than 1,000 times. In one sense this is entirely natural, as computational capabilities improve and graph-based learning problems become more data-rich. However, while the availability of enormous graphs is critical for evaluating GNN software, platforms, and model scalability, giant graphs are not required to verify GNN accuracy or scientific relevance. As the field's benchmark graphs grow in size, standardized graph datasets for assessing GNN expressiveness become less accessible to the typical researcher.

Furthermore, without access to institution-scale compute resources, investigating GNN hyperparameter-tuning approaches or training variance on big benchmark datasets is almost impossible.

In “GraphWorld: Fake Graphs Bring Real Insights for GNNs,” Google proposes a framework for measuring the performance of GNN architectures on millions of synthetic benchmark datasets, to match the volume and pace of GNN development. Google recommends GraphWorld as a complementary GNN benchmark that allows researchers to investigate GNN performance in regions of graph space that are not covered by popular academic datasets. This is primarily because Google believes that while “GNN benchmark datasets featured in the academic literature are just individual locations on a fully-diverse world of potential graphs, GraphWorld directly generates this world using probability models, tests GNN models at every location on it, and extracts generalizable insights from the results.”

To highlight the inspiration behind GraphWorld, the researchers compare Open Graph Benchmark graphs to a much larger collection (5,000+) of graphs from the Network Repository. While most Network Repository graphs are unlabeled and so cannot be used in standard GNN experiments, the authors found that they represent many graphs that exist in the real world. For both collections, they calculated the clustering coefficient (how interconnected nodes are with their neighbors) and the Gini coefficient of the degree distribution (the inequality among nodes’ connection counts). The Google team discovered that OGB datasets occupy only a small, sparsely populated portion of this metric space.
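Both statistics are straightforward to compute from an adjacency representation. Below is a minimal sketch of each metric on a tiny made-up graph; real analyses would use a library such as NetworkX, but the definitions are the same.

```python
import numpy as np

# Two graph statistics used to compare benchmark collections:
# the Gini coefficient of the degree distribution and the average
# clustering coefficient. The 4-node example graph is made up.

adj = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1},
    3: {0},
}

def degree_gini(adj):
    """Gini coefficient of node degrees (0 = all nodes equally connected)."""
    degrees = np.sort(np.array([len(n) for n in adj.values()], dtype=float))
    n = degrees.size
    ranks = np.arange(1, n + 1)
    return (2 * np.sum(ranks * degrees)) / (n * degrees.sum()) - (n + 1) / n

def avg_clustering(adj):
    """Mean local clustering: fraction of each node's neighbor pairs linked."""
    coeffs = []
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        # Count edges among neighbors (each edge seen twice, so halve it).
        links = sum(1 for u in nbrs for v in adj[u] if v in nbrs) / 2
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(adj)
```

Plotting graph collections in this two-dimensional metric space is what revealed how small a region the OGB datasets cover.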

When utilizing GraphWorld to explore GNN performance on a given task, the researcher first selects a parameterized generator (example below) that can produce graph datasets for stress-testing GNN models on that task. A generator parameter is an input that controls the output dataset’s high-level properties. GraphWorld employs parameterized generators to build populations of graph datasets that are varied enough to put state-of-the-art GNN models to the test. It creates a stream of GNN benchmark datasets by sampling the generator parameter values using parallel computing (e.g., Google Cloud Platform Dataflow), evaluates an arbitrary number of user-selected GNN models (e.g., GCN, GAT, GraphSAGE) on each dataset, and then produces a large tabular dataset that joins graph properties with GNN performance results.
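The sampling step can be sketched as follows. This is a hypothetical illustration, not the actual GraphWorld API: the parameter names and ranges below are invented stand-ins for the knobs of a stochastic-block-model-style generator.

```python
import random

# Hypothetical sketch of GraphWorld-style parameter sampling (not the
# real GraphWorld API). Each sample is one point in generator-parameter
# space, from which one synthetic benchmark dataset would be built.

PARAM_RANGES = {
    "num_nodes": (100, 1000),          # graph size
    "avg_degree": (2.0, 20.0),         # expected node degree
    "p_intra_over_inter": (1.0, 8.0),  # community "clusterability"
}

def sample_generator_params(rng):
    """Draw one set of generator parameters uniformly from the ranges."""
    return {
        "num_nodes": rng.randint(*PARAM_RANGES["num_nodes"]),
        "avg_degree": rng.uniform(*PARAM_RANGES["avg_degree"]),
        "p_intra_over_inter": rng.uniform(*PARAM_RANGES["p_intra_over_inter"]),
    }

rng = random.Random(0)  # fixed seed for reproducibility
# In GraphWorld, samples like these are fanned out over a parallel
# pipeline (e.g., Dataflow): one synthetic dataset plus one GNN
# evaluation per sampled parameter set.
configs = [sample_generator_params(rng) for _ in range(1000)]
```

Because every sampled configuration is independent, the dataset generation and model evaluation steps parallelize trivially, which is what makes the cluster or cloud scale-out straightforward.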

In their paper, Google researchers outline GraphWorld pipelines for node classification, link prediction, and graph classification tasks, each with its own dataset generator. They observed that each pipeline required less time and compute than state-of-the-art experimentation on OGB graphs, suggesting that GraphWorld is affordable for researchers on a tight budget.

According to Google, GraphWorld is cost-effective: it can execute hundreds of thousands of GNN experiments on synthetic data for less than the cost of a single experiment on a large OGB dataset.