Abstract
We present the Open Graph Benchmark (OGB), a diverse set of challenging and
realistic benchmark datasets to facilitate scalable, robust, and reproducible
graph machine learning (ML) research. OGB datasets are large-scale, encompass
multiple important graph ML tasks, and cover a diverse range of domains, from
social and information networks to biological networks, molecular graphs,
source code ASTs, and knowledge graphs. For each dataset, we provide a
unified evaluation protocol using meaningful application-specific data splits
and evaluation metrics. In addition to building the datasets, we also perform
extensive benchmark experiments for each dataset. Our experiments suggest that
OGB datasets present significant challenges of scalability to large-scale
graphs and out-of-distribution generalization under realistic data splits,
indicating fruitful opportunities for future research. Finally, OGB provides an
automated end-to-end graph ML pipeline that simplifies and standardizes the
process of graph data loading, experimental setup, and model evaluation. OGB
will be regularly updated and welcomes input from the community. OGB datasets
as well as data loaders, evaluation scripts, baseline code, and leaderboards
are publicly available at https://ogb.stanford.edu .