## Accelerating Graph Sampling for Graph Machine Learning using GPUs

Abhinav Jangda, Sandeep Polisetty, Arjun Guha, and Marco Serafini

*European Conference on Computer Systems (EuroSys)*, 2021

Representation learning algorithms automatically learn the features of data. In
modern machine learning, the learned representation is typically a deep neural
network (DNN). When the input is graph data, representation learning
algorithms, such as DeepWalk, node2vec, and GraphSAGE, have to first
*sample* the graph to produce input that is suitable for training a DNN.
However, sampling time can be a significant fraction of training time, and
existing graph analytics, mining, and representation learning systems do not
efficiently parallelize sampling.
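To make the sampling step concrete, here is a minimal sketch of DeepWalk-style random-walk sampling over an adjacency-list graph. The graph, function name, and parameters are illustrative assumptions, not taken from the paper:

```python
import random

def random_walks(adj, walk_length, walks_per_node, seed=0):
    """DeepWalk-style sampling: from every node, repeatedly step to a
    uniformly random neighbor, collecting fixed-length walks that later
    serve as training sequences for a DNN."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: end this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Tiny hypothetical graph: a triangle on vertices 0, 1, 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
walks = random_walks(adj, walk_length=4, walks_per_node=2)
```

Even this simple loop shows why sampling can dominate training time: every step is a random memory access into the graph, which is exactly the irregular workload that is cheap to express but hard to run fast.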

Sampling is an “embarrassingly parallel” problem and may appear to lend itself
to GPU acceleration, but the irregularity of graphs makes it hard to use GPU
resources effectively. This paper presents NextDoor, a system designed to
effectively perform graph sampling on GPUs. NextDoor employs a new
*transit-parallelism* approach to graph sampling, which allows load
balancing and caching of edges. NextDoor provides end-users with a high-level
abstraction for writing a variety of graph sampling algorithms. We implement
several graph sampling applications, and show that NextDoor runs them orders of
magnitude faster than existing systems.
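The transit-parallel idea can be sketched outside the GPU setting: instead of assigning one worker per sample, work is grouped by the *transit* vertex whose edge list is being read, so all samples currently at that vertex are expanded together and its edges are loaded once (and, on a GPU, cached). This is a hedged illustration of the grouping, not the paper's implementation:

```python
import random
from collections import defaultdict

def transit_parallel_step(adj, frontier, rng):
    """One sampling step organized by transit vertex: samples whose
    current vertex is the same are grouped, so that vertex's edge list
    is fetched once and reused for all of them."""
    by_transit = defaultdict(list)
    for sample_id, vertex in frontier:
        by_transit[vertex].append(sample_id)
    next_frontier = []
    for transit, sample_ids in by_transit.items():
        edges = adj[transit]  # one edge-list load per transit vertex
        for sid in sample_ids:
            next_frontier.append((sid, rng.choice(edges)))
    return next_frontier

rng = random.Random(0)
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
# Three samples; two sit at vertex 0 and share its edge list.
frontier = [(0, 0), (1, 0), (2, 1)]
next_frontier = transit_parallel_step(adj, frontier, rng)
```

Grouping by transit also gives natural units for load balancing: the work per transit vertex is proportional to the number of samples parked there, which a scheduler can split or merge across GPU threads.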