This blog post introduces a project my teammates and I developed as part of UBC's CPSC 416: Distributed Systems course. Our project, BAG:EL, is a distributed graph processing tool based on the Pregel API that supports node churn while running distributed graph queries.
Distributed graph processing involves representing and operating on large graphs across multiple machines. Pregel's approach to graph processing is a vertex-centric implementation of the Bulk Synchronous Parallel (BSP) model. A great explanation of Pregel's approach can be found here. In summary, Pregel organizes a distributed graph query into a sequence of supersteps. In each superstep, multiple processes concurrently execute a query on different sets of vertices of the graph. Each vertex may send messages to other vertices, and those messages are processed at the next superstep. When a vertex has no messages to send at the end of a superstep, it votes to halt. When all vertices that belong to a process have voted to halt, the process halts as well. This continues until every process executing the distributed query, across all machines, has voted to halt.
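To make the superstep loop concrete, here is a minimal, single-process sketch of the vertex-centric model in Go. The `Vertex`, `Message`, and `runSupersteps` names are illustrative, not our actual API:

```go
package main

// Vertex holds the per-vertex state used across supersteps.
type Vertex struct {
	ID       int
	Value    float64
	Edges    []int // IDs of out-neighbour vertices
	IsActive bool  // false once the vertex has voted to halt
}

// Message carries a value to a destination vertex for the next superstep.
type Message struct {
	To    int
	Value float64
}

// runSupersteps drives the BSP loop: each superstep, every active vertex
// runs compute on its inbox; the loop ends when no vertex sends messages.
// It returns the number of supersteps the query took.
func runSupersteps(vertices map[int]*Vertex,
	compute func(v *Vertex, inbox []Message) []Message) int {
	inboxes := map[int][]Message{}
	steps := 0
	for {
		var outbox []Message
		for id, v := range vertices {
			in := inboxes[id]
			if !v.IsActive && len(in) == 0 {
				continue // halted vertex with no new messages: skip
			}
			v.IsActive = true // incoming messages reactivate a halted vertex
			out := compute(v, in)
			if len(out) == 0 {
				v.IsActive = false // nothing to send: vote to halt
			}
			outbox = append(outbox, out...)
		}
		steps++
		if len(outbox) == 0 {
			return steps // all vertices halted: the query is done
		}
		// Deliver this superstep's messages as next superstep's inboxes.
		inboxes = map[int][]Message{}
		for _, m := range outbox {
			inboxes[m.To] = append(inboxes[m.To], m)
		}
	}
}
```

In the real system the outbox is routed between worker processes over the network, but the halting logic is the same: buffered delivery between supersteps is what makes the model bulk-synchronous.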
What We Built
Our API supports two distributed graph processing algorithms: PageRank and shortest path between two vertices. For the underlying graph, we used Google's web graph dataset from 2002, which contains 875,713 vertices and 5,105,039 edges (75.4 MB on disk). Each vertex consists of a vertex identifier and a list of outgoing edges. Our system consists of three types of nodes (or processes): a coordinator (master) node, a set of worker nodes, and a client node.
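In Go, the per-vertex layout and a simple assignment of vertices to workers might be sketched as follows. Hash-mod partitioning is the default scheme described in the Pregel paper; it appears here purely for illustration and is not necessarily the exact scheme our coordinator uses:

```go
package main

// Vertex mirrors the dataset layout: an identifier plus outgoing edges.
type Vertex struct {
	ID       uint32
	OutEdges []uint32
}

// partitionFor assigns a vertex to one of numWorkers partitions.
// Taking the vertex ID modulo the worker count spreads the web graph's
// vertices roughly evenly without any central lookup table.
func partitionFor(vertexID uint32, numWorkers int) int {
	return int(vertexID) % numWorkers
}
```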
The coordinator node: A dedicated central node that monitors the worker nodes and detects worker failures using fcheck, a custom heartbeat-ack protocol library that we developed. The coordinator periodically instructs the worker nodes to save their query progress to persistent storage; this process, which Pregel calls checkpointing, is used for fault tolerance. When a worker node is detected as failed, the coordinator rolls the remaining workers back to the last checkpoint state saved by all workers and resumes the computation from there.
Worker nodes: The worker nodes are assigned graph partitions (i.e., sets of vertices) by the coordinator node. Workers are responsible for calling a compute function on each of their active vertices.
The client node: In the current version of our system, the coordinator interacts with one client node at a time. The client node provides the coordinator with a specific graph analysis to complete. For example, it may provide two vertices in the dataset and ask for the shortest path between them.
Fault tolerance is achieved through checkpointing. Every n supersteps, the coordinator node instructs the worker nodes to save the current state of their partitions to local storage. In this project, we assumed that worker processes running on a machine can fail, but that the worker machines themselves will not. Without this assumption, checkpoints would have needed to be stored in centralized storage rather than on the worker machines' local disks.
Our project is written in the Go programming language. We deployed each component of our system (the worker nodes, the client node, and the coordinator node) on its own Azure virtual machine, and we used an Azure SQL database to store the graph(s) that queries run on.
The extensions we would like to make to this project include the following:
- Currently, we are using a pre-stored graph for queries. We would like to give users the option to add a new graph for queries and choose from a set of graphs.
- Currently, our system only handles one client query at a time. We would like to allow concurrent clients.
- Currently, we assume that there will be a fixed number of worker processes during a query, and that processes that fail on a worker machine will come back up during the query. We would like to allow a varying number of workers during a query.
If you have any questions about our project or would like to learn more about it, please feel free to reach out to the team!