Why is the graphlab code for pagerank using triple_apply 100x slower than the serial code?

User 1886 | 5/5/2015, 4:57:21 PM

Dear All,

First, I would like to congratulate the developers of graphlab/dato - it seems like a really great tool for solving graph-based problems!

I am a beginner with graphlab and I need your help to understand how to use its full potential.

Here is my problem.

I have developed a new pagerank-based algorithm and I wanted to parallelize it to speed it up. My MacBook Pro has 8 cores, and I also have access to a linux machine with 16 cores.

So I looked at this graphlab code for pagerank using the triple_apply function: https://github.com/dato-code/how-to/blob/master/triple_apply_weighted_pagerank.py and I modified it to fit my pagerank algorithm. I got the same output solution (in terms of values) from the graphlab code and from the serial python code for my pagerank-based algorithm.

The issue is that the graphlab python code is 100x SLOWER than the serial python code.

I thought the issue came from my graphlab code, so I implemented the standard pagerank algorithm in serial python (see below) and compared it with https://github.com/dato-code/how-to/blob/master/triple_apply_weighted_pagerank.py. The graphlab code is still 100x slower than the serial one.

So here are my questions: (1) Why is the graphlab pagerank algorithm slow? (2) How can I make it fast?

Thanks a lot for your help,
Xavier

PS. My serial python code for pagerank:

    for iter in range(10):
        # Pagerank update
        F = (1 - reset_prob) * W * F + reset_prob
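For reference, a self-contained version of that loop (a minimal sketch; it assumes W is an n x n column-stochastic transition matrix stored as a numpy array, and the variable names are illustrative):

    import numpy as np

    def serial_pagerank(W, reset_prob=0.15, num_iter=10):
        # W[i, j]: weight of the edge j -> i, with each column of W
        # normalized so that it sums to 1 (out-degree normalization).
        n = W.shape[0]
        F = np.ones(n)  # graphlab convention: every vertex starts at 1.0
        for _ in range(num_iter):
            # Pagerank update, same form as the loop above
            F = (1 - reset_prob) * W.dot(F) + reset_prob
        return F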

Comments

User 1190 | 5/5/2015, 5:51:14 PM

Hi @graphguy,

The python triple_apply is meant to ease the prototyping of graph analytics algorithms using a single abstraction. It provides a simple way to reason about a graph algorithm, and it is expressive enough to capture a wide range of existing algorithms. You can use SGraph/SFrame to do the data engineering, and express your algorithm as a single function that is applied to each edge in parallel.
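For context, the shape of that abstraction looks roughly like this (a hedged sketch based on the how-to linked above; the toy graph and the field names 'pagerank', 'prev_pagerank', and 'weight' are illustrative assumptions):

    import graphlab as gl

    # Toy weighted graph (illustrative); in practice g is built from your data.
    edges = gl.SFrame({'src': [1, 2, 3], 'dst': [2, 3, 1],
                       'weight': [0.5, 1.0, 1.5]})
    g = gl.SGraph().add_edges(edges, src_field='src', dst_field='dst')
    g.vertices['prev_pagerank'] = 1.0
    g.vertices['pagerank'] = 0.0

    def pagerank_update(src, edge, dst):
        # Called once per edge, in parallel: accumulate the source vertex's
        # weighted contribution into the target vertex.
        dst['pagerank'] += src['prev_pagerank'] * edge['weight']
        return (src, edge, dst)

    # Apply the update over all edges; only 'pagerank' is declared mutated.
    g = g.triple_apply(pagerank_update, mutated_fields=['pagerank'])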

Yes, python triple_apply has overhead, and it is not difficult to beat it with a single-threaded algorithm. In fact, a well-implemented, specialized single-threaded algorithm can beat a distributed system by a fair margin: http://www.frankmcsherry.org/graph/scalability/cost/2015/02/04/COST2.html The benefit of a general-purpose system/framework is that you don't have to implement everything differently. You can write pagerank as a matrix-vector product, but for connected components or community detection you would need different specialized implementations.

Why is the graphlab pagerank slow? To clarify, the native pagerank, i.e. pagerank.create(), is pretty fast. Why is the python triple_apply slow? There are three major overheads:

1. Data is communicated between the main process and the python worker process.
2. Vertex and edge data are converted between the native representation (vector) and python dicts.
3. At the end of each triple_apply, data is written to disk.

How can you make things faster?

1. You can reduce the amount of data being communicated by specifying "input_fields" and "mutated_fields" in triple_apply, as shown in the sketch below.
2. For overheads 2 and 3, we are working on a better abstraction that can eliminate them.
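Concretely, restricting the fields looks like this (a sketch continuing the example above, with the same illustrative field names):

    # Only the listed fields are shipped to the python worker processes,
    # instead of every vertex/edge field on the graph.
    g = g.triple_apply(pagerank_update,
                       mutated_fields=['pagerank'],
                       input_fields=['pagerank', 'prev_pagerank', 'weight'])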

If you are familiar with C++, you can use our SDK with the same triple_apply abstraction. It runs in parallel and eliminates overheads 1 and 2. Example: https://github.com/dato-code/GraphLab-Create-SDK/blob/master/sdk_example/sgraph_weighted_pagerank.cpp

Thanks, -jay


User 1886 | 5/6/2015, 4:21:07 PM

Hi @"Jay Gu",

Thanks a lot for your detailed answer! I really appreciate it.

As you suggested, I would like to use your SDK implementation (https://github.com/dato-code/GraphLab-Create-SDK/blob/master/sdk_example/sgraph_weighted_pagerank.cpp) to eliminate overheads 1 and 2.

Before investing time in the SDK, may I ask how much speedup (10x?) you got with your SDK implementation of pagerank compared to the python/graphlab implementation (https://github.com/dato-code/how-to/blob/master/triple_apply_weighted_pagerank.py)?

Thanks again very much for your answer,
Xavier


User 1190 | 5/7/2015, 4:38:45 PM

At least 20x


User 1886 | 5/15/2015, 6:35:53 AM

Thanks, Jay Gu!
Xavier