Why is there the "Edge consistency model"?

User 121 | 6/23/2014, 2:19:21 PM

Hi all,

After waiting a long time for an answer on http://forum.graphlab.com/discussion/229/consistency-models#latest,

I read some of the source code to learn more about the consistency models, especially the following file: https://github.com/graphlab-code/graphlab/blob/master/src/graphlab/engine/asyncconsistentengine.hpp

I am trying to deduce the meaning of this statement: "The default mode is "factorized" consistency in which only individual gathers/applys/scatters are guaranteed to be consistent, but this can be strengthened to provide full mutual exclusion"

  1. For the edge consistency model: if vertices V1 and V2 have an edge between them, then the gather(), apply(), and scatter() at V1 must all happen either AFTER or BEFORE the gather(), apply(), and scatter() at V2 (i.e. the GAS on V1 must be OUT OF PHASE with the GAS on V2). Is this correct?

  2. For the factorized consistency model: if vertices V1 and V2 have an edge between them, then gather() canNOT happen simultaneously at both vertices, and the same for apply() and scatter(). Is this correct? (I think not, because apply() can be executed simultaneously even on adjacent vertices.)

  3. Full mutual exclusion means nothing other than the edge consistency model. Is this correct?

Apart from the deduction of this statement, I would like to know: why is the edge consistency model still there? Which algorithms / special cases would need it? I observed that it uses the distributed Chandy-Misra solution to the Dining Philosophers problem, modeling vertices as philosophers and edges as forks. In every function of asyncconsistentengine.hpp, if edge consistency is used, it costs a lot of time to acquire/release locks and to wait. When is this stricter consistency justifiable?

I will appreciate your help! [Note: I am a Master's student at VU Amsterdam, doing my thesis on a comparison between Giraph and GraphLab.]


User 20 | 6/25/2014, 4:47:33 PM

1) Edge consistency essentially means that when any vertex is executing (the entire Gather-Apply-Scatter phase), no neighbors are executing. Basically the execution process is:

    run_vertex(v) {
      acquire lock on all adjacent edges
      run Gather, Apply, Scatter
      release lock on all adjacent edges
    }

This implicitly guarantees that the GAS is "atomic" locally, i.e. while the current vertex is running the GAS, no neighboring value can be changed.
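To make that concrete, here is a minimal C++ sketch of the edge-consistency discipline (the names and data layout are mine for illustration, not GraphLab's actual engine code): one lock per vertex, and the whole GAS runs while the vertex and all of its neighbors are locked, acquired in ascending-id order to avoid deadlock.

```cpp
#include <algorithm>
#include <mutex>
#include <vector>

// Illustrative sketch, NOT GraphLab's API: one plain lock per vertex.
struct Graph {
    std::vector<std::mutex> locks;       // one lock per vertex
    std::vector<std::vector<int>> adj;   // adjacency lists
    std::vector<double> data;            // vertex data
};

// Edge consistency: lock v and ALL of its neighbors for the entire GAS.
void run_vertex_edge_consistent(Graph& g, int v) {
    std::vector<int> scope = g.adj[v];
    scope.push_back(v);
    std::sort(scope.begin(), scope.end()); // canonical order => no deadlock
    for (int u : scope) g.locks[u].lock();

    double sum = 0;                        // gather: read all neighbors
    for (int u : g.adj[v]) sum += g.data[u];
    g.data[v] = sum;                       // apply: neighbors cannot change here
    // (scatter would also run under the same locks)

    for (auto it = scope.rbegin(); it != scope.rend(); ++it)
        g.locks[*it].unlock();
}
```

Because every lock in the vertex's scope is held across all three phases, no neighbor can start its own GAS until this one finishes, which is exactly the "atomic locally" guarantee described above.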

2) Factorized consistency means that Gather-Apply-Scatters can arbitrarily interleave; we only provide the very local guarantee that during the gather/scatter of an edge, no one else can be reading/writing to it. The execution process looks a little like:

    run_vertex(v) {
      for each edge e adjacent to v:
        acquire lock on e
        gather(e)
        release lock on e
      acquire lock on v
      apply(v)
      release lock on v
      ... scatter is the same as gather ...
    }
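A corresponding C++ sketch of factorized consistency (again, illustrative names only, and in a real engine the per-edge locks would be created up front rather than lazily in a map): each gather holds only the one edge's lock, and the apply holds only the vertex's own lock, so neighboring GAS phases may freely interleave.

```cpp
#include <algorithm>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

// Illustrative sketch, NOT GraphLab's API: a lock per edge and per vertex.
struct FGraph {
    std::vector<std::vector<int>> adj;
    std::vector<double> vdata;
    std::vector<std::mutex> vlocks;
    // Edge locks keyed by (min(u,v), max(u,v)). NOTE: lazy insertion into a
    // std::map is not thread-safe; a real engine would preallocate these.
    std::map<std::pair<int, int>, std::mutex> elocks;

    std::mutex& edge_lock(int u, int v) {
        return elocks[{std::min(u, v), std::max(u, v)}];
    }
};

void run_vertex_factorized(FGraph& g, int v) {
    double sum = 0;
    for (int u : g.adj[v]) {               // gather: lock one edge at a time
        std::lock_guard<std::mutex> lk(g.edge_lock(v, u));
        sum += g.vdata[u];
    }
    {                                      // apply: lock only v itself
        std::lock_guard<std::mutex> lk(g.vlocks[v]);
        g.vdata[v] = sum;
    }
    // scatter would lock each edge again, exactly like the gather loop
}
```

Note that between two gathers on different edges of v, a neighbor is free to run its own apply, which is precisely the weaker guarantee that distinguishes factorized from edge consistency.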

Stricter consistency can be very helpful for certain algorithms. For instance, Gibbs sampling (to do it correctly, at least) formally requires full edge consistency. When implementing certain more complex graph-theoretic algorithms (we encountered a case with parallel junction tree growing, and, I speculate, certain max-flow algorithms), it is very helpful to ensure that while you are running a GAS, none of your neighbors are changing.

User 121 | 6/26/2014, 11:04:53 AM

Thank you Yucheng. In the async consistent engine, I did not see any edge locks (as there are in the synchronous engine); however, the code locks BOTH ends of the edge (in the performgather() function). Since vertex_data is const in gather(), why is it necessary that the vertex is locked? Is it because: "V1 and V2 have an edge between them. V1 is doing gather() and V2's apply() overlaps with this gather(): this should not cause a problem, because apply() will not change the edge data. But if V2's scatter() overlaps with V1's gather(), then we are in trouble."? But then the edge should be locked, not the vertex. Am I missing some semantics, or internals of the locking? Please point me to the right source file to look at. Thank you again.

User 20 | 6/26/2014, 6:16:20 PM

The "edge-locks" is more of a conceptual idea. There are many possible ways to implement it. Basically, you don't want an "apply" to modify either end of the vertex while you are reading it. For instance if I am running GAS on vertex v, and gathering on edge v--u, I do not want "u" to change while I am reading it. So technically I need at minimum a read-lock on v and a read-lock on u.

However, for short functions you will find that rwlocks are much slower than plain mutexes, so the async engine just works by acquiring a lock on v and a lock on u. (At least we have not really encountered contention issues; as long as the graph is big enough, a collision is not likely.)
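In rough C++, the "lock both ends with plain mutexes" approach above could look like this (a sketch with invented names, not the engine's actual code); the two vertex locks are taken in a fixed ascending-id order so that two threads gathering the same edge from opposite ends cannot deadlock.

```cpp
#include <mutex>
#include <vector>

// Illustrative sketch, NOT GraphLab's API: gather one edge v--u by locking
// both endpoint vertices with plain mutexes, in ascending-id order.
double gather_edge(std::vector<std::mutex>& vlocks,
                   const std::vector<double>& vdata, int v, int u) {
    int lo = v < u ? v : u;
    int hi = v < u ? u : v;
    std::lock_guard<std::mutex> l1(vlocks[lo]);   // fixed order => no deadlock
    std::lock_guard<std::mutex> l2(vlocks[hi]);
    return vdata[u];   // u's value cannot be applied-to while both are held
}
```

The trade-off stated above is that a plain mutex serializes concurrent gathers on the same vertex that an rwlock would allow, but for very short critical sections the cheaper lock usually wins.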

Yet another low-contention alternative is to make a copy of vertex u and then just lock the edge. But then you have to pay for the copy.
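A sketch of that copy-based alternative (again, names are mine for illustration): snapshot the neighbor's data under a brief lock, then run the gather against the private copy with the lock already released, so contention on the vertex lasts only as long as the copy.

```cpp
#include <mutex>
#include <vector>

// Illustrative sketch, NOT GraphLab's API: copy u's data under a short
// lock, then gather from the snapshot without holding any vertex lock.
struct VData { double value; /* real vertex data may be much larger */ };

double gather_from_copy(std::vector<std::mutex>& vlocks,
                        const std::vector<VData>& vdata, int u) {
    VData snapshot;
    {
        std::lock_guard<std::mutex> lk(vlocks[u]);  // brief lock: copy only
        snapshot = vdata[u];
    }
    return snapshot.value;  // gather works on the copy; lock already released
}
```

The copy may be stale by the time the gather uses it, which is acceptable for the factorized model but is exactly what full edge consistency forbids.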

User 121 | 6/28/2014, 10:18:05 AM

For gather(), we are not allowed to change the vertex data, but edge data can be changed (according to the prototype of the gather() function). So what is the point of locking u and v, when locking only the edge u-v would give more granular locking? Or does locking u mean locking both the vertex data and the edge data of u?

User 20 | 6/30/2014, 5:39:52 PM

The locks I am talking about here are local locks associated with each vertex, so a lock on u and v means acquiring those two particular locks.

While gather is not allowed to change the vertex data, apply is. Basically you want to avoid the following condition:

Thread 1:
  t = 1: start GAS on vertex v
  t = 2: gather on edge v-u

Thread 2:
  t = 0: start GAS on vertex u
  t = 1: finish gathering
  t = 2: apply on vertex u

The gather in thread 1 will collide with the apply in thread 2 if thread 1 does not acquire locks on the vertices. As mentioned above, at minimum, gathers must acquire read locks on the vertices at both ends of the edge being gathered, and applies must acquire write locks on their vertices.

User 121 | 7/2/2014, 9:42:03 AM

But this distinction between read and write locks is not that explicit in the code. For example, in asyncconsistentengine.hpp, both the gather and the apply use vertexlocks[index].lock(), which internally is a vector of simple spinlocks. pthreadtools.hpp actually has an rwlock class, which I could not locate elsewhere in the source tree. Am I missing something? Conceptually your explanation makes sense, but I cannot make the same deduction from the code.