Memory allocation error in label propagation

User 2212 | 9/1/2015, 9:31:10 PM

I tried to run the label propagation algorithm on my user-to-item graph, which has about 2,000,000 edges:

m = gl.label_propagation.create(g, label_field='label', undirected=True)

PROGRESS: Num classes: 875358
PROGRESS: #labeled_vertices: 875358  #unlabeled_vertices: 924076
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/venv-dato/local/lib/python2.7/site-packages/graphlab/toolkits/graph_analytics/label_propagation.py", line 269, in create
    params = main.run('label_propagation', opts, verbose)
  File "/home/ubuntu/venv-dato/local/lib/python2.7/site-packages/graphlab/toolkits/main.py", line 64, in run
    (success, message, params) = unity.run_toolkit(toolkit_name, options)
  File "graphlab/cython/cy_unity.pyx", line 70, in graphlab.cython.cy_unity.UnityGlobalProxy.run_toolkit
  File "graphlab/cython/cy_unity.pyx", line 74, in graphlab.cython.cy_unity.UnityGlobalProxy.run_toolkit
MemoryError: std::bad_alloc

How can I resolve this issue, please?

Comments

User 2212 | 9/2/2015, 12:29:56 AM

Is there a limit on graph size for label propagation?


User 1592 | 9/3/2015, 6:59:03 AM

Hi Lucia,

We assume there is a small, fixed set of labels that you want to assign to your unlabeled graph nodes. For example, the labels could be sport, fashion, politics, etc.

In your case you have around 900K labeled nodes, and each one has its own unique label. This creates a vector of 900K label probabilities for every node, and you run out of memory.
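To get a sense of the scale, here is a rough back-of-the-envelope sketch; the 8 bytes per probability and fully dense storage are assumptions for illustration, not details of the actual implementation:

# Rough memory estimate for dense per-vertex label-probability vectors.
# Assumes 8 bytes (one double) per probability; real storage may differ.
num_labeled   = 875358            # from the PROGRESS output above
num_unlabeled = 924076
num_labels    = 875358            # every labeled vertex has a unique label

num_vertices     = num_labeled + num_unlabeled
bytes_per_vertex = num_labels * 8                 # ~7 MB per vertex
total_gb = num_vertices * bytes_per_vertex / 1e9  # ~12,600 GB (~12.6 TB)
print(total_gb)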

My suggestion is to narrow the labels down to a small number of distinct values before running label propagation.
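As a rough sketch of what that could look like (the category_of mapping below is hypothetical, and it assumes your 'label' field holds fine-grained values you can group into a few broad categories; vertices whose label is None or unmapped stay unlabeled):

import graphlab as gl

# Hypothetical mapping from fine-grained labels to a few broad categories.
category_of = {'item_123': 'sport', 'item_456': 'fashion'}   # placeholder values

vertices = g.get_vertices()       # SFrame with '__id' and the 'label' column
vertices['label'] = vertices['label'].apply(
    lambda lbl: category_of.get(lbl, None), dtype=str)

# Rebuild the graph with the coarsened labels and rerun label propagation.
g2 = gl.SGraph(vertices=vertices, edges=g.get_edges(),
               vid_field='__id', src_field='__src_id', dst_field='__dst_id')
m = gl.label_propagation.create(g2, label_field='label', undirected=True)

With only a handful of distinct labels, the per-vertex probability vectors become tiny and the memory use drops accordingly.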

Thanks