User 761 | 10/30/2014, 10:49:05 AM
Hey! I'm trying to parallelise cross-validation using the Python multiprocessing module.
Here is a code snippet:

    pool = Pool(processes=3)  # where 3 is the number of folds
    output = pool.map(worker, [sf, sf, sf])

Where 'sf' is an SFrame containing the data and 'worker' is a function that executes one fold of cross-validation. However, I get the following error:

    Traceback (most recent call last):
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 114, in run
        self._target(*self._args, **self._kwargs)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 102, in worker
        task = get()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py", line 376, in get
        return recv()
      File "cysframe.pyx", line 51, in graphlab.cython.cysframe.UnitySFrameProxy.__cinit__
    TypeError: __cinit__() takes at least 1 positional argument (0 given)

Could this be related to a pickling issue? Maybe SFrames can't be pickled? Any help would be greatly appreciated.
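Looking at the traceback, the failure happens in the worker's recv() call, that is, while the task is being unpickled, so a round trip through pickle outside the pool might be a quick way to confirm this. A minimal sketch (the helper name `survives_pickle_round_trip` is made up for illustration and is not part of graphlab):

```python
import pickle


def survives_pickle_round_trip(obj):
    """Return (True, None) if obj can be pickled and unpickled again,
    otherwise (False, the exception that was raised)."""
    try:
        # multiprocessing pickles each task in the parent process and
        # unpickles it in the worker, so both directions are exercised.
        pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))
        return True, None
    except Exception as exc:
        return False, exc
```

Calling it on `sf` should show whether the SFrame itself is the problem, independent of the pool.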
Also, are there any better ways to go about parallelising the cross-validation? How do you guys do it?
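One workaround I can imagine, sketched under assumptions: send each worker only small, picklable arguments (a file path and a fold index) and let it load the data itself. Everything here is hypothetical: `load_data` stands in for opening the SFrame from disk, and the mean predictor stands in for a real model.

```python
from multiprocessing import Pool


def split_fold(n_rows, n_folds, fold):
    """Return (train_idx, test_idx); every n_folds-th row is held out."""
    test_idx = [i for i in range(n_rows) if i % n_folds == fold]
    train_idx = [i for i in range(n_rows) if i % n_folds != fold]
    return train_idx, test_idx


def load_data(path):
    """Hypothetical loader: one float per line.
    A real worker would open the SFrame from `path` here instead."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]


def run_fold(task):
    """Run one fold. `task` is a plain tuple, so it pickles without trouble."""
    path, n_folds, fold = task
    data = load_data(path)
    train_idx, test_idx = split_fold(len(data), n_folds, fold)
    # Placeholder model: predict the training mean, score by mean squared error.
    mean = sum(data[i] for i in train_idx) / len(train_idx)
    mse = sum((data[i] - mean) ** 2 for i in test_idx) / len(test_idx)
    return fold, mse


def cross_validate(path, n_folds=3):
    """Run all folds in parallel, one process per fold."""
    tasks = [(path, n_folds, fold) for fold in range(n_folds)]
    pool = Pool(processes=n_folds)
    try:
        return pool.map(run_fold, tasks)
    finally:
        pool.close()
        pool.join()
```

On platforms that start workers by spawning a fresh interpreter, `cross_validate` would need to be called from under an `if __name__ == "__main__":` guard.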
(Any plans to include a cross-validation module in the future?)