User 761 | 12/8/2014, 11:11:26 AM
Hi! I have some basic questions about how GraphLab uses multiple cores or multiple nodes (in a Hadoop cluster) to do computation.
Case A: Single machine, multiple cores
Case B: Multiple nodes in a cluster
1. Does GraphLab parallelise all computations that can be done in parallel, i.e. does it use all available cores/nodes?
2. Also, say I have a classification task. I run experiments and build a model, which I save. When I load the saved model and run it against new data to generate labels using classifier.evaluate, is that computation done in parallel (assuming multiple cores/nodes are available)? Suppose I'm running 100 new samples through the saved classifier: all 100 computations are essentially independent. I ask because we will potentially be running tens of millions of samples through the saved classifier to generate labels, in which case a non-parallel implementation would be very expensive.
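
To make "independent computations" in question 2 concrete, here is a minimal sketch of the kind of parallel labelling I have in mind. It does not use any GraphLab API: `toy_predict` is a hypothetical stand-in for the saved classifier, and `split_into_chunks`, `label_chunk` and `label_in_parallel` are names made up for this illustration. It uses a thread pool from the Python standard library only to keep the example short; for CPU-bound pure-Python scoring a process pool would probably be needed to actually use several cores.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_predict(sample):
    """Hypothetical stand-in for a saved classifier: labels one sample."""
    return 1 if sum(sample) > 0 else 0


def split_into_chunks(samples, n_chunks):
    """Split samples into at most n_chunks contiguous, order-preserving chunks."""
    size = max(1, -(-len(samples) // n_chunks))  # ceiling division
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def label_chunk(chunk):
    """Label every sample in one chunk; chunks do not depend on each other."""
    return [toy_predict(sample) for sample in chunk]


def label_in_parallel(samples, n_workers=4):
    """Label all samples by handing one chunk to each worker."""
    chunks = split_into_chunks(samples, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in chunk order, so labels line up with samples.
        results = pool.map(label_chunk, chunks)
        return [label for chunk_labels in results for label in chunk_labels]


if __name__ == "__main__":
    new_samples = [[i - 50, 1] for i in range(100)]
    labels = label_in_parallel(new_samples)
    print(len(labels))
```

Because no sample's label depends on any other sample, the chunks can be scored in any order, on any core or node, and then concatenated. My question is whether GraphLab already does something equivalent to this internally.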