Comparison of SGD implementation in PowerGraph and Create

User 532 | 12/8/2014, 10:02:30 PM

Hello all,

I am experimenting with both PowerGraph and GraphLab Create using the SGD algorithm, and I am getting very different results. In particular, the RMSE is much lower (about half) when I use Create. I am not an ML expert, so I wanted to use Create to find the correct values for the model parameters that I need to use for SGD in PowerGraph. However, there is no one-to-one correspondence between the command-line arguments of PowerGraph and the model parameters in Create.

1) Is the implementation of SGD in Create the same as in PowerGraph?
2) What do the gamma and lambda command line args from PowerGraph correspond to in Create?

---- PowerGraph ----
--gamma=XX Gradient descent step size
--lambda=XX Gradient descent regularization
--step_dec=XX Multiplicative step decrease. Should be between 0.1 to 1. Default is 0.9
--D=X Feature vector width. Common values are 20 - 150.

---- Create ----
Settings
additional iterations if unhealthy: 5
num factors: 8
init random sigma: 0.01
max iterations: 50
regularization type: normal
side data factorization: 1
regularization: 1e-06
sgd step size: 0.0
sgd trial sample proportion: 0.125
binary target: 0
nmf: 0
track exact loss: 0
sgd trial sample minimum size: 10000
sgd convergence interval: 4
solver: auto
sgd convergence threshold: 1e-05
sgd max trial iterations: 5
step size decrease rate: 0.75
linear regularization: 0.0

Thank you! Vicky

Comments

User 89 | 12/9/2014, 7:24:03 AM

Hello Vicky,

I'm excited to see your comparisons!

The biggest difference between the two is the scaling of the regularization parameter (lambda in PowerGraph and just "regularization" in GL Create). See <a href="http://forum.graphlab.com/discussion/608/the-way-i-trained-my-factorization-model-graphlab-create-0-9-1-doesnt-seem-to-work-in-1-0-why">my earlier post</a> for more information.

Other than that, sgd_step_size (the gamma in PowerGraph) is chosen automatically in GLC, but if you specify a value > 0, the two should behave similarly. The underlying algorithms are slightly different because of how PowerGraph schedules the updates, but you should see similar behavior between the two. The step_size_decrease_rate (step_dec in PowerGraph, where the default is 0.9; the default is 0.75 in GLC) is another parameter you can tune. These should be comparable. The number of factors (--D=X in PowerGraph and num_factors in GLC) should also be comparable.

Thanks! -- Hoyt
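
The parameter correspondence described in this thread can be sketched in Python as below. This is only an illustration: the underscored Create option names are assumed from the settings printout above, the helper names `translate_flags` and `decayed_step` are hypothetical, and the example values are made up. The regularization value is flagged rather than converted, since the thread says only that its scaling differs between the two systems. The decay helper assumes the step size is multiplied by the rate once per iteration, which the thread does not confirm for either system.

```python
# PowerGraph flag -> GraphLab Create option, as described in this thread.
# The Create-side names are assumed from the settings printout.
POWERGRAPH_TO_CREATE = {
    "gamma": "sgd_step_size",
    "lambda": "regularization",
    "step_dec": "step_size_decrease_rate",
    "D": "num_factors",
}


def translate_flags(powergraph_flags):
    """Rename PowerGraph flags to the assumed Create option names.

    Returns the renamed options plus a list of options whose values
    should not be copied over unchanged (here, the regularization,
    because its scaling differs between the two systems).
    """
    create_options = {}
    needs_rescaling = []
    for flag, value in powergraph_flags.items():
        name = POWERGRAPH_TO_CREATE[flag]
        create_options[name] = value
        if flag == "lambda":
            needs_rescaling.append(name)
    return create_options, needs_rescaling


def decayed_step(initial_step, decrease_rate, iteration):
    """Step size after `iteration` multiplicative decreases (assumed schedule)."""
    return initial_step * decrease_rate ** iteration


# Hypothetical PowerGraph settings, for illustration only.
options, rescale = translate_flags(
    {"gamma": 0.001, "lambda": 0.001, "step_dec": 0.9, "D": 20}
)
print(options)
print("check scaling for:", rescale)
```

Under the assumed schedule, a decrease rate of 0.75 shrinks the step size much faster than 0.9, so matching this setting is worth doing before comparing RMSE between the two systems.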