There are some challenging issues inherent in how our current model handles side information, and it seems you may be hitting some of them. We are working on a much better way to model side information internally, which would hopefully resolve some of these issues, but it's not currently in GLC.
First, a surprising amount of information is contained in the user and item interaction pairs. Typically, the model finds the side information much less useful than one would expect -- many types of side information don't tend to help that much.
Second, while it may indeed overfit -- I've definitely seen that happen -- what actually ends up happening is that the model is more complicated, and this tends to result in a lot more local minima in the problem. A local minimum is a set of factor values which may not be close to the best values, but where there is no obvious way for the optimization to improve them. Thus it gets stuck at a value that is not really optimal, and your model is worse than the MF model.
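To make the local-minimum point concrete, here is a toy sketch in plain Python. It has nothing to do with the actual GLC solver: the one-dimensional loss function, the learning rate, and the starting points are all made up for illustration. The same gradient descent routine ends up in a different place depending only on where it starts.

```python
# Toy illustration only: a made-up 1-D loss with two valleys,
# not the objective that the recommender actually optimizes.

def loss(x):
    return x**4 - 3 * x**2 + x

def grad(x):
    # Derivative of the loss above.
    return 4 * x**3 - 6 * x + 1

def descend(x, learning_rate=0.01, steps=2000):
    # Plain gradient descent: repeatedly step against the gradient.
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

stuck = descend(2.0)    # settles in the shallower valley, near x = 1.13
best = descend(-2.0)    # settles in the deeper valley, near x = -1.30

print("start at  2.0 -> x = %.3f, loss = %.3f" % (stuck, loss(stuck)))
print("start at -2.0 -> x = %.3f, loss = %.3f" % (best, loss(best)))
```

At both end points the gradient is essentially zero, so the optimizer sees nothing left to improve, yet the first run has a noticeably higher loss. A model with more parameters tends to have more valleys like the first one.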
The main thing that I have found helpful in addressing this is to use categorical features instead of numerical side features -- if you have numerical side features, try binning them using one of the feature transformers, which puts the values into bins instead of working with each value directly. Often the best model is one with all numerical features binned, side_data_factorization set to False, and linear_regularization adjusted to a value comparable to the regularization value. Still, don't be surprised if your model is not substantially better, as much of that information is already contained in the user interaction data.
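In case it helps to see what binning does, here is a minimal plain-Python sketch of equal-frequency binning. It is not the GLC feature transformer (use that in practice); the helper names `quantile_edges` and `to_bin_label` and the sample ages are made up for illustration. Each numeric value is replaced by a string label, so the model treats it as a category.

```python
from bisect import bisect_right

def quantile_edges(values, num_bins):
    # Hypothetical helper: pick cut points so that each bin
    # receives roughly the same number of values.
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // num_bins] for i in range(1, num_bins)]

def to_bin_label(value, edges, prefix="bin"):
    # The bin index is the number of cut points at or below the value.
    return "%s_%d" % (prefix, bisect_right(edges, value))

# Made-up numeric side feature (user ages).
ages = [18, 22, 25, 31, 38, 44, 52, 67]

edges = quantile_edges(ages, num_bins=4)
age_bins = [to_bin_label(a, edges, prefix="age") for a in ages]

for age, label in zip(ages, age_bins):
    print(age, "->", label)
```

The resulting string column would replace the original numeric column in the side data, so the model learns one value per bin rather than a single weight on the raw number.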
Hope that helps!