weird problems while loading models from disk; kernel dies unexpectedly while calculating mean()

User 5322 | 6/23/2016, 10:05:05 PM

Hi, I ran into a bunch of weird issues while working on an ML project. I don't even know how to report them concisely, but I'll try.

I was testing a kNN model with various options for the training set. I already had a working setup, which was slow but worked fine. To speed up my training/evaluation process, I created a grid of training/testing samples. Again, everything worked fine. Then I introduced a single modification: I needed to filter out of my training slices the examples of a class (let's say class=X) when the counts for that class are low. This is where all the trouble started.

Problem-1. I could filter out these low-count examples, train a model, and even make predictions with it. However, when I tried to calculate mean() for my output variables, my kernel died unexpectedly. Exactly the same code worked just fine if I didn't attempt to filter out low-count examples.

Problem-2. After losing many hours of work to Problem-1, I started saving intermediate results to disk. That is, I saved the prepared slices of my training sample after filtering out low-count examples (a lengthy process that takes about two hours in my case). I saved the trained model for each slice of the training sample from the previous step. I ran the model predictions and saved the result (an SFrame) just before calculating mean() on it.

Weirdness-1: this SFrame happens to be too large for what I'm doing.

Weirdness-2: the kernel dies if I try to calculate mean() on this SFrame right after it is created, but if I start from scratch in a new IPython notebook and load this SFrame from disk, all of the calculations run just fine (including those that caused the crash in the first place).

Next, I wanted to re-run my model on another test sample, so I started a new notebook and tried to load the model that I had just saved and successfully run in the previous steps (before the kernel died). However, I could not load this model. Here is what I get:

RuntimeError: Runtime Exception. Unable to load model from /AnalysisPass3/tmpfolder/models/modelfor_slice245/fb6c959d-da62-4bcd-b8c7-54136c5cab5e: vector

I'm stuck and puzzled at this stage. I would very much appreciate any advice or insight into what might be happening. I must admit that I'm just a beginner with Python and might be doing something really stupid in my code, but I got this far without major problems, and this particular issue is totally mysterious to me. I can send my notebook if anyone is willing to look into it.

I apologize for the long post. Thanks, Alex.

Comments

User 940 | 6/24/2016, 8:07:50 PM

Hi @expandinguniverse76,

I'm sorry you're having issues. Sending a notebook and trying to reproduce the issue is probably the easiest thing to do here.

Cheers! -Piotr


User 5322 | 6/24/2016, 9:56:06 PM

Hi @piotr,

Thanks for offering your help. The SFrame I'm working with is pretty big: 1.3 GB. So I'm wondering what the most practical way to debug would be. Should I send my actual notebook (the one I was running), or prepare a smaller version with all of the code and functions used, plus comments on what happens when?

Cheers, Alex.


User 940 | 6/24/2016, 10:16:33 PM

Typically, the simplest code that reproduces the issue is also the easiest to debug. So if possible, a smaller version would be awesome. Otherwise, DM me and maybe we can use something like Google Drive.
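As a starting point for such a smaller version, here is a rough sketch of the steps from the original report (filter out low-count classes, then compute a mean) on a couple of hundred synthetic rows. It uses plain Python lists as a stand-in for an SFrame, and the column names `label` and `score`, the helper names, and the threshold are all made up for illustration. In a real reproducer, each step would be swapped for the corresponding call from the notebook.

```python
import random
from collections import Counter


def make_rows(n_rows, seed=0):
    """Build a small synthetic data set (hypothetical columns: label, score)."""
    rng = random.Random(seed)
    # Class "X" is deliberately rare so that the low-count filter has work to do.
    labels = ["A"] * 6 + ["B"] * 3 + ["X"]
    return [{"label": rng.choice(labels), "score": rng.random()}
            for _ in range(n_rows)]


def drop_low_count_classes(rows, min_count):
    """Keep only rows whose class occurs at least min_count times."""
    counts = Counter(row["label"] for row in rows)
    return [row for row in rows if counts[row["label"]] >= min_count]


def mean_score(rows):
    """Mean of the score column: the step that crashed in the report."""
    if not rows:
        raise ValueError("no rows left after filtering")
    return sum(row["score"] for row in rows) / len(rows)


if __name__ == "__main__":
    rows = make_rows(200)
    filtered = drop_low_count_classes(rows, min_count=40)
    print(len(rows), len(filtered), mean_score(filtered))
```

If the crash still shows up once the real calls are swapped in on a small sample, the notebook is small enough to share directly. If it does not, increasing the row count step by step should help narrow down where it starts.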

Thanks for your patience. Cheers! -Piotr