Unable to get imagenet model

User 2255 | 9/23/2015, 9:56:41 PM

I am consistently having difficulty downloading the ImageNet-trained deep learning network at http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45.

This occurs whether I am trying to <a href="https://dato.com/products/create/docs/graphlab.toolkits.deeplearning.html">train the MNIST digits classifier</a>, or when directly using <code>graphlab.load_model('http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45')</code>.
Invariably, I get a stack trace similar to the one below. Am I missing a step?

  • I have updated my graphlab-create to version 1.6;
  • I activate my dato-env virtualenv (and the prompt indicates that it's working);
  • When I first invoke a graphlab command, I get a licensing message that suggests that all should be well:

<pre>[INFO] This non-commercial license of GraphLab Create is assigned to wilber@ssl.berkeley.edu and will expire on September 15, 2016. For commercial licensing options, visit https://dato.com/buy/.</pre>

  • When following the link to http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45, I get this response from Amazon's server:

<pre><Error>
<Code>NoSuchKey</Code>
<Message>The specified key does not exist.</Message>
<Key>deeplearning/imagenet_model_iter45</Key>
<RequestId>13ACE684D6873BCF</RequestId>
<HostId>iPODMoSMKuYl ... fCPJ+giG3YIEC0Bu2dmpwtnuLTHTUqlLCjBqcBofJC4</HostId>
</Error></pre>

The error when using GraphLab:
<pre>extractor = gl.feature_engineering.DeepFeatureExtractor(feature = 'image', model='auto')

IOError                                   Traceback (most recent call last)
<ipython-input-3-162501278798> in <module>()
      1 extractor = gl.feature_engineering.DeepFeatureExtractor(feature = 'image',
----> 2                                                         model='auto')

/home/wilber/work/Galvanize/dato-env/lib/python2.7/site-packages/graphlab/toolkits/feature_engineering/_deep_feature_extractor.pyc in __init__(self, feature, model, output_column_name)
    147             "http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45"
    148             import graphlab as gl
--> 149             self._state['model'] = gl.load_model(model_path)
    150         if type(self._state['model']) is not _NeuralNetClassifier:
    151             raise ValueError("Model parameters must be of type NeuralNetClassifier " +

/home/wilber/work/Galvanize/dato-env/lib/python2.7/site-packages/graphlab/toolkits/_model.pyc in load_model(location)
     61     else:
     62         # Not a ToolkitError so try unpickling the model.
---> 63         unpickler = gl_pickle.GLUnpickler(location)
     64
     65         # Get the version

/home/wilber/work/Galvanize/dato-env/lib/python2.7/site-packages/graphlab/_gl_pickle.pyc in __init__(self, filename)
    450         else:
    451             if not _os.path.exists(filename):
--> 452                 raise IOError('%s is not a valid file name.' % filename)
    453
    454     # GLC 1.3 Pickle file

IOError: http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45 is not a valid file name.</pre>

Comments

User 940 | 9/24/2015, 5:51:46 PM

Hi @mw0,

I believe this is a certificate issue on the AWS side, see :

https://github.com/aws/aws-cli/issues/1499

and

https://github.com/certifi/python-certifi/issues/26

Running <code>pip install certifi==2015.04.28</code> should help.
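To double-check that the downgrade took effect inside the dato-env, here is a quick sketch (assuming certifi exposes <code>__version__</code>, as releases of this era do):

<pre># Confirm which certifi the active environment picks up.
import certifi
print certifi.__version__   # expect: 2015.04.28
print certifi.where()       # path to the CA bundle used for HTTPS requests</pre>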

Let me know if you have any further issues!

Cheers! -Piotr


User 2255 | 9/24/2015, 7:30:32 PM

The pip install command appeared to work with no difficulties.

Unfortunately, still no luck. Here is the simplest possible example of the failure (but other methods for fetching the ImageNet model fail equally):

<pre>import graphlab as gl
pretrained_model = gl.load_model('http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45')</pre>

The messages that result:

<pre>[INFO] This non-commercial license of GraphLab Create is assigned to wilber@ssl.berkeley.edu and will expire on September 15, 2016. For commercial licensing options, visit https://dato.com/buy/.

[INFO] Start server at: ipc:///tmp/graphlab_server-16229 - Server binary: /home/wilber/work/Galvanize/dato-env/lib/python2.7/site-packages/graphlab/unity_server - Server log: /tmp/graphlab_server_1443122262.log
[INFO] GraphLab Server Version: 1.6

PROGRESS: Downloading http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45/dir_archive.ini to /var/tmp/graphlab-wilber/16229/3999df56-6454-4dee-806d-72ab275a5f7b.ini
PROGRESS: Downloading http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45/objects.bin to /var/tmp/graphlab-wilber/16229/3a092213-3706-44cb-9b75-852c073a2201.bin

IOError                                   Traceback (most recent call last)
<ipython-input-2-fc40dc0e9dab> in <module>()
----> 1 pretrained_model = gl.load_model('http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45')

/home/wilber/work/Galvanize/dato-env/lib/python2.7/site-packages/graphlab/toolkits/_model.pyc in load_model(location)
     61     else:
     62         # Not a ToolkitError so try unpickling the model.
---> 63         unpickler = gl_pickle.GLUnpickler(location)
     64
     65         # Get the version

/home/wilber/work/Galvanize/dato-env/lib/python2.7/site-packages/graphlab/_gl_pickle.pyc in __init__(self, filename)
    450         else:
    451             if not _os.path.exists(filename):
--> 452                 raise IOError('%s is not a valid file name.' % filename)
    453
    454     # GLC 1.3 Pickle file

IOError: http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45 is not a valid file name.</pre>

Most frustrating: it's not until it has downloaded 244 MB to /var/tmp/graphlab-wilber that I get the error. (Fail slowly.)


User 1178 | 9/24/2015, 8:22:18 PM

Hi,

I think there may be some problem with downloading files from your machine. Can you try the following script and see what exception you get:

<pre>import graphlab.connect.main as glconnect
from graphlab.util import _make_internal_url

location = 'http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45'
_internal_url = _make_internal_url(location)
glconnect.get_unity().load_model(_internal_url)</pre>

Also, please check your awscli version with:

<code>pip freeze | grep awscli</code>

We have found that awscli version 1.6.2 works and newer versions do not always work, so downgrading to that version may help too.
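The same check can also be done from inside Python; this is a small sketch using the standard pkg_resources API:

<pre># Report the installed awscli version from within Python.
import pkg_resources
print pkg_resources.get_distribution('awscli').version   # expect: 1.6.2</pre>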

Thanks! Ping


User 2255 | 9/24/2015, 8:31:01 PM

I am wondering if this problem is a result of the recent upgrade to graphlab-create v1.6, which I did install. I have a tar file for graphlab-create v1.5, but don't know the steps to uninstall v1.6 ...


User 2255 | 9/25/2015, 12:08:29 AM

Hi Ping,

<code>pip freeze | grep awscli</code>

returns:

<code>awscli==1.6.2</code>

I tried your approach above and got a new error:

<code>RuntimeError: Runtime Exception. Unable to load model from http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45: CUDA Error: out of memory</code>

What is strange about this is that I had previously been able to load the model, and extract high-level feature weights using

<code>data['feature_weights'] = model.extracted_features(data, 21)</code>

Any suggestions? I am using a GeForce GT 750M GPU, which has 2 GB of memory (384 cores). Shouldn't that be enough?


User 940 | 9/25/2015, 4:51:08 PM

Hi @mw0 ,

This is indeed strange. When I run <code>nvidia-smi</code>, I see we're using 2.5 GB of memory.

What's your output for <code>nvidia-smi</code> before and after trying to load the model?

If you uninstall GraphLab Create with <code>pip uninstall graphlab-create</code> and install version 1.5.2 with <code>pip install graphlab-create==1.5.2</code>, what's the output of <code>nvidia-smi</code> before and after trying to load the model?
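If it's easier to capture from within the same session, here is a minimal sketch (assuming the nvidia-smi binary is on your PATH) that records the output from Python:

<pre># Capture nvidia-smi output from the running Python session.
# Assumes nvidia-smi is on the PATH; run once before and once after load_model.
import subprocess
print subprocess.check_output(['nvidia-smi'])</pre>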

Thanks! -Piotr


User 3475 | 3/11/2016, 12:55:51 PM

Hi! How can I disable the GPU? I am on a laptop, and loading the ImageNet model fails:

<pre>extractor = graphlab.feature_engineering.DeepFeatureExtractor(features = 'image', model='auto')

graphlab/cython/cy_unity.pyx in graphlab.cython.cy_unity.UnityGlobalProxy.load_model()
graphlab/cython/cy_unity.pyx in graphlab.cython.cy_unity.UnityGlobalProxy.load_model()
RuntimeError: Runtime Exception. Unable to load model from http://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_model_iter45: CUDA Error: out of memory</pre>

I wasn't expecting it not to fit in 1 GB, so now I need deep learning to run on CPU only. Thanks!


User 15 | 3/11/2016, 9:57:52 PM

Hi @visoft,

There is an easy way to set CPU-only when you're training your own neural net, but when using a built-in one there isn't. This is definitely a case we missed, and we'll work to rectify it. Unfortunately, for the DeepFeatureExtractor case, the only workaround is to supply you with a model trained with the device set to "cpu". We've uploaded one here: https://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_iter_45_cpu.zip

Unzip this, use <code>load_model</code> within GLC on the unzipped folder, and then pass the returned model to the "model" parameter of the DeepFeatureExtractor. Let us know if this works!
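In code, the workaround looks roughly like this (a sketch only; the folder name cpu_nn.gl is assumed to be what the zip unpacks to):

<pre># Sketch of the CPU-model workaround; assumes the zip unpacked to ./cpu_nn.gl
import graphlab as gl

cpu_model = gl.load_model('cpu_nn.gl')      # load from the unzipped folder
extractor = gl.feature_engineering.DeepFeatureExtractor(
    features='image', model=cpu_model)       # pass the model instead of 'auto'</pre>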

Evan


User 3475 | 3/15/2016, 12:48:15 PM

Well, I needed a quick fix, so I just reinstalled Dato in another env without the GPU upgrade. Then it worked. I haven't yet tried the imagenet_iter_45_cpu.zip weights.


User 15 | 3/15/2016, 8:51:31 PM

@visoft That works too. Didn't want to make you uninstall/reinstall things.


User 2334 | 4/11/2016, 2:07:56 PM

Hi Evan! I have the same problem. I downloaded the zip file, but when I try to load the model I get the following error:

<pre>IOError                                   Traceback (most recent call last)
<ipython-input-21-9f0871474e5b> in <module>()
----> 1 imagenet_model = graphlab.load_model('C:\Users\lueck.TA-13\Anaconda\cpu_nn.gl')

C:\Users\lueck.TA-13\Anaconda\lib\site-packages\graphlab\toolkits\_model.pyc in load_model(location)
     61     else:
     62         # Not a ToolkitError so try unpickling the model.
---> 63         unpickler = gl_pickle.GLUnpickler(location)
     64
     65         # Get the version

C:\Users\lueck.TA-13\Anaconda\lib\site-packages\graphlab\_gl_pickle.pyc in __init__(self, filename)
    482             pickle_filename = _os.path.join(filename, "pickle_archive")
    483             if not _os.path.exists(pickle_filename):
--> 484                 raise IOError("Corrupted archive: Missing pickle file %s." % pickle_filename)
    485             if not _os.path.exists(_os.path.join(filename, "version")):
    486                 raise IOError("Corrupted archive: Missing version file.")

IOError: Corrupted archive: Missing pickle file C:\Users\lueck.TA-13\Anaconda\cpu_nn.gl\pickle_archive.</pre>


User 4 | 4/12/2016, 10:10:48 PM

Hi @ete, can you verify that the path C:\Users\lueck.TA-13\Anaconda\cpu_nn.gl\ exists and is a directory, and is readable by the user running the Python interpreter? If so, can you verify that there is a file inside called pickle_archive, again readable by the user running the Python interpreter?


User 2334 | 4/13/2016, 6:14:17 AM

Hi Zach,

the directory is accessible by the user, but there is no pickle_archive file inside, only dir_archive.ini and objects.bin.

Thanks


User 4 | 4/13/2016, 5:49:03 PM

Hi @ete, it sounds like the cpu_nn.gl directory may be corrupted or partially saved. Where did this saved model come from? I can try to reproduce this issue on my machine if you have an HTTP URL to the saved model.


User 2334 | 4/14/2016, 6:04:30 AM

I simply used the link above from Evan: https://s3.amazonaws.com/dato-datasets/deeplearning/imagenet_iter_45_cpu.zip


User 4 | 4/14/2016, 11:41:31 PM

Hi @ete, it turns out this error message is misleading and I've opened a bug internally so we can fix it. In this case, the issue is not that pickle_archive is missing (in fact, for this saved model, it should not be there), but that dir_archive.ini is either missing or corrupt. Can you verify the MD5 sum of the download? The MD5 sums should be as follows:

<pre>MD5 (imagenet_iter_45_cpu.zip)    = fef060acb0ec7352923955e50298daab
MD5 (cpu_nn.gl/dir_archive.ini)   = d868339072321f9240f40de3a0ca6b04
MD5 (cpu_nn.gl/objects.bin)       = 8c19089ac7b95ebd22a2904f5f67ce6b</pre>
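If an md5 utility isn't handy on Windows, the sums can be computed with a few lines of Python (a small sketch using the standard hashlib module; adjust the paths to where you unzipped):

<pre># Compute an MD5 checksum in chunks so large files need not fit in memory.
import hashlib

def md5sum(path, chunk_size=1 << 20):
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), ''):
            h.update(chunk)
    return h.hexdigest()

print md5sum('imagenet_iter_45_cpu.zip')     # expect fef060acb0ec7352923955e50298daab
print md5sum('cpu_nn.gl/dir_archive.ini')    # expect d868339072321f9240f40de3a0ca6b04
print md5sum('cpu_nn.gl/objects.bin')        # expect 8c19089ac7b95ebd22a2904f5f67ce6b</pre>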


User 2334 | 4/15/2016, 6:42:26 AM

Hi @Zach! All MD5 sums are correct.


User 4 | 4/15/2016, 9:22:52 PM

Hi @ete, I have verified that on Windows with those MD5 sums, I am able to load the model successfully.

I think there are a variety of situations on your filesystem that could make the file unreadable by GraphLab Create and produce the error you are seeing. Please check for all of these:

  • The directory (cpu_nn.gl) may not be visible to the process running GraphLab Create because of insufficient read or execute permissions on the directory itself, a parent directory, or an ancestor directory.
  • The dir_archive.ini file inside the directory may not be visible to or readable by the process running GraphLab Create because of insufficient read permissions on the file itself.
  • The path containing this directory may not be the same between Windows Explorer and the Python process -- for instance, Windows Explorer may automatically expand (show as directories) zip files when they still need to be manually extracted to be used by another process.

To help diagnose these, you could try the following simple file opening code in Python:

<pre>f = open('C:\Users\lueck.TA-13\Anaconda\cpu_nn.gl\dir_archive.ini', 'r')
print f.read()
f.close()</pre>

It should print the contents of the dir_archive.ini file, which look like this:

<pre>[archive]
version=1
num_prefixes=2
[metadata]
contents=model
[prefixes]
0000=dir_archive.ini
0001=objects.bin</pre>

If you get an error opening the file, or an error reading the contents, it may indicate whether the issue is with the existence of the file at that path and/or its permissions, independent of GraphLab Create.
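The permission cases in the list above can also be probed directly with the standard os module (a sketch; the path is the one from your traceback, and note that os.access is only an approximation on Windows):

<pre># Probe existence and readability independent of GraphLab Create.
import os

path = r'C:\Users\lueck.TA-13\Anaconda\cpu_nn.gl'
ini = os.path.join(path, 'dir_archive.ini')

print os.path.isdir(path)                  # directory exists and is visible?
print os.access(path, os.R_OK | os.X_OK)   # directory readable and traversable?
print os.path.isfile(ini)                  # dir_archive.ini present?
print os.access(ini, os.R_OK)              # file readable?</pre>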


User 2334 | 4/18/2016, 6:49:26 AM

Hi @Zach ! Thanks for your suggestions! I have no problem to read the dir_archive.ini file. Just to be on the save site, i will try again from home this evening. Sometimes our firewall might create problems. Cheers