graphlab.SFrame.apply() and NLTK: ImportError: No module named transport.adapter

User 2565 | 2/1/2016, 8:13:48 PM

Something goes wrong when I use NLTK within SFrame.apply(). The NLTK code works fine by itself. The example should be easy to reproduce, although you may first need to install the corpus: run import nltk followed by nltk.download(), then select 'wordnet'.

import graphlab as gl
from nltk.corpus import wordnet as wn
import nltk

def find_synonyms(words):
    synonyms = list()
    for word in words.split(' '):
        syn = [syn_dog.lemma_names() for syn_dog in wn.synsets(word)]
        unique_syn = list(set([item for sublist in syn for item in sublist]))
        synonyms.append( [u_s.replace('_', ' ') for u_s in unique_syn] )
    return synonyms

search_term = gl.SFrame({'id': [1, 2, 3],
                         'search_term': ['blue dog', 'hot caravan', 'crumble pie']})

# this works fine
for term in search_term['search_term']:
    print find_synonyms(term)

# this doesn't!
search_term_synonyms = search_term['search_term'].apply(find_synonyms)
print search_term_synonyms

Here is the output:

$ python synonyms.py
[INFO] GraphLab Server Version: 1.8.1
[[u'blue-blooded', u'naughty', u'drab', u'juicy', u'blue sky', u'bluish', u'down', u'gentle', u'blueing', u'risque', u'Amytal', u'amobarbital sodium', u'blue', u'bluing', u'disconsolate', u'blasphemous', u'blue devil', u'patrician', u'aristocratic', u'aristocratical', u'blueness', u'drear', u'sorry', u'wild blue yonder', u'low', u'dispirited', u'blue angel', u'puritanical', u'dreary', u'gamy', u'downhearted', u'gloomy', u'dismal', u'low-spirited', u'spicy', u'blueish', u'grim', u'dark', u'down in the mouth', u'puritanic', u'profane', u'depressed', u'downcast', u'racy', u'dingy', u'blue air', u'gamey'], [u'go after', u'chase after', u'pawl', u'dog', u'wiener', u'tag', u'frankfurter', u'hound', u'click', u'chase', u'andiron', u'hot dog', u'tail', u'Canis familiaris', u'give chase', u'wienerwurst', u'bounder', u'domestic dog', u'track', u'frank', u'trail', u'blackguard', u'weenie', u'frump', u'firedog', u'detent', u'dog-iron', u'cad', u'heel', u'hotdog']]
[[u'blistering', u'red-hot', u'raging', u'live', u'hot', u'spicy'], [u'caravan', u'train', u'wagon train', u'van']]
[[u'collapse', u'tumble', u'fall apart', u'decay', u'crumble', u'dilapidate', u'crumple', u'break down'], [u'Proto-Indo European', u'pie', u'PIE']]
Traceback (most recent call last):
  File "synonyms.py", line 21, in <module>
    search_term_synonyms = search_term['search_term'].apply(find_synonyms)
  File "/Users/huguesfo/anaconda/lib/python2.7/site-packages/graphlab/data_structures/sarray.py", line 1693, in apply
    return SArray(_proxy=self.__proxy__.transform(fn, dtype, skip_undefined, seed))
  File "/Users/huguesfo/anaconda/lib/python2.7/site-packages/graphlab/cython/context.py", line 49, in __exit__
    raise exc_type(exc_value)
ImportError: No module named transport.adapter
[INFO] Stopping the server connection.

Thanks for looking into this.

Comments

User 2535 | 2/1/2016, 9:34:03 PM

Hi @hugues,

Could you try doing the nltk imports inside the function?

Let us know how it goes.
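
A minimal sketch of that suggestion, reusing find_synonyms from the question with only the wordnet import moved into the function body. The explanation in the comments is an assumption rather than something established in this thread: a function-level import runs wherever the function is actually called, so it does not depend on the module-level wn object travelling along with the function.

```python
def find_synonyms(words):
    """Return one list of WordNet synonyms per space-separated word."""
    # Import inside the function so the import happens at call time,
    # in whatever process ends up executing this function (assumption:
    # apply() may run it outside the main script's process).
    from nltk.corpus import wordnet as wn

    synonyms = []
    for word in words.split(' '):
        # Gather lemma names across every synset of the word, de-duplicated.
        lemmas = set()
        for synset in wn.synsets(word):
            lemmas.update(synset.lemma_names())
        synonyms.append([lemma.replace('_', ' ') for lemma in lemmas])
    return synonyms

# Usage, as in the question (needs graphlab and the 'wordnet' corpus):
# search_term_synonyms = search_term['search_term'].apply(find_synonyms)
```

The behaviour should match the original function; the only intended difference is where the name wn gets bound.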


User 2565 | 2/1/2016, 11:03:19 PM

Thanks @jon for the fast answer and suggestion. Putting the line "from nltk.corpus import wordnet as wn" inside the function worked. Hurray! The other line, "import nltk", could be left outside. It went fine with my "real" dataset as well.

If I may ask, what's going on here?

Cheers,
Hugues