User 4751 | 4/14/2016, 8:19:54 PM
I performed some text analysis with NLTK inside a Spark RDD and want to export it into an SFrame. This is what I'm doing.
from pyspark import SparkContext
import graphlab as gl
from graphlab import SFrame

sc = SparkContext()
rdd = sc.parallelize([1, 2, 3])
sf = gl.SFrame.from_rdd(rdd, sc)
sf
I get the usual messy error from Spark (Py4JJavaError... etc.; screenshot attached).
My current workaround is to save the RDD as text files with Spark and then read them back with SFrame.read_csv, which is not efficient.
Running GLC and PySpark. Versions: GLC v1.8.5, PySpark 1.4.1.