Parquet file format and SFrames

User 3177 | 2/4/2016, 6:43:38 PM

Bear with me on this one, as I'm new to SFrames and Parquet.

Is there any way to import data stored in Parquet format directly into an SFrame? I can't find any documentation on it, and I have a number of use cases where my data is stored in Parquet format.

I believe I could go from Parquet -> Spark DataFrame -> SFrame, but I was hoping there was a more direct way.

Comments

User 15 | 2/4/2016, 9:37:53 PM

Hi @dwolcott,

No, SFrame doesn't currently support direct conversion from Parquet files. The Parquet -> Spark DataFrame -> SFrame path you mentioned is the best way to do it for now.
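
For illustration, here is a minimal sketch of that indirect route. It is not an official recipe. It assumes PySpark and the open-source `sframe` package are installed (GraphLab Create users would use `graphlab.SFrame` instead), and the helper name `parquet_to_sframe` and its parameters are hypothetical. It hops through pandas using `DataFrame.toPandas()`, which collects everything onto the Spark driver, so it only suits data that fits in driver memory.

```python
def parquet_to_sframe(sql_context, path, sframe_cls=None):
    """Load a Parquet file through Spark and return it as an SFrame.

    sql_context: a pyspark.sql.SQLContext (anything exposing .read.parquet).
    path:        location of the Parquet file or directory.
    sframe_cls:  class used to build the result. Defaults to sframe.SFrame;
                 the package name is an assumption, swap in graphlab.SFrame
                 if that is what you have installed.
    """
    if sframe_cls is None:
        # Imported lazily so the function can be defined without sframe present.
        import sframe
        sframe_cls = sframe.SFrame

    # Step 1: Parquet -> Spark DataFrame.
    spark_df = sql_context.read.parquet(path)

    # Step 2: Spark DataFrame -> pandas DataFrame.
    # This pulls all rows onto the driver, so it is only for modest data sizes.
    pandas_df = spark_df.toPandas()

    # Step 3: pandas DataFrame -> SFrame (the SFrame constructor accepts one).
    return sframe_cls(pandas_df)


# Example usage (the file name is a placeholder):
#   from pyspark import SparkContext
#   from pyspark.sql import SQLContext
#   sc = SparkContext()
#   sf = parquet_to_sframe(SQLContext(sc), "my_data.parquet")
```

For data too large to collect on the driver, a distributed Spark-to-SFrame conversion would be needed instead of the pandas hop.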

We've discussed Parquet support, but it hasn't been prioritized yet. We'll note your interest in our future planning. We'd also accept contributions to SFrame along these lines: the source is out there, and the work just needs to be done. If you or anyone else has the bandwidth or interest to get to this before we do, let me know! I'd be happy to assist as much as possible.

Evan