How to use GLC/SFrames for Data in a database

User 5313 | 6/21/2016, 10:01:40 PM


I have a question about analyzing data from a database.

I know that I can easily create SFrames from a csv file etc.

However, the data I want to analyze sits in a database, not in a file.

What methods are people using when the data they want to analyze with graphlab is in a database? What is the usual/graphlab recommended method?

I have had two strategies for dealing with this so far:

1) Use pandas to create a csv file from the data in the database and create an SFrame from that csv. (not good)

2) Write my own function that takes data from the database (mongodb) using a python library (pymongo) and creates an SFrame myself. This works, but I thought there is probably an easier way of doing this?
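For reference, the second strategy could look roughly like the sketch below. It is only an illustration, not a GraphLab-recommended recipe: `documents_to_columns` is a hypothetical helper name, and the host, database, and collection names in the comments are placeholders. It assumes flat documents and relies on the SFrame constructor accepting a dict of equal-length lists; nested documents and ObjectId values would need extra handling.

```python
def documents_to_columns(documents, fields=None):
    """Turn an iterable of dict-like documents into a dict of equal-length columns.

    Hypothetical helper: fields missing from a document are filled with None.
    """
    rows = [dict(doc) for doc in documents]
    if fields is None:
        # Collect every key in first-seen order, skipping MongoDB's "_id".
        fields = []
        for row in rows:
            for key in row:
                if key != "_id" and key not in fields:
                    fields.append(key)
    return {field: [row.get(field) for row in rows] for field in fields}


# Intended use with pymongo and graphlab (placeholder names, not run here):
#   import pymongo, graphlab
#   client = pymongo.MongoClient("mongodb://<host>:27017")
#   cursor = client["mydb"]["mycollection"].find()
#   sf = graphlab.SFrame(documents_to_columns(cursor))

sample = [{"_id": 1, "user": "a", "score": 3}, {"_id": 2, "user": "b"}]
columns = documents_to_columns(sample)
```

Building columns first keeps the database code separate from the SFrame construction, so the same helper works with any cursor that yields dicts.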

I will appreciate any help, and opinions etc. Thanks a lot!


User 1592 | 6/22/2016, 5:17:01 AM

Hi, it is actually much easier than you imagined, as we support direct reading and writing from databases, as explained here:

User 5313 | 7/8/2016, 10:05:43 AM

Hi @DannyBickson

I just saw your reply. Thanks a lot. A few issues:

  1. The link you posted does not work (perhaps due to the name change from Dato to Turi?). I am guessing you were pointing here:

These instructions seem to be written for sql. "[ODBC] remains one of the most universal ways to communicate with SQL databases"

Any help you can give me on working with mongodb? Anyone?

I currently have an IP address for my company's mongo server and would not know how to set up ODBC on my own. A quick Google search for ODBC for mongodb was not very helpful either, even assuming graphlab would support it. A set of instructions similar to those above for ODBC with SQL would be helpful.


PS: is there a way to be notified of replies to these posts? I would like to be notified when someone responds to my question.

User 5313 | 7/9/2016, 12:51:19 PM

Any response?