I have a question about analyzing data from a database.

I know that I can easily create SFrames from a CSV file and similar sources.

However, the data I want to analyze sits in a database, not in a file.

What methods do people use when the data they want to analyze with graphlab is in a database? What is the usual, or graphlab-recommended, method?

I have had two strategies for dealing with this so far:

1) Use pandas to create a CSV file from the data in the database and create an SFrame from that CSV. (not good)
2) Write my own function that pulls data from the database (mongodb) using a Python library (pymongo) and builds an SFrame myself. This works, but I suspect there is an easier way of doing this?
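
For reference, the second strategy amounts to roughly the sketch below. The names `documents_to_columns` and `load_collection_as_sframe`, and the `uri`, `database`, `collection` and `fields` parameters, are made up for illustration. It assumes pymongo and graphlab are installed and that every requested field holds a type an SFrame can store (so no raw ObjectId values).

```python
def documents_to_columns(documents, fields):
    """Pivot an iterable of dict-like documents into {field: [values, ...]}.

    Fields missing from a document are recorded as None.
    """
    columns = {field: [] for field in fields}
    for doc in documents:
        for field in fields:
            columns[field].append(doc.get(field))
    return columns


def load_collection_as_sframe(uri, database, collection, fields):
    """Hypothetical helper: read the given fields of a collection into an SFrame."""
    # Imported here so documents_to_columns can be used without either library.
    import pymongo
    import graphlab

    client = pymongo.MongoClient(uri)
    try:
        # Ask the server for the requested fields only.
        projection = {field: 1 for field in fields}
        if "_id" not in fields:
            projection["_id"] = 0  # _id is returned by default unless excluded
        cursor = client[database][collection].find({}, projection)
        columns = documents_to_columns(cursor, fields)
    finally:
        client.close()
    # Build the SFrame from a dict of column name -> list of values.
    return graphlab.SFrame(columns)
```

Calling it would look something like `load_collection_as_sframe("mongodb://<host>:27017", "mydb", "events", ["user", "score"])`, where the database, collection and field names are placeholders. Everything is held in memory before the SFrame is built, so this only suits collections that fit in RAM.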

Hi, it is actually much easier than you imagined, as we support reading from and writing to databases directly, as explained here:

Hi @DannyBickson

I just saw your reply. Thanks a lot. A few issues:

  1. The link you posted does not work (perhaps due to the name change from Dato to Turi?). I am guessing you were pointing here:

These instructions seem to be written for SQL databases: "[ODBC] remains one of the most universal ways to communicate with SQL databases"

Any help you can give me on working with mongodb? Anyone?

I currently have only an IP address for my company's mongo server and would not know how to set up ODBC on my own. A quick Google search for ODBC for mongodb was not very helpful either, even assuming graphlab supports it. A set of instructions for mongodb, similar to the ODBC/SQL ones above, would be helpful.


PS: is there a way to be notified of replies to these posts? I would like to be notified when someone responds to my question.

