Plotting Data

User 2353 | 10/3/2015, 12:20:01 AM

Hello guys, how are you?

Well, I was testing new functions and applications for price prediction and noticed the following:

    import graphlab

    # Importing data
    all_data = graphlab.SFrame('data.gl/')

    # Print
    all_data.show(view="Scatter Plot", x="feature1", y="feature2")

Every time I run the function to display the data (based on feature1 and feature2), the points plotted in the graph change. I'm confused, because the imported data is always the same.

Can anyone help?

Comments

User 4 | 10/5/2015, 6:18:53 PM

Hi @cottalucas, this behavior is due to the scatter plot taking a random sample of the data before plotting. With more than 1,000 points, a scatter plot becomes less useful for seeing individual points, so we sample down to 1,000 before plotting. Because a new sample is drawn on each call, the plotted points change even though the underlying data does not.
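
To illustrate the effect, here is a minimal sketch in plain Python. It is not GraphLab's actual implementation: the helper name `draw_plot_sample` and the data are made up for illustration, and only the 1,000-point cap comes from the explanation above. Drawing a fresh random sample from the same data gives a different subset each time.

```python
import random

def draw_plot_sample(rows, max_points=1000, rng=None):
    """Return at most max_points rows chosen at random (hypothetical helper)."""
    rng = rng or random
    if len(rows) <= max_points:
        return list(rows)
    return rng.sample(rows, max_points)

# Made-up data standing in for (feature1, feature2) pairs.
data = [(i, i * 2) for i in range(50000)]

first = draw_plot_sample(data)
second = draw_plot_sample(data)

# Same input, but the two draws will almost certainly pick different rows.
print(len(first), len(second))
print(first == second)
```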

If you want to see all points aggregated (binned) over ranges, use Heat Map rather than Scatter Plot. This type of plot bins values over two axes and takes all data points into account.
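
As a rough sketch of what binning means here (plain Python, not how the Heat Map view is actually computed), each point is assigned to a cell of a 2D grid and the cells are counted. Every row contributes, so the result is the same on every run. The function name `bin_2d`, the bin count, and the data are assumptions for illustration.

```python
from collections import Counter

def bin_2d(points, x_range, y_range, bins=10):
    """Count points per cell of a bins x bins grid.

    Assumes every point lies inside the given ranges.
    """
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    counts = Counter()
    for x, y in points:
        # Map each coordinate to a bin index; clamp the top edge into the last bin.
        ix = min(int((x - x_min) / (x_max - x_min) * bins), bins - 1)
        iy = min(int((y - y_min) / (y_max - y_min) * bins), bins - 1)
        counts[(ix, iy)] += 1
    return counts

# Made-up data standing in for (feature1, feature2) pairs.
points = [(i % 100, (i * 7) % 100) for i in range(5000)]
grid = bin_2d(points, (0, 100), (0, 100))

# Every point lands in exactly one cell, so the counts add up to the row count.
print(sum(grid.values()))  # 5000
```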

If you want to see a consistent sample in Scatter Plot, you can sample first, then plot:

    sampled = all_data.sample(1000 / float(len(all_data)))
    sampled.show(view="Scatter Plot", x="feature1", y="feature2")
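
One caveat with this approach: if the data has fewer than 1,000 rows, the computed fraction exceeds 1, so it may help to clamp it. The sketch below uses a hypothetical helper `sample_fraction` and a plain-Python stand-in for fraction-based sampling (`sample_rows`, also made up) to show that a fixed seed yields the same subset on every run.

```python
import random

def sample_fraction(n_rows, target=1000):
    """Fraction of rows to keep so that about `target` rows remain (at most 1.0)."""
    if n_rows == 0:
        return 0.0
    return min(1.0, target / float(n_rows))

def sample_rows(rows, fraction, seed):
    """Keep each row with probability `fraction`, using a seeded generator."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]

rows = list(range(50000))            # made-up stand-in for the SFrame rows
frac = sample_fraction(len(rows))    # 0.02
a = sample_rows(rows, frac, seed=42)
b = sample_rows(rows, frac, seed=42)
print(a == b)  # True: same seed, same subset
```

With GraphLab, the equivalent would presumably be to pass the clamped fraction, and a fixed `seed` if `SFrame.sample` accepts one in your version, before calling `show`. Treat that as an assumption to check against the documentation.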