Why is running a for loop over an SFrame extremely slow?

User 5246 | 5/29/2016, 8:17:22 AM

Hi, I created an SFrame with two SArray columns ('hotelcluster' and 'besthotel'). It looks like this:

+---------------+--------------------------------+
|  hotelcluster |           besthotel            |
+---------------+--------------------------------+
|      18       |          [68.0, 70.0]          |
|      25       |             [25.0]             |
|      24       |             [24.0]             |
|      46       | [36.0, 29.0, 81.0, 46.0, 5.0]  |
|      29       | [36.0, 29.0, 81.0, 46.0, 5.0]  |
|      46       | [36.0, 29.0, 81.0, 46.0, 5.0]  |
|      36       | [36.0, 29.0, 81.0, 46.0, 5.0]  |
|      59       | [36.0, 29.0, 81.0, 46.0, 5.0]  |
|       6       | [36.0, 29.0, 81.0, 46.0, 5.0]  |
|      36       | [36.0, 29.0, 81.0, 46.0, 5.0]  |
|      28       |             [28.0]             |
|      95       |             [95.0]             |
|      95       |             [95.0]             |
|      77       |             [77.0]             |
|      55       |             [55.0]             |
|      56       | [39.0, 41.0, 70.0, 98.0, 79.0] |
|      81       |              None              |
|      82       | [57.0, 62.0, 36.0, 12.0, 81.0] |
+---------------+--------------------------------+

I am trying to loop through the SFrame to check, for each row, whether hotelcluster is in besthotel, using the code below:

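count = 0  # running total of rows where hotel_cluster is in best_hotel
length = len(SFrame_name)
for i in range(length):
    # Skip rows that have no best_hotel list
    if SFrame_name['best_hotel'][i] is None:
        continue
    if float(SFrame_name['hotel_cluster'][i]) in SFrame_name['best_hotel'][i]:
        count += 1
    # Progress report every 2000 rows
    if i % 2000 == 0:
        print('Read {} lines...'.format(i))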

It runs extremely slowly, but it does produce a result.

But when I convert the SFrame to a list using list(SFrame_name) and then do almost the same thing, I get the result instantaneously. What's wrong with my code?

Comments

User 5159 | 5/30/2016, 2:11:55 AM

Looping over an SFrame is not efficient. Consider using a lambda function instead of looping.
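A rough sketch of what that could look like, assuming the columns are named 'hotel_cluster' and 'best_hotel' as in the question's code. The helper names is_hit and count_hits and the sample rows are made up for illustration. SFrame.apply passes each row to the function as a dict, so the per-row check can be written and tried on plain dicts first. The slowdown in the original loop is probably because every SFrame_name[column][i] lookup is a separate random access into the column, whereas apply (or list(SFrame_name)) goes through the rows once.

```python
def is_hit(row):
    """Return 1 if the row's hotel_cluster appears in its best_hotel list, else 0."""
    best = row['best_hotel']
    if best is None:
        # Rows without a best_hotel list never count as a hit
        return 0
    return int(float(row['hotel_cluster']) in best)


# Plain dicts stand in for SFrame rows here (illustrative data taken from
# the table in the question), so this part runs without the SFrame library.
sample_rows = [
    {'hotel_cluster': 18, 'best_hotel': [68.0, 70.0]},
    {'hotel_cluster': 25, 'best_hotel': [25.0]},
    {'hotel_cluster': 81, 'best_hotel': None},
    {'hotel_cluster': 46, 'best_hotel': [36.0, 29.0, 81.0, 46.0, 5.0]},
]
sample_count = sum(is_hit(row) for row in sample_rows)


def count_hits(sf):
    """Count matching rows of an SFrame `sf` in one pass, without indexing by i.

    Assumes `sf` has the 'hotel_cluster' and 'best_hotel' columns.
    apply() returns an SArray of 0/1 values, and sum() adds them up.
    """
    return sf.apply(is_hit, dtype=int).sum()
```

With the real data this would be called as count = count_hits(SFrame_name). An inline version, SFrame_name.apply(lambda row: ..., dtype=int).sum(), should behave the same way; the named function is just easier to test.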