tfidf bug?



Today I encountered some really strange behaviour in the function graphlab.textanalytics.tfidf. I created a very simple dataset consisting of three documents:

	[{'s1': 3.0}, {'s2': 4.0}, {'s1': 5.0}]

After computing the tfidf scores, I got

	[{'s1': 0.0}, {'s2': 4.394449154672438}, {'s1': 0.0}]

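According to the documentation, the tfidf score is given by the following formula: tf(w, d) * log(N / f(w)), where tf(w, d) is the count of word w in document d, N is the number of documents, and f(w) is the number of documents that contain w.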


So for the second document I expect 4.0 * log(3.0/1.0) = 4.394449154672439, which matches the output, but for the first one I expect 3.0 * log(3.0/2.0) = 1.2163953243244932, not 0.0.
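
For reference, here is a minimal sketch in plain Python of the computation I expected. It assumes the score is the raw count times the natural log of (number of documents / number of documents containing the word), as in the arithmetic above. The helper name expected_tfidf is hypothetical and not part of GraphLab Create.

```python
import math


def expected_tfidf(docs):
    """Return count * log(N / df) for every word of every bag-of-words dict.

    Hypothetical helper for checking results by hand; not a GraphLab API.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each word occurs.
    doc_freq = {}
    for doc in docs:
        for word in doc:
            doc_freq[word] = doc_freq.get(word, 0) + 1

    # float() keeps the division exact under Python 2 as well.
    return [
        dict(
            (word, count * math.log(float(n_docs) / doc_freq[word]))
            for word, count in doc.items()
        )
        for doc in docs
    ]


docs = [{'s1': 3.0}, {'s2': 4.0}, {'s1': 5.0}]
scores = expected_tfidf(docs)
print(scores)
```

With the three documents above, this gives a non-zero score for 's1' in both the first and third documents, which is what I expected the library to return as well.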

Am I missing something, or is this a bug?


Hi ziky,

Thanks for using GraphLab Create. This is a known issue and we've addressed it in the latest version of GraphLab Create (1.4) which is to be released very, very soon (hopefully before the end of the week). Sorry for the inconvenience.


Ok, thanks.