Autotagging Hacker News Posts

User 2 | 4/27/2015, 6:36:22 AM

        <div class="EmbeddedContent"><img src="https://dato.com/images/dato-logo-stacked-1200x630.png" class="LeftAlign" /><strong>Autotagging Hacker News Posts</strong>
           <p>An autotagger model matches unstructured text queries to a reference set of strings, a.k.a. tags, which are known beforehand. The task is similar to fuzzy matching, but autotagging typically works with a fixed set of tags and treats the unstructured documents as the queries.</p>
           <p><a href="https://dato.com/learn/gallery/notebooks/autotagging_hacker_news_posts.html">Read the full story here</a></p>
           <div class="ClearFix"></div>
        </div>

Comments

User 1817 | 4/27/2015, 6:36:23 AM

storiessf = gl.load_sframe("s3://dato-datasets/hackernews/storieswithtext.sframe")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/graphlab/data_structures/sframe.py", line 230, in load_sframe
    sf = SFrame(data=filename)
  File "/Library/Python/2.7/site-packages/graphlab/data_structures/sframe.py", line 860, in __init__
    raise ValueError('Unknown input type: ' + format)
  File "/Library/Python/2.7/site-packages/graphlab/cython/context.py", line 39, in __exit__
    raise exc_type(exc_value)
KeyError: KeyError('No access key found. Please set the environment variable AWS_ACCESS_KEY_ID, or using graphlab.aws.set_credentials()',)

How can I solve this?


User 1178 | 4/27/2015, 9:41:38 PM

Hi Hao,

Since you need to download data from S3, valid AWS credentials are needed. You can set your AWS credentials with the following command inside your Python session:

gl.aws.set_credentials(<your_aws_key_id>, <your_aws_key_credentials>)

You can also set them through environment variables:

export AWS_ACCESS_KEY_ID='<your-aws-key-id>'
export AWS_SECRET_ACCESS_KEY='<your-aws-key-credentials>'
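
If you go the environment variable route, a quick check from Python can confirm the variables are visible before you retry the load. This is only a minimal sketch: it assumes the standard AWS variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and `missing_aws_vars` is a hypothetical helper written for this example, not part of GraphLab Create.

```python
import os

# Assumed variable names: the standard AWS environment variables.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(environ=None):
    """Return the required AWS variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the real process environment.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Not set in this session: " + ", ".join(missing))
    else:
        print("Both AWS credential variables are set.")
```

Note that variables exported in one shell are only visible to Python sessions started from that same shell afterwards, so restart Python after running the export commands.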

Thanks!


User 1178 | 4/28/2015, 4:55:56 PM

Hi Hao,

To add to the previous response, Dato provides read-only credentials that you can use to download the example dataset above. You can run the following in your Python session:

gl.aws.set_credentials('AKIAJMHKEZGY6YP24BXA', 'vf/miz2Zx7V7VkCai9ZeJR45ZSimqu6/W7qdRLmN')

Hope that helps!
