Autotagging Hacker News Posts

User 2 | 4/27/2015, 6:36:22 AM

An autotagger model matches unstructured text queries to a reference set of strings, a.k.a. tags, which are known beforehand. The task is similar to fuzzy matching, but unlike fuzzy matching, autotagging is typically done with a fixed set of tags, and it treats the unstructured documents as the queries.

Read the full story here
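As a rough illustration of the idea only (this is not Dato's autotagger, just a toy sketch), the snippet below scores each tag in a fixed set by how many of its character trigrams appear in a document. The function names `char_ngrams` and `autotag`, the trigram size, and the 0.5 threshold are all assumptions made for this example.

```python
def char_ngrams(text, n=3):
    """Return the set of lowercase character n-grams in text."""
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def autotag(document, tags, threshold=0.5):
    """Score each fixed tag by the fraction of its n-grams found in the document."""
    doc_grams = char_ngrams(document)
    scores = {}
    for tag in tags:
        tag_grams = char_ngrams(tag)
        if not tag_grams:
            continue  # tag shorter than n characters, nothing to compare
        score = len(tag_grams & doc_grams) / float(len(tag_grams))
        if score >= threshold:
            scores[tag] = score
    # Best-matching tags first
    return sorted(scores.items(), key=lambda kv: -kv[1])


# The tags are fixed and known beforehand; the document is the query.
tags = ["machine learning", "databases", "javascript"]
post = "Show HN: a tiny machine-learning library written in JavaScript"
matches = autotag(post, tags)
```

Here the hyphenated "machine-learning" still matches the tag "machine learning" because most of its trigrams overlap, which is the fuzzy part of the matching.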


User 1817 | 4/27/2015, 6:36:23 AM

storiessf = gl.load_sframe("s3://dato-datasets/hackernews/storieswithtext.sframe")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/graphlab/datastructures/", line 230, in load_sframe
    sf = SFrame(data=filename)
  File "/Library/Python/2.7/site-packages/graphlab/datastructures/", line 860, in __init__
    raise ValueError('Unknown input type: ' + format)
  File "/Library/Python/2.7/site-packages/graphlab/cython/", line 39, in __exit__
    raise exctype(excvalue)
KeyError: KeyError('No access key found. Please set the environment variable AWS_ACCESS_KEY_ID, or using',)

How can I solve this?

User 1178 | 4/27/2015, 9:41:38 PM

Hi Hao,

Since you need to download data from S3, valid AWS credentials are needed. You can set your AWS credentials with the following command inside your Python session: <yourawskeyid>, <yourawskey_credentials>)

You can also set them through environment variables:

export AWS_ACCESS_KEY_ID='<your-aws-key-id>'
export AWS_SECRET_ACCESS_KEY='<your-aws-key-credentials>'
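If you would rather not touch your shell profile, the same two environment variables can be set from inside Python before loading the data. This is only a minimal sketch: the helper name `set_aws_env` is made up for this example, the placeholder strings must be replaced with your own keys, and it assumes the library reads the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` variables at load time, as the error message suggests.

```python
import os


def set_aws_env(key_id, secret_key, env=os.environ):
    """Store AWS credentials in the given environment mapping (hypothetical helper)."""
    env["AWS_ACCESS_KEY_ID"] = key_id
    env["AWS_SECRET_ACCESS_KEY"] = secret_key
    return env


# Placeholders only; substitute your own credentials.
set_aws_env("<your-aws-key-id>", "<your-aws-key-credentials>")

# With real credentials in place, the original call can then be retried:
# import graphlab as gl
# storiessf = gl.load_sframe("s3://dato-datasets/hackernews/storieswithtext.sframe")
```

Variables set this way last only for the current Python process, so they need to be set again in each new session.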


User 1178 | 4/28/2015, 4:55:56 PM

Hi Hao,

To add to the previous response, Dato provides read-only credentials that you can use to download the example dataset above. You can do the following in your Python session: 'AKIAJMHKEZGY6YP24BXA', 'vf/miz2Zx7V7VkCai9ZeJR45ZSimqu6/W7qdRLmN')

Hope that helps!