(About) 425 days ago (at the time of this writing) I started scraping Hacker News via its shiny new API. And then I promptly forgot about it. That is, until I noticed my cronjob had been throwing errors constantly for a few weeks:
Traceback (most recent call last):
File "/home/dummy/projects/hn-cron/hn.py", line 62, in <module>
main()
File "/home/dummy/projects/hn-cron/hn.py", line 53, in main
log_line = str(details['id']) + "\t" + details['title'] + "\t" + details['url'] + "\t" + str(details['score']) + "\n"
KeyError: 'url'Instead of fixing anything, I just commented out the cronjob. But now I feel somewhat obligated to do at least a rudimentary analysis of this data. In keeping with my extreme negligence/laziness throughout this project, I hacked together a few bash commands to do just that.
