How to use the Hacker News API in Python



Hacker News has released an API. How can I use it in Python?

I want to get all the top stories. I tried using urllib, but I think I'm doing it wrong.

Here is my code:

import urllib2
response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty')
html = response.read()
print response.read()

It just prints an empty string:

''

I had left out a line; I've updated my code.

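As @jonrsharpe explained, read() is a one-time operation. So if you print html, you will get the list of all the IDs. To go through that list, you have to make a separate request for each ID to get its story.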

First, you have to convert the received data into a Python list and iterate over it.

import json
import urllib2

base_url = 'https://hacker-news.firebaseio.com/v0/item/{}.json?print=pretty'
top_story_ids = json.loads(html)  # html is the string read from topstories.json
for story in top_story_ids:
    response = urllib2.urlopen(base_url.format(story))
    print response.read()
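
The code above is Python 2 (urllib2 and the print statement). On Python 3, the same loop might look roughly like the sketch below. The function name fetch_top_stories, the limit parameter, and the opener argument are illustrative assumptions, not part of the Hacker News API or of the code above; opener exists only so the function can be tried with a stand-in instead of real network access.

```python
import json
from urllib.request import urlopen

TOP_URL = 'https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty'
ITEM_URL = 'https://hacker-news.firebaseio.com/v0/item/{}.json?print=pretty'


def fetch_top_stories(limit=5, opener=urlopen):
    """Return the first `limit` top stories as dicts (one request per story).

    `limit` and `opener` are hypothetical conveniences for this sketch.
    """
    # In Python 3, read() returns bytes, so decode before parsing the JSON.
    top_story_ids = json.loads(opener(TOP_URL).read().decode('utf-8'))
    stories = []
    for story_id in top_story_ids[:limit]:
        response = opener(ITEM_URL.format(story_id))
        stories.append(json.loads(response.read().decode('utf-8')))
    return stories
```

Calling fetch_top_stories(3) would make four requests: one for the ID list and one for each of the three stories.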

Instead of all this, you can use haxor, a Python wrapper for the Hacker News API. The following code fetches the IDs of all the top stories:

from hackernews import HackerNews
hn = HackerNews()
top_story_ids = hn.top_stories()
# >>> top_story_ids
# [8432709, 8432616, 8433237, ...]

You can then loop over them and print each one, for example:

for story in top_story_ids:
    print hn.get_item(story)

Disclaimer: I wrote haxor.

You should use

print html

rather than

print response.read()

Why? Because read is a one-time operation; once the response has been read, it cannot be read again:

>>> import urllib2
>>> response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty')
>>> response.read()
'[ 8445087, 8444739, 8444603, 8443981, 8444976, 8443902, 8444252, 8444634, 8444931, 8444272, 8444025, 8441939, 8444510, 8444640, 8443830, 8445076, 8443470, 8444785, 8443028, 8444077, 8444832, 8443841, 8443467, 8443309, 8443187, 8443896, 8444971, 8443360, 8444601, 8443287, 8441095, 8441681, 8441055, 8442712, 8444909, 8443621, 8442596, 8443836, 8442266, 8443298, 8445122, 8443096, 8441699, 8442119, 8442965, 8440486, 8442093, 8443393, 8442067, 8444989, 8440985, 8444622, 8438728, 8442555, 8444880, 8442004, 8443185, 8444370, 8436210, 8437671, 8439641, 8443727, 8441702, 8436309, 8441041, 8437367, 8422087, 8441711, 8438063, 8444212, 8439408, 8442049, 8440989, 8439367, 8438515, 8437403, 8435278, 8442486, 8442730, 8428522, 8438904, 8443450, 8432703, 8430412, 8422928, 8443635, 8439267, 8440191, 8439560, 8437230, 8442556, 8439977, 8444140, 8441682, 8443776, 8441209, 8428632, 8441388, 8422599, 8439547 ]\n'
>>> response.read()
''

In your case, though, you have already assigned the string returned by read to the name html, so you can still access it.
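
The read-once behaviour can be reproduced without any network access. The following is a minimal Python 3 sketch in which io.BytesIO stands in for the HTTP response (an assumption made purely for illustration; the real response object is a different type, but it is likewise a stream that is consumed as it is read):

```python
import io

# io.BytesIO is a stand-in for the HTTP response (illustrative assumption).
# Like the response, it is a stream with a current read position.
response = io.BytesIO(b'[ 8445087, 8444739 ]\n')

html = response.read()   # consumes the whole stream and keeps the data
again = response.read()  # the stream is exhausted, so this is empty

print(repr(html))
print(repr(again))
```

Keeping the result of the first read in a name such as html is what makes the data available afterwards.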


Once you have the story IDs, you can access each story via '.../v0/item/{item number}.json?print=pretty':

>>> response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/item/8445087.json?print=pretty')
>>> print response.read()
{
  "by" : "lalmachado",
  "id" : 8445087,
  "kids" : [ 8445205, 8445195, 8445173, 8445103 ],
  "score" : 21,
  "text" : "",
  "time" : 1413116430,
  "title" : "Show HN: Powerful ASCII art editor designed for the Mac",
  "type" : "story",
  "url" : "http://monodraw.helftone.com/"
}

Before going any further, you should read through the API documentation. The json module is also worth a look.
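
As a small illustration of what the json module offers here, the sketch below parses an abbreviated copy of the item response shown above into a Python dict. The variable names raw and item are arbitrary choices for this example.

```python
import json

# Abbreviated copy of the item response shown above.
raw = '''{
  "by" : "lalmachado",
  "id" : 8445087,
  "kids" : [ 8445205, 8445195, 8445173, 8445103 ],
  "score" : 21,
  "title" : "Show HN: Powerful ASCII art editor designed for the Mac",
  "type" : "story",
  "url" : "http://monodraw.helftone.com/"
}'''

# json.loads turns a JSON object into a dict and a JSON array into a list.
item = json.loads(raw)

print(item['title'])
print(len(item['kids']))  # number of top-level comment IDs in this copy
```

Once the response is a dict, individual fields such as the title or URL can be used directly instead of printing the raw text.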
