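I'm using this unofficial API to retrieve the comments under a specific media. I modified the code slightly so that I don't have to change the media ID every time I want to get comments. My idea is basically to include a list of media like this: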
media_list = [media_id1, media_id2, ... ]
and pass it to a loop. My final output would be a text file like this:
media_id1
username1 comment1
username2 comment2
username3 comment3
media_id2
username1 comment1
...
This is how I modified the original code:
for i in medialist:
    comments = []
    while has_more_comments:
        _ = API.getMediaComments(i,max_id=max_id)
        #comments' page come from older to newer, lets preserve desc order in full list
        for c in reversed(API.LastJson['comments']):
            comments.append(c)
        has_more_comments = API.LastJson.get('has_more_comments',False)
        #evaluate stop conditions
        if count and len(comments)>=count:
            comments = comments[:count]
            #stop loop
            has_more_comments = False
            print "stopped by count"
        #next page
        if has_more_comments:
            max_id = API.LastJson.get('next_max_id','')
            time.sleep(2)
    for c in comments:
        username = c['user']['username']
        text = c['text']
        user = username.encode('utf-8')
        txt = text.encode('utf-8')
        print (i+"\n"+user+": "+txt+"\n")
My problem is that I only get comments from the first media_id in the list, and then I get empty lists for the other media:
1412361909683907264
[{u'status': u'Active', u'user_id': xxx, u'created_at_utc': xxx, u'created_at': xxx, u'bit_flags': 0, u'comment_like_count': 1, u'did_report_as_spam': False, u'user': {u'username': u'xxx', u'profile_pic_url': u'xxx', u'profile_pic_id': u'xxx', u'full_name': u'xxx', u'pk': xxx, u'is_verified': False, u'is_private': True}, u'content_type': u'comment', u'text': u'When you eat pasta remember me \U0001f602\U0001f602\U0001f602\U0001f602\U0001f44d\U0001f3fb\U0001f4aa\U0001f3fc', u'pk': xxx, u'type': 0, u'has_liked_comment': False}]
1412360153562726838
[]
1412342538912059069
[]
1412336815465111851
[]
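Where is the problem? I'm obviously not a programmer. I have little skill and experience with Python and only do this as a hobby, so if I made some obvious mistake I still wouldn't notice it. Thank you!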
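I think you need to set has_more_comments back to True after the first item in the media list. When the while loop finishes for the first media, has_more_comments is False, so the loop body never runs for the remaining items and comments stays empty: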
for i in medialist:
    comments = []
    has_more_comments = True
    while has_more_comments:
        ...
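To make the fix easy to try without network access, here is a minimal sketch of the full loop with the state reset per media. It is written for Python 3, whereas the question's code is Python 2. `FakeAPI` is a hypothetical stand-in that only imitates the two things the question's code relies on (`getMediaComments(media_id, max_id=...)` and the `LastJson` attribute); the real client may behave differently. The function names `fetch_all_comments` and `format_comments` are also made up for illustration. The sketch additionally resets `max_id`, on the assumption that a page cursor left over from the previous media should not be reused for the next one.

```python
import time


class FakeAPI(object):
    """Hypothetical stand-in for the unofficial client, for illustration only."""

    def __init__(self, pages_by_media):
        # pages_by_media: {media_id: [page, ...]}, each page being a list of comments
        self.pages_by_media = pages_by_media
        self.LastJson = {}

    def getMediaComments(self, media_id, max_id=''):
        pages = self.pages_by_media.get(media_id, [])
        index = int(max_id) if max_id else 0
        has_more = index + 1 < len(pages)
        self.LastJson = {
            'comments': pages[index] if index < len(pages) else [],
            'has_more_comments': has_more,
            'next_max_id': str(index + 1) if has_more else '',
        }
        return True


def fetch_all_comments(api, media_ids, count=None, delay=0):
    """Return {media_id: [comment, ...]}, resetting the loop state per media."""
    result = {}
    for media_id in media_ids:
        comments = []
        has_more_comments = True  # reset, otherwise the while loop is skipped
        max_id = ''               # reset, otherwise a stale page cursor is reused
        while has_more_comments:
            api.getMediaComments(media_id, max_id=max_id)
            # each page is reversed before appending, as in the question's code
            comments.extend(reversed(api.LastJson['comments']))
            has_more_comments = api.LastJson.get('has_more_comments', False)
            # stop early once enough comments have been collected
            if count and len(comments) >= count:
                comments = comments[:count]
                has_more_comments = False
            # move on to the next page
            if has_more_comments:
                max_id = api.LastJson.get('next_max_id', '')
                time.sleep(delay)
        result[media_id] = comments
    return result


def format_comments(comments_by_media, media_ids):
    """Build the text layout from the question: media id, then one comment per line."""
    lines = []
    for media_id in media_ids:
        lines.append(str(media_id))
        for c in comments_by_media[media_id]:
            lines.append(c['user']['username'] + ' ' + c['text'])
    return '\n'.join(lines)
```

With the real client you would pass your `API` object instead of `FakeAPI` and use a delay such as 2 seconds between pages, as in the question. The string returned by `format_comments` can then be written to a text file.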