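I'm trying to create a PRAW scraper that can pull the comments from a list of sub_ids. It only returns the comment data for the last sub_id.
I'm guessing I must be overwriting something. I've looked through other questions, but because I'm using PRAW, which has its own specific parameters, I can't figure out what could or should be replaced.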
sub_ids = ["2ypash", "7ocvlb", "7okxkf"]
for sub_id in sub_ids:
    submission = reddit.submission(id=sub_id)
    submission.comments.replace_more(limit=None, threshold=0)

comments = submission.comments.list()

commentlist = []
for comment in comments:
    commentsdata = {}
    commentsdata["id"] = comment.id
    commentsdata["subreddit"] = str(submission.subreddit)
    commentsdata["thread"] = str(submission.title)
    commentsdata["author"] = str(comment.author)
    commentsdata["body"] = str(comment.body)
    commentsdata["score"] = comment.score
    commentsdata["created_utc"] = datetime.datetime.fromtimestamp(comment.created_utc)
    commentsdata["parent_id"] = comment.parent_id
    commentlist.append(commentsdata)
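Indentation is your downfall. Your code fails because comments is only assigned after the loop over sub_ids has finished. So by the time you iterate over comments, it holds only the last sub_id's comments.

First, move commentlist = [] out in front of both for loops (so that it comes right after the sub_ids = [...] line).

Second, everything from comments = submission.comments.list() (inclusive) onward needs to be indented so that it runs inside the sub_ids iteration.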
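The same effect can be reproduced without PRAW or network access. The following is only a minimal sketch: `fake_threads` is a made-up dictionary standing in for Reddit data, and the names `buggy` and `fixed` are purely illustrative.

```python
# Hypothetical stand-in data: thread id -> list of comments (not real Reddit content).
fake_threads = {
    "a": ["a1", "a2"],
    "b": ["b1"],
    "c": ["c1", "c2", "c3"],
}

# Buggy shape: the second loop sits outside the first one.
for thread_id in fake_threads:
    comments = fake_threads[thread_id]  # rebound on every pass

buggy = []
for comment in comments:  # only sees the value left over from the last pass
    buggy.append(comment)

# Fixed shape: the list is created once up front and the inner loop is nested.
fixed = []
for thread_id in fake_threads:
    comments = fake_threads[thread_id]
    for comment in comments:
        fixed.append(comment)

print(buggy)  # ['c1', 'c2', 'c3']
print(fixed)  # ['a1', 'a2', 'b1', 'c1', 'c2', 'c3']
```

Nothing is "overwritten" in the result list itself; the name comments is simply rebound on each pass, and only its final value is ever read by the outer-level loop.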
Here's what it should look like in the end:
import datetime

# Assumes `reddit` is an already authenticated praw.Reddit instance.
sub_ids = ["2ypash", "7ocvlb", "7okxkf"]
commentlist = []  # created once, before any loop, so it is never reset
for sub_id in sub_ids:
    submission = reddit.submission(id=sub_id)
    submission.comments.replace_more(limit=None, threshold=0)
    comments = submission.comments.list()
    # Nested loop: runs once per submission, not just for the last one
    for comment in comments:
        commentsdata = {}
        commentsdata["id"] = comment.id
        commentsdata["subreddit"] = str(submission.subreddit)
        commentsdata["thread"] = str(submission.title)
        commentsdata["author"] = str(comment.author)
        commentsdata["body"] = str(comment.body)
        commentsdata["score"] = comment.score
        commentsdata["created_utc"] = datetime.datetime.fromtimestamp(comment.created_utc)
        commentsdata["parent_id"] = comment.parent_id
        commentlist.append(commentsdata)
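If you want to make this kind of indentation slip harder to repeat, one option is to wrap the loop in a function so that the result list and both loops live in a single scope. This is only a sketch: `collect_comments` is a hypothetical helper name, `reddit` is assumed to be an authenticated praw.Reddit instance created elsewhere, and passing `tz=datetime.timezone.utc` is my own choice to keep the timestamp in UTC rather than local time (the code above uses local time).

```python
import datetime


def collect_comments(reddit, sub_ids):
    """Return one dict per comment across every submission in sub_ids.

    `reddit` is assumed to be an authenticated praw.Reddit instance;
    creating it is not shown here.
    """
    rows = []  # created once, so nothing is reset between submissions
    for sub_id in sub_ids:
        submission = reddit.submission(id=sub_id)
        # Expand every "load more comments" placeholder before flattening
        submission.comments.replace_more(limit=None, threshold=0)
        for comment in submission.comments.list():
            rows.append({
                "id": comment.id,
                "subreddit": str(submission.subreddit),
                "thread": str(submission.title),
                "author": str(comment.author),
                "body": str(comment.body),
                "score": comment.score,
                # created_utc is a Unix timestamp; tz=utc keeps the result in UTC
                "created_utc": datetime.datetime.fromtimestamp(
                    comment.created_utc, tz=datetime.timezone.utc
                ),
                "parent_id": comment.parent_id,
            })
    return rows
```

It would then be called as commentlist = collect_comments(reddit, sub_ids).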