如何从存储在词典列表中的Facebook评论中提取回复



我正在处理一个Facebook抓取,然而,我在处理对评论的回复时遇到了困难。

对于评论的集合,这是代码:

import pandas as pd
import facebook_scraper
post_ids = ['1014199301965488']
options = {"comments": True,
"reactors": True,
"allow_extra_requests": True,
}
cookies = "/content/cookies.txt" #it is necessary to generate a Facebook cookie file
replies = []
for post in facebook_scraper.get_posts(post_urls=post_ids, cookies=cookies, options=options):
for p in post['comments_full']:
replies.append(p)

基本上,每条评论都可以有多个答案。据我所知,每个答案都存储在字典列表中。以下是一些答复的例子。

[{'comment_id': '1014587065260045', 'comment_url': 'https://facebook.com/1014587065260045', 'commenter_id': '100002664042251', 'commenter_url': 'https://facebook.com/anderson.ritmoapoesia?fref=nf&rc=p&__tn__=R', 'commenter_name': 'Anderson Ritmoapoesia', 'commenter_meta': None, 'comment_text': 'Boa irmão!nTmj', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': 'https://scontent.xx.fbcdn.net/m1/v/t6/An_UvxJXg9tdnLU3Y5qjPi0200MLilhzPXUgxzGjQzUMaNcmjdZA6anyrngvkdub33NZzZhd51fpCAEzNHFhko5aKRFP5fS1w_lKwYrzcNLupv27.png?_nc_eui2=AeH0Z9O-PPSBg9l8FeLeTyUHMiCX3WNpzi0yIJfdY2nOLeM4yQsYnDi7Fo-bVaW2oRmOKEYPCsTFZnVoJbmO2yOH&ccb=10-5&oh=00_AT-4ep4a5bI4Gf173sbCjcAhS7gahF9vcYuM9GaQwJsI9g&oe=6301E8F9&_nc_sid=55e238', 'comment_reactors': [{'name': 'Marcio J J Tomaz', 'link': 'https://facebook.com/marcioroberto.rodriguestomaz?fref=pb', 'type': 'like'}], 'comment_reactions': {'like': 1}, 'comment_reaction_count': 1}]
[{'comment_id': '1014272461958172', 'comment_url': 'https://facebook.com/1014272461958172', 'commenter_id': '100009587231687', 'commenter_url': 'https://facebook.com/cassia.danyelle.94?fref=nf&rc=p&__tn__=R', 'commenter_name': 'Cassia Danyelle', 'commenter_meta': None, 'comment_text': 'Concordo!', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': None, 'comment_reactors': [], 'comment_reactions': None, 'comment_reaction_count': None}, {'comment_id': '1014275711957847', 'comment_url': 'https://facebook.com/1014275711957847', 'commenter_id': '1227694094', 'commenter_url': 'https://facebook.com/marcusvinicius.espiritosanto?fref=nf&rc=p&__tn__=R', 'commenter_name': 'Marcus Vinicius Espirito Santo', 'commenter_meta': None, 'comment_text': 'Concordo Marcão a única observação que faço é: a justiça deveria funcionar sempre dessa forma rápida e precisa, como neste caso.', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': 'https://scontent.xx.fbcdn.net/m1/v/t6/An_UvxJXg9tdnLU3Y5qjPi0200MLilhzPXUgxzGjQzUMaNcmjdZA6anyrngvkdub33NZzZhd51fpCAEzNHFhko5aKRFP5fS1w_lKwYrzcNLupv27.png?_nc_eui2=AeH0Z9O-PPSBg9l8FeLeTyUHMiCX3WNpzi0yIJfdY2nOLeM4yQsYnDi7Fo-bVaW2oRmOKEYPCsTFZnVoJbmO2yOH&ccb=10-5&oh=00_AT-4ep4a5bI4Gf173sbCjcAhS7gahF9vcYuM9GaQwJsI9g&oe=6301E8F9&_nc_sid=55e238', 'comment_reactors': [{'name': 'Marcos Alexandre de Souza', 'link': 'https://facebook.com/senseimarcos?fref=pb', 'type': 'like'}], 'comment_reactions': {'like': 1}, 'comment_reaction_count': 1}]
[{'comment_id': '1014367808615304', 'comment_url': 'https://facebook.com/1014367808615304', 'commenter_id': '100005145968202', 'commenter_url': 'https://facebook.com/flavioluis.schnurr?fref=nf&rc=p&__tn__=R', 'commenter_name': 'Flavio Luis Schnurr', 'commenter_meta': None, 'comment_text': 'E porque você não morre ! Quem apoia assassinos também é!', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': None, 'comment_reactors': [], 'comment_reactions': None, 'comment_reaction_count': None}]
[{'comment_id': '1014222638629821', 'comment_url': 'https://facebook.com/1014222638629821', 'commenter_id': '100009383732423', 'commenter_url': 'https://facebook.com/profile.php?id=100009383732423&fref=nf&rc=p&__tn__=R', 'commenter_name': 'Anerol Ahnuc', 'commenter_meta': None, 'comment_text': 'Hã?', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': 'https://scontent.xx.fbcdn.net/m1/v/t6/An_UvxJXg9tdnLU3Y5qjPi0200MLilhzPXUgxzGjQzUMaNcmjdZA6anyrngvkdub33NZzZhd51fpCAEzNHFhko5aKRFP5fS1w_lKwYrzcNLupv27.png?_nc_eui2=AeH0Z9O-PPSBg9l8FeLeTyUHMiCX3WNpzi0yIJfdY2nOLeM4yQsYnDi7Fo-bVaW2oRmOKEYPCsTFZnVoJbmO2yOH&ccb=10-5&oh=00_AT-4ep4a5bI4Gf173sbCjcAhS7gahF9vcYuM9GaQwJsI9g&oe=6301E8F9&_nc_sid=55e238', 'comment_reactors': [], 'comment_reactions': {'like': 1}, 'comment_reaction_count': 1}, {'comment_id': '1014236578628427', 'comment_url': 'https://facebook.com/1014236578628427', 'commenter_id': '100009383732423', 'commenter_url': 'https://facebook.com/profile.php?id=100009383732423&fref=nf&rc=p&__tn__=R', 'commenter_name': 'Anerol Ahnuc', 'commenter_meta': None, 'comment_text': 'Eu hein?', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': None, 'comment_reactors': [], 'comment_reactions': None, 'comment_reaction_count': None}]
[{'comment_id': '1014435731941845', 'comment_url': 'https://facebook.com/1014435731941845', 'commenter_id': '100003779689547', 'commenter_url': 'https://facebook.com/marcia.pimentel.5454?fref=nf&rc=p&__tn__=R', 'commenter_name': 'Márcia Pimentel', 'commenter_meta': None, 'comment_text': 'Não é que sejam defensores Marcondes Martins,sim,eles falam que ele era um ser humano que errou e que podia ter pago de outra maneira,e não com a morte,porque só quem tem direito de tirar a vida das pessoas é Aquele que nos deu... Jesus.', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': None, 'comment_reactors': [], 'comment_reactions': None, 'comment_reaction_count': None}, {'comment_id': '1014445965274155', 'comment_url': 'https://facebook.com/1014445965274155', 'commenter_id': '100000515531313', 'commenter_url': 'https://facebook.com/DJ.Marcondes.Martins?fref=nf&rc=p&__tn__=R', 'commenter_name': 'Marcondes Martins', 'commenter_meta': None, 'comment_text': 'Marcia Márcia Pimentel ta teoria é tudo bonitinho. Mas bandidos matam, estupram, humilham pessoas de bem e a justiça ainda protege esses vermes, a sociedade ja está cansada disso.', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': None, 'comment_reactors': [], 'comment_reactions': None, 'comment_reaction_count': None}]

根据上面的数据,我只需要"comment_text"的值,然而,我从未使用过这种类型的结构。是否可以提取"comment_text"中的每个出现项?

由于您使用的是字典列表,我会使用列表理解来循环列表中的项目,然后只从每个字典中提取我想要的关键字:

replies.append([reply['comment_text'] for reply in p])

的一个例子

p = [{'comment_id': '1014272461958172', 'comment_url': 'https://facebook.com/1014272461958172', 'commenter_id': '100009587231687', 'commenter_url': 'https://facebook.com/cassia.danyelle.94?fref=nf&rc=p&__tn__=R', 'commenter_name': 'Cassia Danyelle', 'commenter_meta': None, 'comment_text': 'Concordo!', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': None, 'comment_reactors': [], 'comment_reactions': None, 'comment_reaction_count': None}, {'comment_id': '1014275711957847', 'comment_url': 'https://facebook.com/1014275711957847', 'commenter_id': '1227694094', 'commenter_url': 'https://facebook.com/marcusvinicius.espiritosanto?fref=nf&rc=p&__tn__=R', 'commenter_name': 'Marcus Vinicius Espirito Santo', 'commenter_meta': None, 'comment_text': 'Concordo Marcão a única observação que faço é: a justiça deveria funcionar sempre dessa forma rápida e precisa, como neste caso.', 'comment_time': datetime.datetime(2015, 8, 17, 0, 0), 'comment_image': 'https://scontent.xx.fbcdn.net/m1/v/t6/An_UvxJXg9tdnLU3Y5qjPi0200MLilhzPXUgxzGjQzUMaNcmjdZA6anyrngvkdub33NZzZhd51fpCAEzNHFhko5aKRFP5fS1w_lKwYrzcNLupv27.png?_nc_eui2=AeH0Z9O-PPSBg9l8FeLeTyUHMiCX3WNpzi0yIJfdY2nOLeM4yQsYnDi7Fo-bVaW2oRmOKEYPCsTFZnVoJbmO2yOH&ccb=10-5&oh=00_AT-4ep4a5bI4Gf173sbCjcAhS7gahF9vcYuM9GaQwJsI9g&oe=6301E8F9&_nc_sid=55e238', 'comment_reactors': [{'name': 'Marcos Alexandre de Souza', 'link': 'https://facebook.com/senseimarcos?fref=pb', 'type': 'like'}], 'comment_reactions': {'like': 1}, 'comment_reaction_count': 1}]
print([reply['comment_text'] for reply in p]) # ['Concordo!', 'Concordo Marcão a única observação que faço é: a justiça deveria funcionar sempre dessa forma rápida e precisa, como neste caso.']

最新更新