Storing data from a JSON file by adding tuples of data to a dictionary



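I have a JSON file containing many claims like the one below. I'm trying to loop through the file and, for each unique year in reviewDate, store every unique claimant in a counter that shows how often they appear.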

{
    "text": "“This president, though, for immigrants, there is nothing he will not do to separate a family, cage a child, or erase their existence by weaponizing the census.”",
    "claimant": "Eric Swalwell",
    "claimDate": "2019-06-27T00:00:00Z",
    "claimReview": [
        {
            "publisher": {
                "name": "PolitiFact",
                "site": "politifact.com"
            },
            "url": "https://www.politifact.com/article/2019/jun/28/fact-checking-2nd-night-democratic-debate-miami/",
            "title": "Fact-checking the 2nd night of the Democratic debate in Miami",
            "reviewDate": "2019-06-28T16:49:26Z",
            "textualRating": "Frequent attack needs context",
            "languageCode": "en"
        }
    ]
},

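This is my current script, but it just adds a new entry for every claim instead of finding the claimant in the dictionary and incrementing their counter.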

def split_by_year(data):
    year_dict = {}
    claimant_dict = {}
    counter = 0
    # for every claim in the file
    for claim in data['claims']:
        # placeholder for year & claimant
        year = ''
        claimant = ''
        if 'claimant' in claim:
            claimant = claim['claimant']
        # the reviewDate is in the review so we go into it
        for review in claim['claimReview']:
            # if the review date exists
            if 'reviewDate' in review.keys():
                # get the year
                year = review['reviewDate'][0:4]
                if year in year_dict:
                    # loop through to find the claimant
                    if claimant in year_dict[year]:
                        counter += 1
                        year_dict[year][1] += 1
                    else:
                        # claimant doesnt exist
                        year_dict[year].append([claimant, 1])
                else:
                    # year not in year_dict. Add w/ counter
                    year_dict[year] = [claimant, 1]

This is the current output:

'2019': ['Eric Swalwell',
3,
['Ted Budd', 1],
['Donald Trump', 1],
['Henry Cuellar', 1],
['Mike Pence', 1],
['Mike Pence', 1],
['Michael Bennet', 1],
['Facebook posts', 1],
['Donald Trump', 1],
['Mark Walker', 1],

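I can't figure out how to correctly add the claimants under each year's counter, and then check whether a claimant has already been added so that I can increment their count.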
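A likely reason the counts never increment: `claimant in year_dict[year]` only compares against the top-level elements of the list. The first claimant of a year is stored flat as `[claimant, 1]`, but every later claimant is appended as a nested `[claimant, 1]` pair, so the membership test is false for them and a new pair is appended each time. Meanwhile `year_dict[year][1] += 1` always bumps the first claimant's count. The sketch below reproduces the membership behaviour with made-up values shaped like the output above.

```python
# Hypothetical values shaped like the output above: the first claimant is
# stored flat, later claimants are stored as nested [name, count] pairs.
entries = ['Eric Swalwell', 3, ['Mike Pence', 1]]

# `in` on a list compares against top-level elements only.
found_flat = 'Eric Swalwell' in entries    # the name is a top-level element
found_nested = 'Mike Pence' in entries     # the name is hidden inside a sub-list

print(found_flat, found_nested)
```

A mapping from claimant to count avoids the problem, because a dictionary lookup finds the claimant directly by key.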

from collections import defaultdict, Counter

def split_by_year(data):
    # maps each year to a Counter of claimant -> number of reviews
    year_dict = defaultdict(Counter)

    # for every claim in the file
    for claim in data['claims']:
        if 'claimant' in claim:
            claimant = claim['claimant']
        else:
            continue  # skip this one, move to next claim
        for review in claim['claimReview']:
            if 'reviewDate' in review.keys():
                year = review['reviewDate'][0:4]
            else:
                continue  # skip this one, move to next claimReview
            year_dict[year][claimant] += 1
    return year_dict

result = split_by_year(data)
print(result['2019']["Eric Swalwell"])
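To see the shape of the result without the original file, here is a minimal self-contained sketch. The `sample` dictionary is invented for illustration and contains only the keys the function reads, and `count_claimants_by_year` is a hypothetical name for the same counting logic written with `dict.get`.

```python
from collections import Counter, defaultdict


def count_claimants_by_year(data):
    """Return {year: Counter({claimant: count})} for every dated review."""
    year_dict = defaultdict(Counter)
    for claim in data['claims']:
        claimant = claim.get('claimant')
        if claimant is None:
            continue  # no claimant recorded for this claim
        for review in claim.get('claimReview', []):
            review_date = review.get('reviewDate')
            if review_date is None:
                continue  # review without a date
            # the first four characters of the ISO timestamp are the year
            year_dict[review_date[:4]][claimant] += 1
    return year_dict


# Invented sample data; only the keys the function reads are present.
sample = {'claims': [
    {'claimant': 'Eric Swalwell', 'claimReview': [{'reviewDate': '2019-06-28T16:49:26Z'}]},
    {'claimant': 'Mike Pence', 'claimReview': [{'reviewDate': '2019-03-01T00:00:00Z'}]},
    {'claimant': 'Mike Pence', 'claimReview': [{'reviewDate': '2019-04-02T00:00:00Z'}]},
    {'claimant': 'Mike Pence', 'claimReview': [{'reviewDate': '2020-01-15T00:00:00Z'}]},
    {'claimReview': [{'reviewDate': '2019-05-05T00:00:00Z'}]},  # no claimant: skipped
]}

result = count_claimants_by_year(sample)
print(result['2019'].most_common())  # claimants in 2019, most frequent first
print(dict(result['2020']))
```

One thing to keep in mind: reading a missing year from a `defaultdict` creates an empty `Counter` for it, so use `year in result` when you only want to check whether a year is present.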
