I have four columns of data that I am trying to consolidate based on two conditions. The data is formatted as follows:
CountyName Year Oil Gas
ANDERSON 2010 1358 0
ANDERSON 2010 621746 4996766
ANDERSON 2011 1587 0
ANDERSON 2011 633120 5020877
ANDERSON 2012 55992 387685
ANDERSON 2012 1342 0
ANDERSON 2013 635572 3036578
ANDERSON 2013 4873 0
ANDERSON 2014 656440 2690333
ANDERSON 2014 12332 0
ANDERSON 2015 608454 2836272
ANDERSON 2015 23339 0
ANDERSON 2016 551728 2682261
ANDERSON 2016 12716 0
ANDERSON 2017 132466 567874
ANDERSON 2017 1709 0
ANDREWS 2010 25701725 1860063
ANDREWS 2010 106351 0
ANDREWS 2011 97772 0
ANDREWS 2011 28818329 1377865
ANDREWS 2012 105062 0
...
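I want to combine the oil and gas values of the duplicated entries. For example, I would like to add up all the oil entries for Anderson County in 2010 and have that total replace the existing entries in a single row. The code I am using now sums all the values for each county regardless of year, which gives me a condensed output like this: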
CountyName Year Oil Gas
ANDERSON 3954774
ANDREWS 206472698
...
Here is the code I am using:
import csv
with open('Texas.csv', 'r') as Texas: #opening Texas csv file
    TexasReader = csv.reader(Texas)
    counties = {}
    years = {}
    index = 0 and 1
    for row in TexasReader:
        if index == 0 and 1:
            header = row
        else:
            county = row[0]
            year = row[1]
            oil = row[2]
            gas = row[3]
            if county in counties:
                counties[county] += int(oil)
            else:
                counties[county] = int(oil)
        index += 1
with open('TexasConsolidated.csv', 'w') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=header, delimiter=',', lineterminator='\n')
    writer.writeheader()
    for k, v in counties.items():
        writer.writerow({header[0]: k, header[2]: v})
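These are the lines doing what you are complaining about:
if county in counties:
    counties[county] += int(oil)
The key is only the county, so every year's value for that county lands in the same total. If you want a dict that stores sums over two keys, both values need to be part of the dict key.
Add the line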
counties_years = {}
and then sum like this, using the tuple (county, year) as the key:
if (county, year) in counties_years:
    counties_years[(county, year)] += int(oil)
else:
    counties_years[(county, year)] = int(oil)
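Putting it together, here is a minimal sketch of the whole consolidation, summing both Oil and Gas per (county, year). It assumes the file is comma-delimited with the header CountyName, Year, Oil, Gas and that Oil and Gas are always integers. The function names consolidate and consolidate_csv are made up for illustration, and defaultdict is swapped in for the explicit if/else above; the behaviour is the same.

```python
import csv
import io
from collections import defaultdict


def consolidate(rows):
    """Sum Oil and Gas per (county, year).

    rows: iterable of [CountyName, Year, Oil, Gas] lists (assumed layout).
    """
    totals = defaultdict(lambda: [0, 0])  # (county, year) -> [oil, gas]
    for row in rows:
        if not row:  # csv.reader yields [] for blank lines
            continue
        county, year, oil, gas = row
        entry = totals[(county, year)]
        entry[0] += int(oil)
        entry[1] += int(gas)
    return totals


def consolidate_csv(src, dst):
    """Read rows from file object src, write one row per (county, year) to dst."""
    reader = csv.reader(src)
    header = next(reader)  # first row is the header
    totals = consolidate(reader)
    writer = csv.writer(dst, lineterminator='\n')
    writer.writerow(header)
    for (county, year), (oil, gas) in sorted(totals.items()):
        writer.writerow([county, year, oil, gas])


# Small in-memory demo using the first rows from the question.
sample = io.StringIO(
    "CountyName,Year,Oil,Gas\n"
    "ANDERSON,2010,1358,0\n"
    "ANDERSON,2010,621746,4996766\n"
    "ANDERSON,2011,1587,0\n"
    "ANDERSON,2011,633120,5020877\n"
)
result = io.StringIO()
consolidate_csv(sample, result)
print(result.getvalue())
# CountyName,Year,Oil,Gas
# ANDERSON,2010,623104,4996766
# ANDERSON,2011,634707,5020877
```

For the real files, open 'Texas.csv' for reading and 'TexasConsolidated.csv' for writing (both with newline='') and pass the two file objects to consolidate_csv.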