I have 4 DataFrames of COVID case data, and I want to concatenate them so I can plot them:
df1:
Date active
0 March 29
1 April 3332
2 May 8257
3 June 5912
4 July 11418
5 August 11292
6 September 4386
7 October 1024
8 November 1883
9 December 1934
10 January 1653
11 February 255
df2:
Date cases
0 March 6
1 April 241
2 May 637
3 June 671
4 July 1512
5 August 1304
6 September 271
7 October 72
8 November 182
9 December 152
10 January 68
11 February 14
df3:
Date deaths
0 April 1
1 May 2
2 June 14
3 July 29
4 August 13
5 September 10
6 October 9
7 November 2
8 December 3
9 January 3
df4:
Date recovories
0 April 43
1 May 652
2 June 704
3 July 1239
4 August 1259
5 September 632
6 October 69
7 November 150
8 December 148
9 January 78
10 February 16
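When I concatenate them, I expect 5 columns (Date, cases, active, deaths, recoveries) and 11 rows, but this is what I get instead (the Date column repeats for every DataFrame):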
Date active Date cases Date deaths Date recovories
0 March 29 March 6 April 1.0 April 43.0
1 April 3332 April 241 May 2.0 May 652.0
2 May 8257 May 637 June 14.0 June 704.0
3 June 5912 June 671 July 29.0 July 1239.0
4 July 11418 July 1512 August 13.0 August 1259.0
5 August 11292 August 1304 September 10.0 September 632.0
6 September 4386 September 271 October 9.0 October 69.0
7 October 1024 October 72 November 2.0 November 150.0
8 November 1883 November 182 December 3.0 December 148.0
9 December 1934 December 152 January 3.0 January 78.0
10 January 1653 January 68 0 0.0 February 16.0
11 February 255 February 14 0 0.0 0 0.0
How can I prevent this? Here is the code:
all= [df1, df2, df3, df4]
df_new = pd.concat(all, axis=1)
df_new = df_new.fillna(0)
Info: Windows 10, Python 3.9.1, beginner.
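First set the Date column of each DataFrame as its index, so that pd.concat aligns the rows by month label instead of by row position. (This gives an ordinary index of month names, not a DatetimeIndex.)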
dfs = [df1, df2, df3, df4]
df_new = pd.concat([x.set_index('Date') for x in dfs], axis=1).fillna(0)
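To see the alignment end to end, here is a minimal sketch that uses only the first three months of the data from the question, trimmed for brevity. The explicit reindex, the integer cast, and the final reset_index are my additions on top of the one-liner above, on the assumption that you want the original month order, whole-number counts, and Date back as a regular column.

```python
import pandas as pd

# First three months of the question's data, trimmed for brevity.
df1 = pd.DataFrame({"Date": ["March", "April", "May"], "active": [29, 3332, 8257]})
df2 = pd.DataFrame({"Date": ["March", "April", "May"], "cases": [6, 241, 637]})
df3 = pd.DataFrame({"Date": ["April", "May"], "deaths": [1, 2]})
df4 = pd.DataFrame({"Date": ["April", "May"], "recovories": [43, 652]})

dfs = [df1, df2, df3, df4]

# With Date as the index, concat aligns rows by month label, not by position.
df_new = pd.concat([x.set_index("Date") for x in dfs], axis=1)

# Put the rows in df1's month order, fill missing months with 0,
# and cast back to integers (NaN had forced some columns to float).
df_new = df_new.reindex(df1["Date"].tolist()).fillna(0).astype(int)

# Turn Date back into an ordinary column: Date, active, cases, deaths, recovories.
df_new = df_new.rename_axis("Date").reset_index()
print(df_new)
```

March is absent from df3 and df4, so its deaths and recovories end up as 0 rather than shifting the later months up a row. If matplotlib is installed, df_new.set_index("Date").plot() should then draw one line per column.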