如下图所示,有两个数据帧df1和df2。
df1说明每个列名的数量数据。因此,每一列将包含从Floor到Under的数量。
Column name Floor Under ConcType ConcTotal RebarTotal SteelTotal
0 CB1 ACP GF C80 21.6 7954.8 2265.0
1 CB1A B4F LGF C60 1.2 318.8 0.0
...
201 CT12B GF 8F C60 4.5 757.6 840.0
202 CT13A GF 8F C80 1.4 429.3 420.0
df2为隔层充填量数据表。但是df2在fl上不同,name_h (column) mean将包含df1中承载该Floor的所有列的数量。
H表示承载该层的列的高度。按标高楼层-前一楼层标高计算。
例如F8层H = 48400 - 42800 = 5600
Fl.Name_Elev Elevation Fl.Name_h H C60_Conc C80_Conc Rebar Steel
0 8F 48400 8F 5600.0 NaN NaN NaN NaN
1 7F 42800 7F 6000.0 NaN NaN NaN NaN
2 6F 36800 6F 6000.0 NaN NaN NaN NaN
3 5F 30800 5F 5600.0 NaN NaN NaN NaN
4 4F 25200 4F 5600.0 NaN NaN NaN NaN
5 3F 19600 3F 5600.0 NaN NaN NaN NaN
6 2F 14000 2F 6000.0 NaN NaN NaN NaN
7 1F 8000 1F 6000.0 NaN NaN NaN NaN
8 GF 2000 GF 3300.0 NaN NaN NaN NaN
9 LGF MZ -1300 LGF MZ 3150.0 NaN NaN NaN NaN
10 LGF -4450 LGF 4650.0 NaN NaN NaN NaN
11 B1F -9100 B1F 3000.0 NaN NaN NaN NaN
12 B2F -12100 B2F 3000.0 NaN NaN NaN NaN
13 B3F -15100 B3F 4650.0 NaN NaN NaN NaN
14 B4F -19750 B4F 8250.0 NaN NaN NaN NaN
15 ACP -28000 NaN NaN NaN NaN NaN NaN
我想从df1填充到NaN的"C60_Conc", "C80_Conc","钢筋","钢"在df2。
结果如下:
Fl.Name_Elev Elevation Fl.Name_h H C60_Conc C80_Conc Rebar Steel
0 8F 48400 8F 5600.0 4.5 1.4 1186.9 1260
1 7F 42800 7F 6000.0 4.5 1.4 1186.9 1260
2 6F 36800 6F 6000.0 4.5 1.4 1186.9 1260
3 5F 30800 5F 5600.0 4.5 1.4 1186.9 1260
4 4F 25200 4F 5600.0 4.5 1.4 1186.9 1260
5 3F 19600 3F 5600.0 4.5 1.4 1186.9 1260
6 2F 14000 2F 6000.0 4.5 1.4 1186.9 1260
7 1F 8000 1F 6000.0 4.5 1.4 1186.9 1260
8 GF 2000 GF 3300.0 0 21.6 7954.8 2265
9 LGF MZ -1300 LGF MZ 3150.0 0 21.6 7954.8 2265
10 LGF -4450 LGF 4650.0 1.2 21.6 8276.6 2265
11 B1F -9100 B1F 3000.0 1.2 21.6 8276.6 2265
12 B2F -12100 B2F 3000.0 1.2 21.6 8276.6 2265
13 B3F -15100 B3F 4650.0 1.2 21.6 8276.6 2265
14 B4F -19750 B4F 8250.0 0 21.6 7954.8 2265
15 ACP -28000 NaN NaN NaN NaN NaN NaN
注:抱歉我的沟通。英语不是我的母语。
解决方案:
floors = df2["Fl.Name_Elev"].tolist()[::-1]
df1["Floors"] = df1.apply(lambda x: floors[floors.index(x["Floor"]):floors.index(x["Under"])+1], axis=1)
df1 = df1.explode(column=["Floors"])
mapper = df1.pivot(index=df1.columns.difference(["ConcType", "ConcTotal"]),
columns="ConcType",
values="ConcTotal").reset_index().groupby("Floors").sum()
output = df2.merge(mapper, left_on="Fl.Name_Elev", right_index=True, how="left")
output = output.rename(columns={"C60": "C60_Conc", "C80": "C80_Conc"})
>>> output
Fl.Name_Elev Elevation Fl.Name_h ... SteelTotal C60_Conc C80_Conc
0 8F 48400 8F ... 1260 4.5 1.4
1 7F 42800 7F ... 1260 4.5 1.4
2 6F 36800 6F ... 1260 4.5 1.4
3 5F 30800 5F ... 1260 4.5 1.4
4 4F 25200 4F ... 1260 4.5 1.4
5 3F 19600 3F ... 1260 4.5 1.4
6 2F 14000 2F ... 1260 4.5 1.4
7 1F 8000 1F ... 1260 4.5 1.4
8 GF 2000 GF ... 3525 4.5 23.0
9 LGF MZ -1300 LGF MZ ... 2265 0.0 21.6
10 LGF -4450 LGF ... 2265 1.2 21.6
11 B1F -9100 B1F ... 2265 1.2 21.6
12 B2F -12100 B2F ... 2265 1.2 21.6
13 B3F -15100 B3F ... 2265 1.2 21.6
14 B4F -19750 B4F ... 2265 1.2 21.6
15 ACP -28000 NaN ... 2265 0.0 21.6
解释首先,我们使用pandas.explode
包含df1中从Floor到Under的所有楼层:
floors = df2["Fl.Name_Elev"].tolist()[::-1]
df1["Floors"] = df1.apply(lambda x: floors[floors.index(x["Floor"]):floors.index(x["Under"])+1], axis=1)
df1 = df1.explode(column=["Floors"])
>>> df1.head()
Column name Floor Under ConcType ConcTotal RebarTotal SteelTotal Floors
0 CB1 ACP GF C80 21.6 7954.8 2265 ACP
0 CB1 ACP GF C80 21.6 7954.8 2265 B4F
0 CB1 ACP GF C80 21.6 7954.8 2265 B3F
0 CB1 ACP GF C80 21.6 7954.8 2265 B2F
0 CB1 ACP GF C80 21.6 7954.8 2265 B1F
0 CB1 ACP GF C80 21.6 7954.8 2265 LGF
0 CB1 ACP GF C80 21.6 7954.8 2265 LGF MZ
0 CB1 ACP GF C80 21.6 7954.8 2265 GF
1 CB1A B4F LGF C60 1.2 318.8 0 B4F
1 CB1A B4F LGF C60 1.2 318.8 0 B3F
然后,我们通过使用pandas.pivot
来创建一个映射器,以获得C60和C80的单独列,并使用pandas.groupby
和sum
来获得每个楼层的总数:
mapper = df1.pivot(index=df1.columns.difference(["ConcType", "ConcTotal"]),
columns="ConcType",
values="ConcTotal")
.reset_index()
.groupby("Floor").sum()
>>> mapper
ConcType RebarTotal SteelTotal C60 C80
Floors
1F 1186.9 1260 4.5 1.4
2F 1186.9 1260 4.5 1.4
3F 1186.9 1260 4.5 1.4
4F 1186.9 1260 4.5 1.4
5F 1186.9 1260 4.5 1.4
6F 1186.9 1260 4.5 1.4
7F 1186.9 1260 4.5 1.4
8F 1186.9 1260 4.5 1.4
ACP 7954.8 2265 0.0 21.6
B1F 8273.6 2265 1.2 21.6
B2F 8273.6 2265 1.2 21.6
B3F 8273.6 2265 1.2 21.6
B4F 8273.6 2265 1.2 21.6
GF 9141.7 3525 4.5 23.0
LGF 8273.6 2265 1.2 21.6
LGF MZ 7954.8 2265 0.0 21.6
最后,我们在df2和mapper上使用pandas.merge
得到输出:
output = df2.merge(mapper, left_on="Fl.Name_Elev", right_index=True, how="left")
output = output.rename(columns={"C60": "C60_Conc", "C80": "C80_Conc"})