我的两个数据帧:
wetter
Out[223]:
level_0 index TEMPERATURE:TOTAL SLP HOUR Time
0 0 2018-01-01 00:00:00 9.8 NaN 00 00:00
1 1 2018-01-01 01:00:00 9.8 NaN 01 01:00
2 2 2018-01-01 02:00:00 9.2 NaN 02 02:00
3 3 2018-01-01 03:00:00 8.4 NaN 03 03:00
4 4 2018-01-01 04:00:00 8.5 NaN 04 04:00
... ... ... ... ... ...
49034 49034 2018-12-31 22:40:00 8.5 NaN 22 22:40
49035 49035 2018-12-31 22:45:00 8.4 NaN 22 22:45
49036 49036 2018-12-31 22:50:00 8.4 NaN 22 22:50
49037 49037 2018-12-31 22:55:00 8.4 NaN 22 22:55
49038 49038 2018-12-31 23:00:00 8.4 NaN 23 23:00
[49039 rows x 6 columns]
df
Out[224]:
0 Time -14 -13 ... 17 18 NaN
1 00:00 1,256326635 1,218256131 ... 0,080348715 0,040194189 00:15
2 00:15 1,256564788 1,218487067 ... 0,080254367 0,039517006 00:30
3 00:30 1,260350982 1,222158528 ... 0,080219518 0,039054261 00:45
4 00:45 1,259306606 1,221145800 ... 0,080758578 0,039176953 01:00
5 01:00 1,258521518 1,220384502 ... 0,080444585 0,038164953 01:15
.. ... ... ... ... ... ... ...
92 22:45 1,253545107 1,215558891 ... 0,080164570 0,042697436 23:00
93 23:00 1,241253483 1,203639741 ... 0,078395829 0,039685235 23:15
94 23:15 1,242890274 1,205226933 ... 0,078801415 0,039170364 23:30
95 23:30 1,240459118 1,202869448 ... 0,079511294 0,039013684 23:45
96 23:45 1,236228281 1,198766818 ... 0,079186806 0,037570494 00:00
[96 rows x 35 columns]
我想在TEMPERATURE:TOTAL
和Time
的基础上填充wetter
的SLP
列。为此,我想查看df
数据帧,并根据df
的列填充SLP,其中标题是温度。
因此,对于00:00
处9.8
的第一个TEMPERATURE:TOTAL
,SLP
应填充Time
的00:00行中简称为9
的列的值
我尝试过这样做,这就是为什么我也创建了时间列,但我被卡住了。我想到了一些嵌套的循环,但我知道一些熊猫,我想可能有一个双线解决方案?
这里有一种方法!
import numpy as np
import pandas as pd
这是我模拟你的数据帧(你可以跳过这一步(-下次请提供它们。
wetter = pd.DataFrame()
df = pd.DataFrame()
wetter['TEMPERATURE:TOTAL'] = np.random.rand(10) * 10
wetter['SLP'] = np.nan * 10
wetter['Time'] = pd.date_range("00:00", periods=10, freq="H")
df['Time'] = pd.date_range("00:00", periods=10, freq="15T")
for i in range(-14, 18):
df[i] = np.random.rand(10)
预处理:
wetter['temp'] = np.floor(wetter['TEMPERATURE:TOTAL'])
wetter = wetter.astype({'temp': 'int'})
wetter.set_index('Time')
df.set_index('Time')
df = df.reset_index()
value_vars_ = list(range(-14, 18))
df_long = pd.melt(df, id_vars='Time', value_vars=value_vars_, var_name='temp', value_name="SLP")
左连接Time
和temp
上的两个数据帧:
final = pd.merge(wetter.drop('SLP', axis=1), df_long, how="left", on=["Time", "temp"])