我需要一些方法来逐行检查";小时;列,使得当df1";小时;数据跳过一个小时,这一行将在df2中删除。然后,最终删除额外行之后的df2的长度将与df1的长度相匹配。我试过使用isin,但它对我来说不起作用,可能是因为每一天的工作时间都在重复。我有两个df,df1和df2。df1看起来像这样:
plant_name wind_speed_obs hour day month year
0 BIG HORN I 4.354742 1 1 1 2018
1 BIG HORN I 4.493089 2 1 1 2018
2 BIG HORN I 3.270214 3 1 1 2018
3 BIG HORN I 2.201387 4 1 1 2018
4 BIG HORN I 1.107117 5 1 1 2018
5 BIG HORN I 0.653544 6 1 1 2018
6 BIG HORN I 0.437724 7 1 1 2018
7 BIG HORN I 1.039667 8 1 1 2018
8 BIG HORN I 0.859894 9 1 1 2018
9 BIG HORN I 0.984382 10 1 1 2018
10 BIG HORN I 0.867333 11 1 1 2018
11 BIG HORN I 0.651906 12 1 1 2018
12 BIG HORN I 0.707006 13 1 1 2018
13 BIG HORN I 0.794844 14 1 1 2018
14 BIG HORN I 0.808548 15 1 1 2018
15 BIG HORN I 0.631703 16 1 1 2018
16 BIG HORN I 0.662685 17 1 1 2018
17 BIG HORN I 0.792321 18 1 1 2018
18 BIG HORN I 0.996753 19 1 1 2018
19 BIG HORN I 1.177580 20 1 1 2018
20 BIG HORN I 1.608482 21 1 1 2018
21 BIG HORN I 1.964004 22 1 1 2018
22 BIG HORN I 1.695751 23 1 1 2018
24 BIG HORN I 2.244386 1 2 1 2018
25 BIG HORN I 3.111387 2 2 1 2018
df2看起来是这样的:
plant_name wind_speed_ms hour day month year
0 BIG HORN I 3.6 1 1 1 2018
1 BIG HORN I 3.1 2 1 1 2018
2 BIG HORN I 3.1 3 1 1 2018
3 BIG HORN I 2.0 4 1 1 2018
4 BIG HORN I 1.6 5 1 1 2018
5 BIG HORN I 0.8 6 1 1 2018
6 BIG HORN I 0.8 7 1 1 2018
7 BIG HORN I 1.0 8 1 1 2018
8 BIG HORN I 0.3 9 1 1 2018
9 BIG HORN I 0.1 10 1 1 2018
10 BIG HORN I 1.1 11 1 1 2018
11 BIG HORN I 1.9 12 1 1 2018
12 BIG HORN I 1.9 13 1 1 2018
13 BIG HORN I 1.0 14 1 1 2018
14 BIG HORN I 0.7 15 1 1 2018
15 BIG HORN I 2.1 16 1 1 2018
16 BIG HORN I 3.5 17 1 1 2018
17 BIG HORN I 2.1 18 1 1 2018
18 BIG HORN I 1.3 19 1 1 2018
19 BIG HORN I 2.3 20 1 1 2018
20 BIG HORN I 2.8 21 1 1 2018
21 BIG HORN I 3.0 22 1 1 2018
22 BIG HORN I 2.5 23 1 1 2018
23 BIG HORN I 2.2 0 2 1 2018
24 BIG HORN I 3.9 1 2 1 2018
25 BIG HORN I 4.3 2 2 1 2018
26 BIG HORN I 3.5 3 2 1 2018
在列"0"中找到不匹配的小时之后;小时;df2(参见上面的索引=23(;0";在df1中找不到的小时行,df2数据帧应该看起来像这样;0";已删除小时行:新df2:--谢谢!
plant_name wind_speed_ms hour day month year
0 BIG HORN I 3.6 1 1 1 2018
1 BIG HORN I 3.1 2 1 1 2018
2 BIG HORN I 3.1 3 1 1 2018
3 BIG HORN I 2.0 4 1 1 2018
4 BIG HORN I 1.6 5 1 1 2018
5 BIG HORN I 0.8 6 1 1 2018
6 BIG HORN I 0.8 7 1 1 2018
7 BIG HORN I 1.0 8 1 1 2018
8 BIG HORN I 0.3 9 1 1 2018
9 BIG HORN I 0.1 10 1 1 2018
10 BIG HORN I 1.1 11 1 1 2018
11 BIG HORN I 1.9 12 1 1 2018
12 BIG HORN I 1.9 13 1 1 2018
13 BIG HORN I 1.0 14 1 1 2018
14 BIG HORN I 0.7 15 1 1 2018
15 BIG HORN I 2.1 16 1 1 2018
16 BIG HORN I 3.5 17 1 1 2018
17 BIG HORN I 2.1 18 1 1 2018
18 BIG HORN I 1.3 19 1 1 2018
19 BIG HORN I 2.3 20 1 1 2018
20 BIG HORN I 2.8 21 1 1 2018
21 BIG HORN I 3.0 22 1 1 2018
22 BIG HORN I 2.5 23 1 1 2018
24 BIG HORN I 3.9 1 2 1 2018
25 BIG HORN I 4.3 2 2 1 2018
26 BIG HORN I 3.5 3 2 1 2018
对所有日期/时间列使用isin
:
df2 = df2[df2['hour'].isin(df1['hour']) &
df2['day'].isin(df1['day']) &
df2['month'].isin(df1['month']) &
df2['year'].isin(df1['year'])]
df2
Out[1]:
plant_name wind_speed_ms hour day month year
0 BIG HORN I 3.6 1 1 1 2018
1 BIG HORN I 3.1 2 1 1 2018
2 BIG HORN I 3.1 3 1 1 2018
3 BIG HORN I 2.0 4 1 1 2018
4 BIG HORN I 1.6 5 1 1 2018
5 BIG HORN I 0.8 6 1 1 2018
6 BIG HORN I 0.8 7 1 1 2018
7 BIG HORN I 1.0 8 1 1 2018
8 BIG HORN I 0.3 9 1 1 2018
9 BIG HORN I 0.1 10 1 1 2018
10 BIG HORN I 1.1 11 1 1 2018
11 BIG HORN I 1.9 12 1 1 2018
12 BIG HORN I 1.9 13 1 1 2018
13 BIG HORN I 1.0 14 1 1 2018
14 BIG HORN I 0.7 15 1 1 2018
15 BIG HORN I 2.1 16 1 1 2018
16 BIG HORN I 3.5 17 1 1 2018
17 BIG HORN I 2.1 18 1 1 2018
18 BIG HORN I 1.3 19 1 1 2018
19 BIG HORN I 2.3 20 1 1 2018
20 BIG HORN I 2.8 21 1 1 2018
21 BIG HORN I 3.0 22 1 1 2018
22 BIG HORN I 2.5 23 1 1 2018
24 BIG HORN I 3.9 1 2 1 2018 #row with index of 23 removed
25 BIG HORN I 4.3 2 2 1 2018
26 BIG HORN I 3.5 3 2 1 2018