Pandas: find mismatched rows and delete the extra rows with unmatched hours



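I need some way to check the "hour" column row by row, so that whenever the df1 "hour" data skips an hour, the corresponding row is deleted from df2. After the extra rows are removed, the length of df2 should match the length of df1. I tried using isin, but it did not work for me, probably because the hours repeat every day. I have two DataFrames, df1 and df2. df1 looks like this: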

    plant_name  wind_speed_obs  hour  day  month  year
0   BIG HORN I        4.354742     1    1      1  2018
1   BIG HORN I        4.493089     2    1      1  2018
2   BIG HORN I        3.270214     3    1      1  2018
3   BIG HORN I        2.201387     4    1      1  2018
4   BIG HORN I        1.107117     5    1      1  2018
5   BIG HORN I        0.653544     6    1      1  2018
6   BIG HORN I        0.437724     7    1      1  2018
7   BIG HORN I        1.039667     8    1      1  2018
8   BIG HORN I        0.859894     9    1      1  2018
9   BIG HORN I        0.984382    10    1      1  2018
10  BIG HORN I        0.867333    11    1      1  2018
11  BIG HORN I        0.651906    12    1      1  2018
12  BIG HORN I        0.707006    13    1      1  2018
13  BIG HORN I        0.794844    14    1      1  2018
14  BIG HORN I        0.808548    15    1      1  2018
15  BIG HORN I        0.631703    16    1      1  2018
16  BIG HORN I        0.662685    17    1      1  2018
17  BIG HORN I        0.792321    18    1      1  2018
18  BIG HORN I        0.996753    19    1      1  2018
19  BIG HORN I        1.177580    20    1      1  2018
20  BIG HORN I        1.608482    21    1      1  2018
21  BIG HORN I        1.964004    22    1      1  2018
22  BIG HORN I        1.695751    23    1      1  2018
24  BIG HORN I        2.244386     1    2      1  2018
25  BIG HORN I        3.111387     2    2      1  2018

df2 looks like this:

    plant_name  wind_speed_ms  hour  day  month  year
0   BIG HORN I            3.6     1    1      1  2018
1   BIG HORN I            3.1     2    1      1  2018
2   BIG HORN I            3.1     3    1      1  2018
3   BIG HORN I            2.0     4    1      1  2018
4   BIG HORN I            1.6     5    1      1  2018
5   BIG HORN I            0.8     6    1      1  2018
6   BIG HORN I            0.8     7    1      1  2018
7   BIG HORN I            1.0     8    1      1  2018
8   BIG HORN I            0.3     9    1      1  2018
9   BIG HORN I            0.1    10    1      1  2018
10  BIG HORN I            1.1    11    1      1  2018
11  BIG HORN I            1.9    12    1      1  2018
12  BIG HORN I            1.9    13    1      1  2018
13  BIG HORN I            1.0    14    1      1  2018
14  BIG HORN I            0.7    15    1      1  2018
15  BIG HORN I            2.1    16    1      1  2018
16  BIG HORN I            3.5    17    1      1  2018
17  BIG HORN I            2.1    18    1      1  2018
18  BIG HORN I            1.3    19    1      1  2018
19  BIG HORN I            2.3    20    1      1  2018
20  BIG HORN I            2.8    21    1      1  2018
21  BIG HORN I            3.0    22    1      1  2018
22  BIG HORN I            2.5    23    1      1  2018
23  BIG HORN I            2.2     0    2      1  2018
24  BIG HORN I            3.9     1    2      1  2018
25  BIG HORN I            4.3     2    2      1  2018
26  BIG HORN I            3.5     3    2      1  2018

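After finding the mismatched hour in the "hour" column of df2 (see index 23 above, the hour 0 row that has no counterpart in df1), the df2 DataFrame should look like this, with the hour 0 row removed. Thanks! New df2: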

    plant_name  wind_speed_ms  hour  day  month  year
0   BIG HORN I            3.6     1    1      1  2018
1   BIG HORN I            3.1     2    1      1  2018
2   BIG HORN I            3.1     3    1      1  2018
3   BIG HORN I            2.0     4    1      1  2018
4   BIG HORN I            1.6     5    1      1  2018
5   BIG HORN I            0.8     6    1      1  2018
6   BIG HORN I            0.8     7    1      1  2018
7   BIG HORN I            1.0     8    1      1  2018
8   BIG HORN I            0.3     9    1      1  2018
9   BIG HORN I            0.1    10    1      1  2018
10  BIG HORN I            1.1    11    1      1  2018
11  BIG HORN I            1.9    12    1      1  2018
12  BIG HORN I            1.9    13    1      1  2018
13  BIG HORN I            1.0    14    1      1  2018
14  BIG HORN I            0.7    15    1      1  2018
15  BIG HORN I            2.1    16    1      1  2018
16  BIG HORN I            3.5    17    1      1  2018
17  BIG HORN I            2.1    18    1      1  2018
18  BIG HORN I            1.3    19    1      1  2018
19  BIG HORN I            2.3    20    1      1  2018
20  BIG HORN I            2.8    21    1      1  2018
21  BIG HORN I            3.0    22    1      1  2018
22  BIG HORN I            2.5    23    1      1  2018
24  BIG HORN I            3.9     1    2      1  2018
25  BIG HORN I            4.3     2    2      1  2018
26  BIG HORN I            3.5     3    2      1  2018

Use isin on all of the date/time columns:

df2 = df2[df2['hour'].isin(df1['hour']) &
          df2['day'].isin(df1['day']) &
          df2['month'].isin(df1['month']) &
          df2['year'].isin(df1['year'])]
df2
Out[1]: 
    plant_name  wind_speed_ms  hour  day  month  year
0   BIG HORN I            3.6     1    1      1  2018
1   BIG HORN I            3.1     2    1      1  2018
2   BIG HORN I            3.1     3    1      1  2018
3   BIG HORN I            2.0     4    1      1  2018
4   BIG HORN I            1.6     5    1      1  2018
5   BIG HORN I            0.8     6    1      1  2018
6   BIG HORN I            0.8     7    1      1  2018
7   BIG HORN I            1.0     8    1      1  2018
8   BIG HORN I            0.3     9    1      1  2018
9   BIG HORN I            0.1    10    1      1  2018
10  BIG HORN I            1.1    11    1      1  2018
11  BIG HORN I            1.9    12    1      1  2018
12  BIG HORN I            1.9    13    1      1  2018
13  BIG HORN I            1.0    14    1      1  2018
14  BIG HORN I            0.7    15    1      1  2018
15  BIG HORN I            2.1    16    1      1  2018
16  BIG HORN I            3.5    17    1      1  2018
17  BIG HORN I            2.1    18    1      1  2018
18  BIG HORN I            1.3    19    1      1  2018
19  BIG HORN I            2.3    20    1      1  2018
20  BIG HORN I            2.8    21    1      1  2018
21  BIG HORN I            3.0    22    1      1  2018
22  BIG HORN I            2.5    23    1      1  2018
24  BIG HORN I            3.9     1    2      1  2018  #row with index of 23 removed
25  BIG HORN I            4.3     2    2      1  2018
26  BIG HORN I            3.5     3    2      1  2018
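
One caveat: chaining isin per column tests each column independently. It removes the hour 0 row here only because hour 0 never appears anywhere in df1's hour column. If df1 contained hour 0 on some other day, the hour 0 row for day 2 would be kept, which may be why isin appeared not to work in the question. A minimal sketch of a row-wise alternative follows, assuming the combination of plant_name, hour, day, month and year identifies a row. The toy frames and the names keys, columnwise, flag and df2_matched are made up for illustration and are not the poster's real data.

```python
import pandas as pd

# Hypothetical toy data: df1 has no hour 0 on day 2,
# but it does have hour 0 on day 3.
df1 = pd.DataFrame({
    "plant_name": "BIG HORN I",
    "hour": [23, 1, 2, 0],
    "day": [1, 2, 2, 3],
    "month": 1,
    "year": 2018,
})
df2 = pd.DataFrame({
    "plant_name": "BIG HORN I",
    "hour": [23, 0, 1, 2, 0],
    "day": [1, 2, 2, 2, 3],
    "month": 1,
    "year": 2018,
})

# Column-wise isin: hour 0 exists somewhere in df1 and day 2 exists
# somewhere in df1, so the (hour 0, day 2) row is not filtered out.
columnwise = df2[df2["hour"].isin(df1["hour"]) & df2["day"].isin(df1["day"])]

# Row-wise check: a left merge on all key columns with indicator=True
# marks each df2 row as "both" (also in df1) or "left_only".
keys = ["plant_name", "hour", "day", "month", "year"]
flag = (
    df2[keys]
    .merge(df1[keys].drop_duplicates(), on=keys, how="left", indicator=True)["_merge"]
    .eq("both")
    .to_numpy()
)

# Boolean-mask df2 so that its original index labels are preserved.
df2_matched = df2[flag]
print(df2_matched)
```

Because the right-hand keys are de-duplicated, the left merge returns exactly one row per df2 row in the original order, so the flag array lines up with df2 positionally.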
