Suppose I have a DataFrame like this:
    Col1    Col2  Tag_history  New_tag   Col5              created
0  Name1  Value1        Tag10    Tag10  Rank4  2021-03-21 12:58:09
1  Name1  Value2        Tag10    Tag10  Rank4  2021-03-21 13:58:09
2  Name1  Value3        Tag10    Tag10  Rank4  2021-03-21 14:58:09
3  Name2  Value1         Tag8     Tag9  Rank1  2021-03-21 10:58:09
4  Name2  Value2         Tag8     Tag9  Rank1  2021-03-21 11:58:09
5  Name2  Value4         Tag8     Tag9  Rank1  2021-03-21 12:58:09
6  Name2  Value5         Tag8     Tag9  Rank1  2021-03-21 13:58:09
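I want to compare the columns Tag_history and New_tag, and if the tag has changed, add a new row that shows the new tag in Tag_history as well. For example, for Name2 the tag has changed from Tag8 to Tag9, so I would like my df to look like this: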
    Col1    Col2  Tag_history  New_tag   Col5              created
0  Name1  Value1        Tag10    Tag10  Rank4  2021-03-21 12:58:09
1  Name1  Value2        Tag10    Tag10  Rank4  2021-03-21 13:58:09
2  Name1  Value3        Tag10    Tag10  Rank4  2021-03-21 14:58:09
3  Name2  Value1         Tag8     Tag9  Rank1  2021-03-21 10:58:09
4  Name2  Value2         Tag8     Tag9  Rank1  2021-03-21 11:58:09
5  Name2  Value4         Tag8     Tag9  Rank1  2021-03-21 12:58:09
6  Name2  Value5         Tag8     Tag9  Rank1  2021-03-21 13:58:09
7  Name2    IDLE         Tag9     Tag9  Rank1  2022-01-24 16:50:00  (current datetime)
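First of all, I would not recommend using any loops here, because iterating over rows is much slower than vectorized pandas operations.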
import pandas as pd

different_value = df[df['Tag_history'] != df['New_tag']].copy()  # rows whose tag changed (copy avoids SettingWithCopyWarning)
different_value['Tag_history'] = different_value['New_tag']  # the new rows show the new tag in Tag_history
df = pd.concat([df, different_value], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
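The expected output in the question adds only one row per name, with Col2 set to IDLE and created set to the current time, whereas appending every changed row would add four rows for Name2. Below is a minimal sketch of one way to get there, assuming the latest row per Col1 (by created) is the one to copy. The helper name add_tag_change_rows and its now parameter are made up for this illustration and are not part of pandas.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "Col1": ["Name1"] * 3 + ["Name2"] * 4,
    "Col2": ["Value1", "Value2", "Value3", "Value1", "Value2", "Value4", "Value5"],
    "Tag_history": ["Tag10"] * 3 + ["Tag8"] * 4,
    "New_tag": ["Tag10"] * 3 + ["Tag9"] * 4,
    "Col5": ["Rank4"] * 3 + ["Rank1"] * 4,
    "created": pd.to_datetime([
        "2021-03-21 12:58:09", "2021-03-21 13:58:09", "2021-03-21 14:58:09",
        "2021-03-21 10:58:09", "2021-03-21 11:58:09", "2021-03-21 12:58:09",
        "2021-03-21 13:58:09",
    ]),
})


def add_tag_change_rows(df, now=None):
    """Hypothetical helper: append one IDLE row per Col1 whose tag changed."""
    # Default to the current time; `now` can be fixed for reproducible results
    now = pd.Timestamp.now() if now is None else now
    # Vectorized comparison, no explicit loop over rows
    changed = df[df["Tag_history"] != df["New_tag"]]
    # Assumption: keep only the most recent changed row for each name
    latest = changed.sort_values("created").drop_duplicates("Col1", keep="last")
    # The new row carries the new tag in Tag_history, an IDLE marker and the timestamp
    new_rows = latest.assign(Tag_history=latest["New_tag"], Col2="IDLE", created=now)
    return pd.concat([df, new_rows], ignore_index=True)


result = add_tag_change_rows(df)
print(result)
```

With the sample data this appends a single row for Name2 and leaves the existing rows untouched. If a name should instead get one new row per changed row, drop the drop_duplicates step.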