Pandas: iterate and insert a column with conditions within groups (complex problem)



I have a fairly complex problem: how to add a new column, filled according to conditions, for each group. Here is a sample DataFrame:

import pandas as pd

df = pd.DataFrame({
    'id': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB', 'BB',
           'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC'],
    'From_num': [80, 68, 751, 'Issued', 32, 68, 126, 'Issued', 'Missed', 105, 68, 114, 76, 68, 99, 'Missed'],
    'To_num': [99, 80, 68, 751, 105, 32, 68, 126, 49, 324, 105, 68, 114, 76, 68, 99],
})
    id From_num  To_num
0   AA       80      99
1   AA       68      80
2   AA      751      68
3   AA   Issued     751
4   BB       32     105
5   BB       68      32
6   BB      126      68
7   BB   Issued     126
8   BB   Missed      49
9   CC      105     324
10  CC       68     105
11  CC      114      68
12  CC       76     114
13  CC       68      76
14  CC       99      68
15  CC   Missed      99

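I have a "flag" number, 68. Within each group, every row at or above the row where the flag number appears in the "From_num" column should be labelled "Forward" in a new column, and every row at or below the row where the flag number appears in the "To_num" column should be labelled "Back" in the same column. The hardest case is this: if the flag number appears more than once in each column, the rows between those "From_num" and "To_num" occurrences should be labelled "Forward&Back" in the new column. See the expected result below.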

Expected result
    id From_num  To_num     Direction
0   AA       80      99       Forward
1   AA       68      80       Forward
2   AA      751      68          Back
3   AA   Issued     751          Back
4   BB       32     105       Forward
5   BB       68      32       Forward
6   BB      126      68          Back
7   BB   Issued     126          Back
8   BB   Missed      49          Back
9   CC      105     324       Forward
10  CC       68     105       Forward 
11  CC      114      68  Forward&Back # From line 11 to 13, flag # 68 appears more than once
12  CC       76     114  Forward&Back # so the line 11, 12 and 13 labelled "Forward&Back"
13  CC       68      76  Forward&Back 
14  CC       99      68          Back 
15  CC   Missed      99          Back

I have tried writing many loops, but none of them produced the expected result. If anyone has an idea, please help. I hope the question is clear. Many thanks!

This uses no "real" loops.

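  1. Preserve the row number (reset_index())
  2. Construct a new DataFrame of the records that contain the flag (68)
  3. The simple "Forward" and "Back" logic is based on whether a row comes before or after the first sighting of 68
  4. "Forward&Back" applies when there are multiple sightings, to the rows between the 2nd and the (n-1)th sighting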
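import pandas as pd

def direction(r):
    # Row numbers where the flag appears within this row's id group
    # (assumes every id has at least one row containing 68)
    flagrow = df2[df2["id"] == r["id"]]["index"].values
    # Rows up to and including the first sighting are "Forward", later rows are "Back"
    if r["index"] <= flagrow[0]:
        val = "Forward"
    else:
        val = "Back"
    # With more than two sightings, rows from the 2nd sighting up to,
    # but not including, the last sighting are "Forward&Back"
    if len(flagrow) > 2 and flagrow[1] <= r["index"] < flagrow[-1]:
        val = "Forward&Back"
    return val

df = pd.DataFrame({
    'id': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB', 'BB',
           'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC'],
    'From_num': [80, 68, 751, 'Issued', 32, 68, 126, 'Issued', 'Missed', 105, 68, 114, 76, 68, 99, 'Missed'],
    'To_num': [99, 80, 68, 751, 105, 32, 68, 126, 49, 324, 105, 68, 114, 76, 68, 99],
})
df = df.reset_index()  # keep the original row number in an "index" column
df2 = df[(df.From_num == 68) | (df.To_num == 68)].copy()  # rows that contain the flag
df["Direction"] = df.apply(direction, axis=1)
df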
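
For larger frames, the same rules can probably be expressed without a row-wise apply. The following is only a sketch of that idea, not part of the answer above: it counts flag sightings per id with groupby and derives the labels from those counts. The names FLAG, is_flag, seen, total and before are made up for this illustration.

```python
import pandas as pd

FLAG = 68  # the flag number from the question

df = pd.DataFrame({
    "id": ["AA"] * 4 + ["BB"] * 5 + ["CC"] * 7,
    "From_num": [80, 68, 751, "Issued", 32, 68, 126, "Issued", "Missed",
                 105, 68, 114, 76, 68, 99, "Missed"],
    "To_num": [99, 80, 68, 751, 105, 32, 68, 126, 49,
               324, 105, 68, 114, 76, 68, 99],
})

# 1 on rows where the flag appears in either column, else 0
is_flag = (df["From_num"].eq(FLAG) | df["To_num"].eq(FLAG)).astype(int)

by_id = is_flag.groupby(df["id"])
seen = by_id.cumsum()           # sightings up to and including this row
total = by_id.transform("sum")  # sightings in the whole id group
before = seen - is_flag         # sightings strictly above this row

# Default to "Back", then overwrite the rows that match the other two rules
direction = pd.Series("Back", index=df.index)
direction[before.eq(0)] = "Forward"  # at or above the first sighting
direction[(total > 2) & (seen >= 2) & (seen < total)] = "Forward&Back"
df["Direction"] = direction
```

On the sample data this gives the same labels as the expected result in the question. One difference in behaviour: an id with no row containing 68 ends up entirely "Forward" here, whereas the apply-based version raises an IndexError for such a group; which of the two is preferable depends on the data.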
