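I have the dataframe below, and my goal is to find out whether a stock was held continuously from one period to the next. To do this, I created two lookup codes, str_previous_period_code and str_current_period_code, by concatenating ticker with the string conversion of date.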
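I need a new column that returns a boolean 1 or 0 depending on whether the stock was held in the previous period. So the logic is:
- Look up str_previous_period_code
- If it is found in the dataframe, then df['value'] = 1, otherwise df['value'] = 0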
I have tried .lookup() to get the logic started, as follows:
df['value'] = df.lookup(df['str_previous_period_code'], df['str_current_period_code'])
However, I get the following KeyError:
KeyError: 'CXP2001-04-27'
ticker date close next_period_close NATR score return str_previous_period_code str_current_period_code
0 CXP 2001-04-27 4.615000 4.585000 3.700552 9 -0.006501 CXP2001-04-20 CXP2001-04-27
1 TOL 2001-04-27 1.851068 1.862219 3.174988 9 0.006024 TOL2001-04-20 TOL2001-04-27
2 WOW 2001-04-27 8.832543 8.941464 2.560720 9 0.012332 WOW2001-04-20 WOW2001-04-27
3 WES 2001-04-27 13.205642 12.771989 2.448139 9 -0.032839 WES2001-04-20 WES2001-04-27
4 PPT 2001-04-27 40.000000 40.400000 2.364224 9 0.010000 PPT2001-04-20 PPT2001-04-27
5 FLT 2001-04-27 23.398888 23.309237 2.281367 9 -0.003831 FLT2001-04-20 FLT2001-04-27
6 MIM 2001-04-27 1.260000 1.380000 5.696656 8 0.095238 MIM2001-04-20 MIM2001-04-27
7 ALL 2001-04-27 6.386961 6.113234 5.476623 8 -0.042857 ALL2001-04-20 ALL2001-04-27
8 CXP 2001-05-04 4.585000 4.650000 3.685788 9 0.014177 CXP2001-04-27 CXP2001-05-04
9 TOL 2001-05-04 1.862219 1.866679 3.139378 9 0.002395 TOL2001-04-27 TOL2001-05-04
10 WES 2001-05-04 12.771989 13.321481 2.572519 9 0.043023 WES2001-04-27 WES2001-05-04
11 WOW 2001-05-04 8.941464 9.456366 2.552963 9 0.057586 WOW2001-04-27 WOW2001-05-04
12 PPT 2001-05-04 40.400000 39.991000 2.313191 9 -0.010124 PPT2001-04-27 PPT2001-05-04
13 FLT 2001-05-04 23.309237 23.194881 2.262463 9 -0.004906 FLT2001-04-27 FLT2001-05-04
14 ALL 2001-05-04 6.113234 6.200552 5.699601 8 0.014283 ALL2001-04-27 ALL2001-05-04
15 MIM 2001-05-04 1.380000 1.340000 5.289190 8 -0.028986 MIM2001-04-27 MIM2001-05-04
I think you can do the lookup in either of these ways:
- With the apply method of the Series returned by df['str_current_period_code']:
# to avoid calling the tolist method on each iteration:
previous_period_code = df['str_previous_period_code'].tolist()
# fill the 'value' column according to your logic:
df['value'] = df['str_current_period_code'].apply(
    lambda x: 1 if x in previous_period_code else 0)
- With the isin method of the df['str_current_period_code'] Series. The result will be True/False rather than the 1 and 0 you asked for (this method is probably faster than the first):
df['value'] = df['str_current_period_code'].isin(df['str_previous_period_code'])
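If you want 1 and 0 from the isin approach, the boolean result can be cast to integers. The sketch below also runs the check in the direction described in the question (is the row's previous-period code present among the current-period codes?), which is the reverse of the two snippets above. Treat that as my reading of the stated logic. The small frame is made-up sample data that only reuses the question's column names.

```python
import pandas as pd

# Made-up sample rows; only the column names come from the question.
df = pd.DataFrame({
    "ticker": ["CXP", "MIM", "CXP", "TOL"],
    "date": ["2001-04-27", "2001-04-27", "2001-05-04", "2001-05-04"],
    "str_previous_period_code": [
        "CXP2001-04-20", "MIM2001-04-20", "CXP2001-04-27", "TOL2001-04-27"],
    "str_current_period_code": [
        "CXP2001-04-27", "MIM2001-04-27", "CXP2001-05-04", "TOL2001-05-04"],
})

# True where the row's previous-period code appears among the current-period codes,
# i.e. the same ticker has a row for the previous period.
held_before = df["str_previous_period_code"].isin(df["str_current_period_code"])

# Cast True/False to 1/0 as requested in the question.
df["value"] = held_before.astype(int)

print(df[["ticker", "date", "value"]])
```

As for the KeyError: DataFrame.lookup interprets its two arguments as row labels and column labels, so the code strings were being searched for among the index and column names rather than among the column values.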