Pandas: look up a value and return a boolean



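I have the dataframe below, and my goal is to find out whether a stock was held continuously from one period to the next. To do this, I created two lookup codes, str_previous_period_code and str_current_period_code, by concatenating the ticker with the date converted to a string.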

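I need a new column that returns a boolean, 1 or 0, indicating whether the stock was also held in the previous period. So the logic is:

  • Look up str_previous_period_code
  • If it is found in the dataframe, then df['value'] = 1, otherwise df['value'] = 0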

I have tried .lookup() to start the logic, as follows:

df['value'] = df.lookup(df['str_previous_period_code'], df['str_current_period_code'])

However, I get the following KeyError:

KeyError: 'CXP2001-04-27'

    ticker  date    close   next_period_close   NATR    score   return  str_previous_period_code    str_current_period_code
0   CXP 2001-04-27  4.615000    4.585000    3.700552    9   -0.006501   CXP2001-04-20   CXP2001-04-27
1   TOL 2001-04-27  1.851068    1.862219    3.174988    9   0.006024    TOL2001-04-20   TOL2001-04-27
2   WOW 2001-04-27  8.832543    8.941464    2.560720    9   0.012332    WOW2001-04-20   WOW2001-04-27
3   WES 2001-04-27  13.205642   12.771989   2.448139    9   -0.032839   WES2001-04-20   WES2001-04-27
4   PPT 2001-04-27  40.000000   40.400000   2.364224    9   0.010000    PPT2001-04-20   PPT2001-04-27
5   FLT 2001-04-27  23.398888   23.309237   2.281367    9   -0.003831   FLT2001-04-20   FLT2001-04-27
6   MIM 2001-04-27  1.260000    1.380000    5.696656    8   0.095238    MIM2001-04-20   MIM2001-04-27
7   ALL 2001-04-27  6.386961    6.113234    5.476623    8   -0.042857   ALL2001-04-20   ALL2001-04-27
8   CXP 2001-05-04  4.585000    4.650000    3.685788    9   0.014177    CXP2001-04-27   CXP2001-05-04
9   TOL 2001-05-04  1.862219    1.866679    3.139378    9   0.002395    TOL2001-04-27   TOL2001-05-04
10  WES 2001-05-04  12.771989   13.321481   2.572519    9   0.043023    WES2001-04-27   WES2001-05-04
11  WOW 2001-05-04  8.941464    9.456366    2.552963    9   0.057586    WOW2001-04-27   WOW2001-05-04
12  PPT 2001-05-04  40.400000   39.991000   2.313191    9   -0.010124   PPT2001-04-27   PPT2001-05-04
13  FLT 2001-05-04  23.309237   23.194881   2.262463    9   -0.004906   FLT2001-04-27   FLT2001-05-04
14  ALL 2001-05-04  6.113234    6.200552    5.699601    8   0.014283    ALL2001-04-27   ALL2001-05-04
15  MIM 2001-05-04  1.380000    1.340000    5.289190    8   -0.028986   MIM2001-04-27   MIM2001-05-04

I think you can do the lookup with either of the following:

  • The apply method of the Series obtained from df['str_current_period_code']:
# to avoid calling the tolist method on each iteration:
previous_period_code = df['str_previous_period_code'].tolist()
# fill the 'value' column according to your logic :
df['value'] = df['str_current_period_code'].apply(
    lambda x: 1 if x in previous_period_code else 0)
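  • The isin method of the Series df['str_current_period_code'], although the result will be True/False rather than the 1 and 0 you asked for (this method is probably faster than the first one):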
df['value'] = df['str_current_period_code'].isin(df['str_previous_period_code'])
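
If you do want 1/0, the isin result can be cast with astype(int). Note also that the two snippets above test the current code against the previous codes, while the logic in the question looks up str_previous_period_code among the current codes; the sketch below follows the question's direction, so swap the two columns if you meant the other one. It is only a minimal sketch: the tiny dataframe is a hypothetical subset of the question's data (WOW is left out of the first week so that both outcomes show up), and rebuilding the codes with a 7-day offset assumes weekly periods. As far as I can tell, the KeyError comes from DataFrame.lookup being label-based: it expects row labels and column labels, not values to search for.

```python
import pandas as pd

# Hypothetical subset of the question's data; WOW is deliberately missing
# from the first week so that both 1 and 0 appear in the result.
df = pd.DataFrame({
    "ticker": ["CXP", "TOL", "CXP", "TOL", "WOW"],
    "date": pd.to_datetime(
        ["2001-04-27", "2001-04-27", "2001-05-04", "2001-05-04", "2001-05-04"]
    ),
})

# Rebuild the lookup codes as ticker + date string (assumes weekly periods).
date_str = df["date"].dt.strftime("%Y-%m-%d")
prev_str = (df["date"] - pd.Timedelta(days=7)).dt.strftime("%Y-%m-%d")
df["str_current_period_code"] = df["ticker"] + date_str
df["str_previous_period_code"] = df["ticker"] + prev_str

# 1 if the previous-period code exists among the current-period codes, else 0.
df["value"] = (
    df["str_previous_period_code"]
    .isin(df["str_current_period_code"])
    .astype(int)
)

print(df[["ticker", "date", "value"]])
```

With this direction, every row in the earliest period gets 0, because the dataframe contains no earlier period to match against.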
