经过分组后,从每组中找出最接近的熊猫蟒蛇



我是熊猫的新手,一直在计算。这是我的DF样本。我有兴趣获得每组收盘价格中最接近的三振_价格行。(按日期、时间、选项类型分组(

name      date   time      open      high       low     close       option_type  strike_price  open_opn  high_opn  low_opn  close_opn 
0       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      CE              11800       104.95    109.45   104.50     107.25 
1       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      CE              11700       160.00    164.25   159.90     161.60
2       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      CE              11600       223.20    229.10   223.20     226.30  
0       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      PE              11800       166.05    166.95   163.55     165.95  
1       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      PE              11600       88.80     89.45    86.45      89.00
2       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      PE              11700       122.35    122.65   119.40     122.30

这是我的预期输出。

name      date   time      open      high       low     close       option_type  strike_price  open_opn  high_opn  low_opn  close_opn 
1       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11700.15      CE              11700       160.00    164.25   159.90     161.60
2       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11700.15      PE              11700       122.35    122.65   119.40     122.30

如果关闭程序为11760,则预期o/p为。

0       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      CE              11800       104.95    109.45   104.50     107.25
0       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      PE              11800       166.05    166.95   163.55     165.95

请帮我一下。非常感谢。

首先使用subabs,然后使用sort_valuesdrop_duplicates:获得每组的最小值

df['diff'] = df['close'].sub(df['strike_price']).abs()
df = df.sort_values('diff').drop_duplicates(['date', 'time', 'option_type'])
print (df)
name      date   time      open     high       low     close option_type  
1  NIFTY  20180903  09:16  11736.05  11736.1  11699.35  11720.15          CE   
2  NIFTY  20180903  09:16  11736.05  11736.1  11699.35  11720.15          PE   
strike_price  open_opn  high_opn  low_opn  close_opn   diff  
1         11700    160.00    164.25    159.9      161.6  20.15  
2         11700    122.35    122.65    119.4      122.3  20.15  

DataFrameGroupBy.idxminloc的另一种解决方案:

df = df.reset_index(drop=True)
df['diff'] = df['close'].sub(df['strike_price']).abs()
df = df.loc[df.groupby(['date', 'time', 'option_type'])['diff'].idxmin()]
print (df)
name      date   time      open     high       low     close option_type  
1  NIFTY  20180903  09:16  11736.05  11736.1  11699.35  11720.15          CE   
5  NIFTY  20180903  09:16  11736.05  11736.1  11699.35  11720.15          PE   
strike_price  open_opn  high_opn  low_opn  close_opn   diff  
1         11700    160.00    164.25    159.9      161.6  20.15  
5         11700    122.35    122.65    119.4      122.3  20.15    

最新更新