Add a new column that returns the minimum value for each unique value in another column



I am trying to create a new column called "jolly" that, for each unique value in RH_RNo, is filled with the minimum value of the HIR_EveningPrice column.

This is my attempt so far:

def evepricefav(races):
    for race in races:
        clv.loc[clv['RH_RNo'] == race]['HIR_EveningPrice']

clv['jolly'] = clv.apply(evepricefav, axis=1)

Here is a sample of the DataFrame. As you can see, a failed attempt has filled the jolly column with 1.4.

        RH_RNo  HIR_BSP  HIR_EveningPrice     value  jolly
794565  189631    28.75              26.0 -0.269565    1.4
794566  189631    15.38              13.0 -0.414824    1.4
794567  189631    15.00               6.0 -0.533333    1.4
794568  189631     4.80               5.0  0.458333    1.4
794569  189631     9.85              13.0  0.522843    1.4
794570  189631     4.30               9.0  0.627907    1.4
794571  189631     5.45               6.0  0.467890    1.4
794572  189631    34.00              17.0 -0.500000    1.4
794573  189631    13.00              11.0 -0.153846    1.4
794574  189634    31.77               9.0 -0.527856    1.4
794575  189634    60.00              26.0 -0.433333    1.4
794576  189634    13.50              17.0  0.925926    1.4
794577  189634     9.20              11.0 -0.130435    1.4
794578  189634     9.80               8.0 -0.081633    1.4
794579  189634    10.00              17.0  0.700000    1.4
794580  189634    11.79              17.0  0.102629    1.4
794581  189634    29.60              21.0  0.148649    1.4
794582  189634     2.99               3.5  0.337793    1.4
794583  189634     8.48               6.0 -0.292453    1.4
794584  189637    18.24              11.0 -0.396930    1.4

You can group by the RH_RNo column and then use .transform('min') on 'HIR_EveningPrice':

df['jolly'] = df.groupby('RH_RNo')['HIR_EveningPrice'].transform('min')
print(df)

Prints:

        id  RH_RNo  HIR_BSP  HIR_EveningPrice     value  jolly
0   794565  189631    28.75              26.0 -0.269565    5.0
1   794566  189631    15.38              13.0 -0.414824    5.0
2   794567  189631    15.00               6.0 -0.533333    5.0
3   794568  189631     4.80               5.0  0.458333    5.0
4   794569  189631     9.85              13.0  0.522843    5.0
5   794570  189631     4.30               9.0  0.627907    5.0
6   794571  189631     5.45               6.0  0.467890    5.0
7   794572  189631    34.00              17.0 -0.500000    5.0
8   794573  189631    13.00              11.0 -0.153846    5.0
9   794574  189634    31.77               9.0 -0.527856    3.5
10  794575  189634    60.00              26.0 -0.433333    3.5
11  794576  189634    13.50              17.0  0.925926    3.5
12  794577  189634     9.20              11.0 -0.130435    3.5
13  794578  189634     9.80               8.0 -0.081633    3.5
14  794579  189634    10.00              17.0  0.700000    3.5
15  794580  189634    11.79              17.0  0.102629    3.5
16  794581  189634    29.60              21.0  0.148649    3.5
17  794582  189634     2.99               3.5  0.337793    3.5
18  794583  189634     8.48               6.0 -0.292453    3.5
19  794584  189637    18.24              11.0 -0.396930   11.0
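To make the difference between aggregating and transforming concrete, here is a minimal, self-contained sketch. It uses six rows taken from the sample above (only the RH_RNo and HIR_EveningPrice columns); the variable names `grouped` and `per_race_min` are just illustrative and not part of the question's code.

```python
import pandas as pd

# A few rows taken from the question's sample (only the two relevant columns)
df = pd.DataFrame({
    "RH_RNo": [189631, 189631, 189631, 189634, 189634, 189637],
    "HIR_EveningPrice": [26.0, 13.0, 5.0, 9.0, 3.5, 11.0],
})

grouped = df.groupby("RH_RNo")["HIR_EveningPrice"]

# .min() aggregates: one value per unique RH_RNo (3 rows here)
per_race_min = grouped.min()

# .transform('min') broadcasts each group's minimum back onto its rows,
# so the result has the same length and index as df and can be assigned directly
df["jolly"] = grouped.transform("min")

print(per_race_min)
print(df)
```

Because transform keeps the original index, the assignment lines up row by row without any merge or loop.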

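Use the following code:

import pandas as pd

# Sample data: 'a' is the grouping key, 'b' holds the values
data1 = [1, 1, 2, 1, 2]
data2 = [7, 2, 8, 1, 3]
df = pd.DataFrame({"a": data1, "b": data2})

# Broadcast each group's minimum of 'b' back onto every row of that group
dfc = df.groupby('a')['b']
df = df.assign(jolly=dfc.transform('min'))
print(df)

Of course, substitute your own variable names there :)

Output for the sample data:

   a  b  jolly
0  1  7      1
1  1  2      1
2  2  8      3
3  1  1      1
4  2  3      3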
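For comparison, the per-group minimum the question asks for can also be computed once per group and then looked up for each row. This is a sketch of an alternative technique (groupby followed by Series.map), not something either answer above proposes; the name `mins` is hypothetical and the data is the same small a/b sample.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [7, 2, 8, 1, 3]})

# One minimum per unique value of 'a' (a Series indexed by 'a')
mins = df.groupby("a")["b"].min()

# Look up each row's group minimum via its 'a' value
df["jolly"] = df["a"].map(mins)
print(df)
```

This should give the same column as transform('min'); transform is usually the shorter spelling, while the map form can be handy when the per-group values are needed on their own as well.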
