Getting a value from a DataFrame based on randomly generated variables



I have a DataFrame with a column called rate_multiplier. I need to look up a value from it and compare it with the rate_multiplier I get from an AWS S3 bucket.

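To get the rate_multiplier from the DataFrame, I need to take the random variables I created for a "person" and match them against the row that gives a specific rate_multiplier for those characteristics.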

For example:

Randomly generated variables:

Life_term = 25
Gender = F
Rate_class = Best-AB
Age = 27
Coverage = 2310

DataFrame:

Life_term     Benefit_Length_months  Gender     Rate_class            Age   State    Coverage band (low end)    Coverage band (high end)    Coverage Band   Rate_multiplier
0       15            180                        M      Best-AA               18    Default  500                        1199                        500-1199        2.31
1       15            180                        M      Best-AA               19    Default  500                        1199                        500-1199        2.21
2       15            180                        M      Best-AA               20    Default  500                        1199                        500-1199        2.11
3       15            180                        M      Best-AA               21    Default  500                        1199                        500-1199        2.03
4       15            180                        M      Best-AA               22    Default  500                        1199                        500-1199        1.95
... ... ... ... ... ... ... ... ... ... ...
34987   10            120                        F      Nicotine Average-CD   61    Default  3600                       10000                       3600+           19.10
34988   10            120                        F      Nicotine Average-CD   62    Default  3600                       10000                       3600+           21.27
34989   10            120                        F      Nicotine Average-CD   63    Default  3600                       10000                       3600+           23.44 
34990   10            120                        F      Nicotine Average-CD   64    Default  3600                       10000                       3600+           25.61
34991   10            120                        F      Nicotine Average-CD   65    Default  3600                       10000                       3600+           27.78

So, for this example, my randomly generated person should get a rate_multiplier of:

0.93

My code is as follows:

rate_mult_df.loc[(rate_mult_df['Life_term'] == 15) & (rate_mult_df['Gender'] == 'F') & (rate_mult_df['Rate_class'] == 'Best-AB') & (rate_mult_df['Age'] == 27) & (rate_mult_df['Coverage band (low end)'] <= 2310) & (rate_mult_df['Coverage band (high end)'] >= 2310)]

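What is the correct way to get the rate_multiplier for the randomly generated person, or is there an easier way? Thanks for any help. Please let me know whether my question is clear enough; I am working on that every day.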

For performance reasons, I would use .query():

# Coverage is matched as a range: low end <= 2310 <= high end.
rate_multiplier = df.query(
    "Life_term == 15 &"
    " Gender == 'F' &"
    " Rate_class == 'Best-AB' &"
    " Age == 27 &"
    " `Coverage band (low end)` <= 2310 &"
    " `Coverage band (high end)` >= 2310"
)["Rate_multiplier"].squeeze()
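If the person's values live in Python variables rather than being typed into the string, .query() can reference them with the @ prefix. The following is only a minimal sketch: the three rows in df are made up for illustration (they are not taken from the real rate table), and the variable names life_term, gender, rate_class, age and coverage are assumptions.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real rate table.
df = pd.DataFrame({
    "Life_term": [15, 15, 15],
    "Gender": ["F", "F", "M"],
    "Rate_class": ["Best-AB", "Best-AB", "Best-AB"],
    "Age": [27, 27, 27],
    "Coverage band (low end)": [500, 1200, 1200],
    "Coverage band (high end)": [1199, 3599, 3599],
    "Rate_multiplier": [1.10, 0.93, 1.05],
})

# Values for the randomly generated person (hypothetical variable names).
life_term, gender, rate_class, age, coverage = 15, "F", "Best-AB", 27, 2310

# @name refers to a local Python variable; backticks quote column names
# that contain spaces or parentheses.
match = df.query(
    "Life_term == @life_term and Gender == @gender"
    " and Rate_class == @rate_class and Age == @age"
    " and `Coverage band (low end)` <= @coverage"
    " and `Coverage band (high end)` >= @coverage"
)

# squeeze() turns a one-element Series into a scalar.
rate_multiplier = match["Rate_multiplier"].squeeze()
```

With exactly one matching row, squeeze() yields a scalar; with zero or several matches it returns a Series, so it is worth checking len(match) before relying on the result.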

Whether something is "easier" depends on your workflow. For example, if you want to build the query from a dictionary, you could use:

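def get_rate_multiplier(search_params: dict) -> str:
    # Build a query string of equality tests; backticks allow any column name.
    return " and ".join(
        [f"(`{k}` == '{v}')" if isinstance(v, str) else f"(`{k}` == {v})" for k, v in search_params.items()]
    )

random_person = {
    "Life_term": 15, "Gender": "F", "Rate_class": "Best-AB", "Age": 27
}
coverage = 2310
# Coverage is a range test, so it is appended separately from the equality tests.
query = (
    get_rate_multiplier(random_person)
    + f" and (`Coverage band (low end)` <= {coverage})"
    + f" and (`Coverage band (high end)` >= {coverage})"
)
rate_multiplier = float(df.query(query)["Rate_multiplier"].squeeze())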
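
If you would rather stay with boolean masks, as in the question, the same lookup can be wrapped in a small function that also checks that exactly one row matched. This is a sketch under assumptions: lookup_rate_multiplier, demo_table and expected_from_s3 are hypothetical names, the rows are made up, and the S3 value is a placeholder rather than anything read from a bucket.

```python
import math

import pandas as pd


def lookup_rate_multiplier(table: pd.DataFrame, person: dict) -> float:
    """Return the Rate_multiplier of the single row matching `person`."""
    coverage = person["Coverage"]
    mask = (
        (table["Life_term"] == person["Life_term"])
        & (table["Gender"] == person["Gender"])
        & (table["Rate_class"] == person["Rate_class"])
        & (table["Age"] == person["Age"])
        # Coverage must fall inside the band, inclusive on both ends.
        & (table["Coverage band (low end)"] <= coverage)
        & (table["Coverage band (high end)"] >= coverage)
    )
    matches = table.loc[mask, "Rate_multiplier"]
    if len(matches) != 1:
        raise LookupError(f"expected exactly 1 matching row, found {len(matches)}")
    return float(matches.iloc[0])


# Made-up rows for illustration only; not taken from the real rate table.
demo_table = pd.DataFrame({
    "Life_term": [15, 15],
    "Gender": ["F", "F"],
    "Rate_class": ["Best-AB", "Best-AB"],
    "Age": [27, 27],
    "Coverage band (low end)": [500, 1200],
    "Coverage band (high end)": [1199, 3599],
    "Rate_multiplier": [1.10, 0.93],
})

person = {"Life_term": 15, "Gender": "F", "Rate_class": "Best-AB",
          "Age": 27, "Coverage": 2310}
rate = lookup_rate_multiplier(demo_table, person)

# Placeholder for the value fetched from S3; compare floats with a tolerance.
expected_from_s3 = 0.93
rates_agree = math.isclose(rate, expected_from_s3, abs_tol=1e-9)
```

Raising on zero or multiple matches makes a bad lookup fail loudly instead of silently comparing the wrong value against the one from S3.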
