I have a df with 4 columns: ID, Value, UpperBound and LowerBound, as shown below:
ID | Value | UpperBound | LowerBound
1 23 NaN NaN
2 55 NaN NaN
3 87 NaN NaN
4 99 NaN NaN
5 NaN 50 5
6 NaN 5 1
7 NaN 95 50
8 NaN 99 95
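I want to replace each value in the Value column with the UpperBound and LowerBound of the range it falls into (for example, 23 falls between 5 and 50). The output should look like this: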
ID | Value | UpperBound | LowerBound
1 NaN 50 5
2 NaN 95 50
3 NaN 95 50
4 NaN 99 95
5 NaN 50 5
6 NaN 5 1
7 NaN 95 50
8 NaN 99 95
The Value column will end up containing only NaN, so I can drop it afterwards. The LowerBound-UpperBound ranges are:
1-5
5-50
50-95
95-99
Can someone help me write the necessary code?
Thanks a lot in advance!
Assuming the original DataFrame looks like this:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Value': [23, 55, 87, 99, 999],
                   'UpperBound': [np.nan, np.nan, np.nan, np.nan, np.nan],
                   'LowerBound': [np.nan, np.nan, np.nan, np.nan, np.nan]})
print(df)
Value UpperBound LowerBound
0 23 NaN NaN
1 55 NaN NaN
2 87 NaN NaN
3 99 NaN NaN
4 999 NaN NaN
Here is one way, assuming the lower bound is inclusive and the upper bound is exclusive:
def check_bound(x):
    # Each pair is (lower, upper): lower bound inclusive, upper bound exclusive.
    for (a, b) in [(1, 5), (5, 50), (50, 95), (95, 100)]:
        if a <= x < b:  # plain comparison also handles non-integer values
            return [a, b]  # this will exit the function
    return [np.nan, np.nan]  # iteration ended means bound is not found
df[['LowerBound', 'UpperBound']] = df['Value'].apply(check_bound).tolist()
print(df)
Value UpperBound LowerBound
0 23 50.0 5.0
1 55 95.0 50.0
2 87 95.0 50.0
3 99 100.0 95.0
4 999 NaN NaN
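
The loop above overwrites both bound columns for every row, so on the frame from the question (where rows 5-8 already have bounds but no Value) it would replace those existing bounds with NaN. A possible vectorized variant with pd.cut is sketched below. It is only a sketch under assumptions: the frame is rebuilt by hand from the question's table, the name `edges` is made up for illustration, and the last edge is 100 rather than 99 so that 99 lands in [95, 100), the same convention as the loop. It only writes to rows where a matching range was found.

```python
import numpy as np
import pandas as pd

# Frame rebuilt from the question: rows 1-4 carry a Value,
# rows 5-8 already carry bounds and have no Value.
df = pd.DataFrame({
    "ID": range(1, 9),
    "Value": [23, 55, 87, 99, np.nan, np.nan, np.nan, np.nan],
    "UpperBound": [np.nan] * 4 + [50, 5, 95, 99],
    "LowerBound": [np.nan] * 4 + [5, 1, 50, 95],
})

# Assumed bin edges; 100 (not 99) as the last edge so that 99 is covered.
edges = np.array([1, 5, 50, 95, 100])

# right=False gives bins closed on the left and open on the right: [a, b).
# labels=False returns the integer bin index, or NaN when no bin matches.
bin_idx = pd.cut(df["Value"], bins=edges, right=False, labels=False)

# Only fill rows where a bin was found, so existing bounds are kept.
found = bin_idx.notna()
idx = bin_idx[found].astype(int).to_numpy()
df.loc[found, "LowerBound"] = edges[idx]
df.loc[found, "UpperBound"] = edges[idx + 1]

# Value is no longer needed once the bounds are filled in.
df = df.drop(columns="Value")
print(df)
```

Note that with these edges the row with Value 99 gets the bounds 95 and 100, as in the loop-based output, not 95 and 99 as in the question's expected table.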