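I have a dataframe and I want to apply a filter or condition to it, specifically on two columns: if a value does not pass the threshold, change it to zero. I know I could do this by filtering into separate dataframes and merging them back together. If there is a more efficient way, please suggest it.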
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 1.0, 8.0, 2, 7.5, 8.0],
"Amount2": [1, 5.0, 8.0, 10.5, 0, 8.0]})
My desired output, for a threshold of > 2:
User Amount Amount2
user1 10.0 0.0
user2 0.0 5.0
user2 8.0 8.0
user3 0.0 10.5
user2 7.5 0.0
user1 8.0 8.0
You can clip the values below 2 up to 2, then replace 2 with 0:
df[['Amount', 'Amount2']] = df[['Amount', 'Amount2']].clip(lower=2).replace(2, 0)
print(df)
User Amount Amount2
0 user1 10.0 0.0
1 user2 0.0 5.0
2 user2 8.0 8.0
3 user3 0.0 10.5
4 user2 7.5 0.0
5 user1 8.0 8.0
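The same clip-then-replace idea can be sketched with the threshold as a parameter. The helper name `zero_at_or_below` is hypothetical (it is not a pandas function), and the sketch assumes that values equal to the threshold should also become 0, as in the desired output above.

```python
import pandas as pd


def zero_at_or_below(frame, cols, threshold):
    """Return a copy of `frame` where values <= threshold in `cols` are set to 0."""
    out = frame.copy()
    # clip raises everything below the threshold up to the threshold,
    # then replace turns every value equal to the threshold into 0
    out[cols] = out[cols].clip(lower=threshold).replace(threshold, 0)
    return out


df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
                   "Amount": [10.0, 1.0, 8.0, 2, 7.5, 8.0],
                   "Amount2": [1, 5.0, 8.0, 10.5, 0, 8.0]})

result = zero_at_or_below(df, ["Amount", "Amount2"], 2)
print(result)
```

Because the helper works on a copy, the original `df` is left unchanged.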
You can use numpy.where to handle all the desired columns at once:
import numpy as np

# select desired columns (here based on name)
cols = df.filter(like='Amount').columns
# it's also possible to manually set them
# cols = ['Amount', 'Amount2']
df[cols] = np.where(df[cols].le(2), 0, df[cols]) # or .lt(2) for <
Updated df:
User Amount Amount2
0 user1 10.0 0.0
1 user2 0.0 5.0
2 user2 8.0 8.0
3 user3 0.0 10.5
4 user2 7.5 0.0
5 user1 8.0 8.0
threshold = 2
# <= rather than <, so that values equal to the threshold are zeroed too, as in the desired output
df.loc[df['Amount'] <= threshold, 'Amount'] = 0
df.loc[df['Amount2'] <= threshold, 'Amount2'] = 0
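With more than two columns, the same `.loc` assignment can be written as a loop. This is a minimal sketch: the column list is an assumption taken from the question, and `<=` is used on the assumption that a value equal to the threshold should be zeroed, as in the desired output.

```python
import pandas as pd

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
                   "Amount": [10.0, 1.0, 8.0, 2, 7.5, 8.0],
                   "Amount2": [1, 5.0, 8.0, 10.5, 0, 8.0]})

threshold = 2
for col in ["Amount", "Amount2"]:
    # boolean row mask + column label: only the failing cells are overwritten
    df.loc[df[col] <= threshold, col] = 0

print(df)
```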
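You can use np.where:
import numpy as np
# <= rather than <, so that a value of exactly 2 is also set to 0, as in the desired output
df['Amount'] = np.where(df['Amount'] <= 2, 0, df['Amount'])
df['Amount2'] = np.where(df['Amount2'] <= 2, 0, df['Amount2'])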
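Or, if your dataframe contains only these numeric columns:
df = df.where(df > 2, 0)  # where() keeps values that satisfy the condition and replaces the rest with 0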
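When the dataframe also holds non-numeric columns such as `User`, comparing the whole frame against a number fails, so one option is to restrict `where` to the numeric columns. A minimal sketch, assuming every numeric column should be thresholded and that only values strictly greater than 2 are kept:

```python
import pandas as pd

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
                   "Amount": [10.0, 1.0, 8.0, 2, 7.5, 8.0],
                   "Amount2": [1, 5.0, 8.0, 10.5, 0, 8.0]})

# pick the numeric columns automatically (here: Amount and Amount2)
num_cols = df.select_dtypes(include="number").columns

# where() keeps a value when the condition is True, otherwise uses 0
df[num_cols] = df[num_cols].where(df[num_cols] > 2, 0)

print(df)
```

`DataFrame.mask` is the mirror image of `where`: `df[num_cols].mask(df[num_cols] <= 2, 0)` should give the same result.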