Keep only rows in pandas that have specific decimal values



I'm having a brain fart right now and can't remember how to filter numbers based on the decimals they end in.

Suppose my DataFrame is:

import pandas as pd

dic = {'product': ['Bread', 'Milk', 'Eggs', 'Water', 'OJ', 'Cereal', 'Coffee',
                   'Apples', 'Banana', 'Muffin'],
       'price': [3.89, 2.99, 4.00, 0.69, 1.99, 2.39, 5.00, 0.99, 0.98, 1.50]}
df = pd.DataFrame(dic)
print(df)

With the output:

product  price
0   Bread   3.89
1    Milk   2.99
2    Eggs   4.00
3   Water   0.69
4      OJ   1.99
5  Cereal   2.39
6  Coffee   5.00
7  Apples   0.99
8  Banana   0.98
9  Muffin   1.50

I only want to keep the prices that end in .99, .00, or .50.

My desired output is:

product  price
1    Milk   2.99
2    Eggs   4.00
4      OJ   1.99
6  Coffee   5.00
7  Apples   0.99
9  Muffin   1.50

I should know how to do this; I just can't remember it at the moment.

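If these are simple currency (dollar) amounts, you can convert the decimal part to an integer number of cents (this avoids floating-point comparisons, which can give incorrect answers) and then do an isin check: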

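# round() keeps a product that lands just below a whole number of cents from being truncated by astype(int)
df[df['price'].mul(100).round().mod(100).astype(int).isin([0, 50, 99])]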
product  price
1    Milk   2.99
2    Eggs   4.00
4      OJ   1.99
6  Coffee   5.00
7  Apples   0.99
9  Muffin   1.50

In my tests, this is the faster of the two options.
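
As a minimal sketch, the integer-cents idea can be wrapped in a small helper. The function name `filter_by_cents` and its parameters are hypothetical, not part of pandas. It assumes prices have at most two decimal places and no missing values. It rounds after scaling because a product such as 0.29 * 100 evaluates to 28.999999999999996, which a plain integer cast would truncate to 28.

```python
import pandas as pd


def filter_by_cents(df, column, endings):
    """Keep rows whose value in `column` ends in one of the given cent amounts.

    Hypothetical helper: assumes two-decimal prices and no NaN values.
    """
    # Scale to cents, round away floating-point noise, then cast to int.
    cents = df[column].mul(100).round().astype(int).mod(100)
    return df[cents.isin(endings)]


df = pd.DataFrame({
    'product': ['Bread', 'Milk', 'Eggs', 'Water', 'OJ', 'Cereal', 'Coffee',
                'Apples', 'Banana', 'Muffin'],
    'price': [3.89, 2.99, 4.00, 0.69, 1.99, 2.39, 5.00, 0.99, 0.98, 1.50],
})

result = filter_by_cents(df, 'price', [0, 50, 99])
print(result)
```

Passing the endings as integers (0, 50, 99) keeps the comparison entirely in integer arithmetic, which is the point of this approach.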


Another option, using np.isclose:

import numpy as np

df[np.logical_or.reduce([
    np.isclose(df['price'].mod(1), d) for d in [0, .99, .5]])]
product  price
1    Milk   2.99
2    Eggs   4.00
4      OJ   1.99
6  Coffee   5.00
7  Apples   0.99
9  Muffin   1.50
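
To illustrate why a tolerance is needed here, the following sketch compares exact equality with np.isclose on the fractional parts. The sample values and variable names are chosen for illustration only. In binary floating point, 2.99 % 1 evaluates to 0.9900000000000002 rather than 0.99, so an exact comparison misses that row.

```python
import numpy as np
import pandas as pd

prices = pd.Series([2.99, 4.00, 1.50, 0.98])
frac = prices.mod(1)  # fractional part of each price

# Exact equality: fails for 2.99 because 2.99 % 1 is not exactly 0.99.
exact = (frac == 0) | (frac == 0.99) | (frac == 0.5)

# Tolerant comparison: np.isclose accepts tiny floating-point differences.
tolerant = np.logical_or.reduce([np.isclose(frac, d) for d in [0, 0.99, 0.5]])

print(exact.tolist())     # [False, True, True, False]
print(tolerant.tolist())  # [True, True, True, False]
```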

You can also do this without pandas:

dic = {'product': ['Bread', 'Milk', 'Eggs', 'Water', 'OJ', 'Cereal', 'Coffee', 'Apples', 'Banana', 'Muffin'],
       'price': [3.89, 2.99, 4.00, 0.69, 1.99, 2.39, 5.00, 0.99, 0.98, 1.50]}

# Build new lists instead of removing items while iterating, which can skip elements.
# Keep whole-number prices and prices whose decimals print as '99' or '5' (1.5 prints as '1.5').
keep = [(product, price) for product, price in zip(dic['product'], dic['price'])
        if int(price) == price or str(price).split('.')[1] in ('99', '5')]
dic = {'product': [product for product, _ in keep],
       'price': [price for _, price in keep]}
print(dic)

Output:

{'product': ['Milk', 'Eggs', 'OJ', 'Coffee', 'Apples', 'Muffin'],
'price': [2.99, 4.0, 1.99, 5.0, 0.99, 1.5]}
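
The same string-based idea can be sketched in pandas. This variant is an illustration, not one of the answers above. It assumes two-decimal prices and formats each one with exactly two decimals, so 1.5 becomes '1.50' and 4.0 becomes '4.00', which lets the endings be listed the way they are written in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'product': ['Bread', 'Milk', 'Eggs', 'Water', 'OJ', 'Cereal', 'Coffee',
                'Apples', 'Banana', 'Muffin'],
    'price': [3.89, 2.99, 4.00, 0.69, 1.99, 2.39, 5.00, 0.99, 0.98, 1.50],
})

# Format each price as text with two decimals, then take the last two characters.
last_two = df['price'].map('{:.2f}'.format).str[-2:]

# Keep rows whose formatted price ends in 99, 00, or 50.
result = df[last_two.isin(['99', '00', '50'])]
print(result)
```

String formatting is likely slower than the numeric approaches on large frames, so this is mainly useful when readability matters more than speed.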
