Pandas-减去两列时忽略空白字符串



我正在尝试减去存储为字符串的两列(Price1&Price2(。

我有两个数据集,"data"有空字符串,"data2"在价格列中没有空字符串。代码在data2上运行良好,但我正试图让它在常规的"数据"集上运行。

下面的行当前不起作用。我试图让它只在Price1&Price2不是空白字符串。下面是带有示例日期的完整代码。

df['PriceDiff']     = (df['Price1'].astype(float) - df['Price2'].astype(float)).where((~df['Price1'].eq('')) & (~df['Price2'].eq('')), 'Check for Blank Strings in Price Values')
import pandas as pd
data = {
'Product': ['G_FIXED', 'G_FIXED', 'P_FIXED', 'P_FIXED', 'G_FIXED', 'G_FIXED', 'O_FIXED', 'O_FIXED', 'O_FIXED', 'O_FIXED', 'O_FIXED', 'O_FIXED' ],
'Price1': ['6.10', '5.70', '6.10', '', '5.10', '4.70', '5.10', '4.70', '10.10', '', '10.10', '3.70'],
'Price2': ['6.10', '5.70', '', '5.70', '', '4.70', '', '4.70', '9.10', '', '', '8.70']
}
data2 = {
'Product': ['G_FIXED', 'G_FIXED', 'P_FIXED', 'P_FIXED', 'G_FIXED', 'G_FIXED', 'O_FIXED', 'O_FIXED', 'O_FIXED', 'O_FIXED', 'O_FIXED', 'O_FIXED' ],
'Price1': ['6.10', '5.70', '6.10', '2.35', '5.10', '4.70', '5.10', '4.70', '10.10', '1.00', '10.10', '3.70'],
'Price2': ['6.10', '2.70', '6.10', '2.35', '8.10', '4.70', '5.10', '4.70', '13.10', '1.00', '10.10', '3.70'],
}
df = pd.DataFrame(data)
df2 = pd.DataFrame(data2)
df2['PriceDiff2']   = df2['Price1'].astype(float) - df2['Price2'].astype(float)
df2['PriceDiff2-2'] = (df2['Price1'].astype(float) - df2['Price2'].astype(float)).where((~df2['Price1'].eq('')) & (~df2['Price2'].eq('')), 'Check for Blank Strings in Price Values')
df['PriceDiff']     = (df['Price1'].astype(float) - df['Price2'].astype(float)).where((~df['Price1'].eq('')) & (~df['Price2'].eq('')), 'Check for Blank Strings in Price Values')
print(df)
print()
print(df2)

使用布尔掩码保留正确的行:

# Find rows where it's possible to compute the diff
m = df[['Price1', 'Price2']].ne('').all(axis=1)
# Compute the diff
df['PriceDiff'] = df.loc[m, ['Price1', 'Price2']].astype(float) 
.agg('diff', axis=1)['Price2'].reindex(df.index, fill_value='')

输出:

>>> df
Product Price1 Price2 PriceDiff
0   G_FIXED   6.10   6.10       0.0
1   G_FIXED   5.70   5.70       0.0
2   P_FIXED   6.10                 
3   P_FIXED          5.70          
4   G_FIXED   5.10                 
5   G_FIXED   4.70   4.70       0.0
6   O_FIXED   5.10                 
7   O_FIXED   4.70   4.70       0.0
8   O_FIXED  10.10   9.10      -1.0
9   O_FIXED                        
10  O_FIXED  10.10                 
11  O_FIXED   3.70   8.70       5.0

最新更新