Update the primary value only when the secondary value is not empty or NaN



I have the code below, and I want to make sure the first variable (df['Given Name']) is updated only when the second variable (df['Maiden Name']) is neither empty nor NaN.

df['Given Name'] = df['Given Name'] + ' ' + df['Maiden Name']

What is the shortest way to achieve this?

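Add the values only in rows that match a condition: test for non-missing values with Series.notna, test for non-empty strings with Series.ne(''), and chain the two tests with bitwise AND (&):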

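import numpy as np
import pandas as pd

df = pd.DataFrame({'Given Name': ['a', 'b', 'c'],
                   'Maiden Name': ['d', np.nan, '']})
# True only where Maiden Name is neither NaN nor an empty string
m = df['Maiden Name'].notna() & df['Maiden Name'].ne('')
print(m)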
0     True
1    False
2    False
Name: Maiden Name, dtype: bool
df.loc[m, 'Given Name'] += ' ' + df.loc[m, 'Maiden Name']
print(df)
  Given Name Maiden Name
0        a d           d
1          b         NaN
2          c            
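
For reuse, the masked update above can be wrapped in a small function. This is only a sketch: the name append_maiden_name is hypothetical, and it assumes the two column names from the question. It works on a copy so the caller's DataFrame is left untouched.

```python
import numpy as np
import pandas as pd


def append_maiden_name(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where Maiden Name is appended to Given Name
    only in rows whose Maiden Name is neither NaN nor ''."""
    out = df.copy()
    # Boolean mask: non-missing AND non-empty
    m = out['Maiden Name'].notna() & out['Maiden Name'].ne('')
    # Update only the masked rows; all other rows keep their value
    out.loc[m, 'Given Name'] = out.loc[m, 'Given Name'] + ' ' + out.loc[m, 'Maiden Name']
    return out


df = pd.DataFrame({'Given Name': ['a', 'b', 'c'],
                   'Maiden Name': ['d', np.nan, '']})
result = append_maiden_name(df)
print(result['Given Name'].tolist())  # ['a d', 'b', 'c']
```

Restricting both sides of the assignment to the masked rows means NaN never takes part in the string concatenation, so no row can turn into NaN by accident.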

One option (assuming Maiden Name has no surrounding whitespace, or that such whitespace does not matter):

df['Given Name'] = df['Given Name'] + (' ' + df['Maiden Name'].fillna('')).str.rstrip()

Use:

m = df['Maiden Name'].notna() & df['Maiden Name'].str.len().gt(0)
df['Given Name'] = np.where(m, df['Given Name'] + ' ' + df['Maiden Name'], df['Given Name'])
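
A self-contained sketch of the np.where approach follows, using the sample data from the first answer. As an assumption for this example, the mask is built by filling NaN with an empty string before measuring the length, so the mask contains plain booleans only; the names has_maiden and combined are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Given Name': ['a', 'b', 'c'],
                   'Maiden Name': ['d', np.nan, '']})

# True only where Maiden Name holds a non-empty string (NaN is treated as '')
has_maiden = df['Maiden Name'].fillna('').str.len().gt(0)

# Candidate value for every row; rows with NaN become NaN here,
# but np.where never selects them because the mask is False there
combined = df['Given Name'] + ' ' + df['Maiden Name']

# Pick the combined name where the mask is True, else keep the original
df['Given Name'] = np.where(has_maiden, combined, df['Given Name'])
print(df['Given Name'].tolist())  # ['a d', 'b', 'c']
```

Unlike the .loc update, np.where computes the concatenation for every row and then selects per row, which is simple to read but does a little extra work on rows that end up unchanged.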
