Pandas: replace values in one column with values from another column under a condition



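My DataFrame has two columns. If the text in the first column is a substring of the text in the second column, I want to replace the first column's value with the second column's value.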

Example:

Input: 
col1       col2
-----------------
text1      text1 and text2
some text  some other text
text 3     
text 4     this is text 4
Output:
col1                 col2
------------------------------
text1 and text2      text1 and text2
some text            some other text
text 3     
this is text 4       this is text 4

As you can see, rows 1 and 4 have been replaced, because in those rows the text in column 1 is a substring of column 2.

How can I do this in pandas?

Try df.apply with axis=1.

This iterates over each row and checks whether col1 is a substring of col2.
If it is, the lambda returns col2; otherwise it returns col1.

df['col1'] = df.apply(lambda row: row['col2'] if row['col1'] in row['col2'] else row['col1'], axis=1)

Full code:

import pandas as pd

df = pd.DataFrame({'col1': ['text1', 'some text', 'text 3', 'text 4'], 'col2': ['text1 and text2', 'some other text', '', 'this is text 4']})
# Write the result to a new column so it can be compared with the original col1
df['new_col1'] = df.apply(lambda row: row['col2'] if row['col1'] in row['col2'] else row['col1'], axis=1)
df

    col1        col2             new_col1
0   text1       text1 and text2  text1 and text2
1   some text   some other text  some text
2   text 3                       text 3
3   text 4      this is text 4   this is text 4
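
The row-wise check above assumes every value in col2 is a string. As a minimal sketch of why the NaN-safe variants below matter, the following shows the unguarded lambda failing on a missing value, next to a guarded helper. The names `replace_if_substring` and `df_nan` are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd


def replace_if_substring(row):
    """Return col2 when col1 is a substring of it; otherwise keep col1."""
    a, b = row['col1'], row['col2']
    # Only test containment when both values are strings; NaN is a float.
    if isinstance(a, str) and isinstance(b, str) and a in b:
        return b
    return a


df_nan = pd.DataFrame({
    'col1': ['text1', 'text 3'],
    'col2': ['text1 and text2', np.nan],
})

# The unguarded lambda raises TypeError on the NaN row,
# because `in` cannot be applied to a float.
try:
    df_nan.apply(
        lambda row: row['col2'] if row['col1'] in row['col2'] else row['col1'],
        axis=1,
    )
    unguarded_failed = False
except TypeError:
    unguarded_failed = True

# The guarded helper leaves the NaN row's col1 unchanged.
result = df_nan.apply(replace_if_substring, axis=1)
```

Here `result` keeps 'text 3' for the row whose col2 is missing and takes col2 for the row where the substring test succeeds.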

A NaN-safe pure-Python option using zip:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': {0: 'text1', 1: 'some text', 2: 'text 3 ', 3: 'text 4'},
    'col2': {0: 'text1 and text2', 1: 'some other text', 2: np.nan,
             3: 'this is text 4'}
})
# Take col2 only when it is a string that contains col1; NaN fails the isinstance check
df['col1'] = [b if isinstance(b, str) and a in b else a
              for a, b in zip(df['col1'], df['col2'])]

A NaN-safe pandas option using fillna + apply:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': {0: 'text1', 1: 'some text', 2: 'text 3 ', 3: 'text 4'},
    'col2': {0: 'text1 and text2', 1: 'some other text', 2: np.nan,
             3: 'this is text 4'}
})
# fillna('') turns NaN into an empty string so the `in` test does not raise
df['col1'] = df.fillna('').apply(
    lambda x: x['col2'] if x['col1'] in x['col2'] else x['col1'],
    axis=1
)

Another option using boolean indexing with isna + loc:

# Mask of rows where col2 is present; only those rows are tested and updated
m = ~df['col2'].isna()
df.loc[m, 'col1'] = df[m].apply(
    lambda x: x['col2'] if x['col1'] in x['col2'] else x['col1'],
    axis=1
)

Resulting df:

col1             col2
0  text1 and text2  text1 and text2
1        some text  some other text
2          text 3               NaN
3   this is text 4   this is text 4
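
The same result can also be reached by building the boolean condition first and then applying it with Series.mask, which replaces values where the condition is True. This is a sketch of an alternative rather than something taken from the answers above; the name `is_sub` is an assumption for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['text1', 'some text', 'text 3 ', 'text 4'],
    'col2': ['text1 and text2', 'some other text', np.nan, 'this is text 4'],
})

# True where col2 is a string that contains col1; NaN rows evaluate to False.
is_sub = pd.Series(
    [isinstance(b, str) and a in b for a, b in zip(df['col1'], df['col2'])],
    index=df.index,
)

# Replace col1 with col2 only on the rows flagged by the condition.
df['col1'] = df['col1'].mask(is_sub, df['col2'])
```

Keeping the condition as a separate Series makes it easy to inspect which rows will change before overwriting col1.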
