
Replace all occurrences but first of a repeating string in a column using pandas

Updated: 2023-09-20

I have a column in a pandas DataFrame in which a string sometimes repeats:


col1	col2
1	hello
2
3
4	morning
5
6

You can find the index labels of the rows containing "hello", then use pandas.DataFrame.loc to modify every occurrence except the first:

In [1]: import pandas as pd
In [2]: df = pd.DataFrame(data={'col1': [1, 2, 3, 4, 5, 6],
   ...:                         'col2': ['hello', 'bye', 'hello', 'morning', 'night', 'hello']})
In [3]: df
Out[3]: 
   col1     col2
0     1    hello
1     2      bye
2     3    hello
3     4  morning
4     5    night
5     6    hello
In [4]: hello_indices = df.index[df['col2'] == 'hello']
In [5]: hello_indices
Out[5]: Int64Index([0, 2, 5], dtype='int64')
In [6]: df.loc[hello_indices[1:],'col2'] = 'hello again'
In [7]: df
Out[7]: 
   col1         col2
0     1        hello
1     2          bye
2     3  hello again
3     4      morning
4     5        night
5     6  hello again
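
The same idea can be written as a plain script. This is only a sketch: the helper name replace_all_but_first is hypothetical, and it uses a boolean mask with cumsum() instead of slicing the index labels, which is a variation on the answer above rather than the answer's own code. The last lines additionally assume that you want to blank every repeated value, not just "hello", and use Series.duplicated() for that.

```python
import pandas as pd


def replace_all_but_first(df, column, target, replacement):
    """Return a copy of df in which every occurrence of `target` in `column`,
    except the first one, is replaced by `replacement`.
    (Hypothetical helper, not part of pandas.)"""
    out = df.copy()
    # True for the rows whose value equals the target string.
    is_target = out[column] == target
    # cumsum() numbers the matches 1, 2, 3, ...; keep only matches after the first.
    repeats = is_target & (is_target.cumsum() > 1)
    out.loc[repeats, column] = replacement
    return out


df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6],
                   'col2': ['hello', 'bye', 'hello', 'morning', 'night', 'hello']})

result = replace_all_but_first(df, 'col2', 'hello', 'hello again')
print(result)

# Assumed variant: blank out every repeated value, whatever the string is.
# duplicated() marks all occurrences after the first as True by default.
blanked = df.copy()
blanked.loc[blanked['col2'].duplicated(), 'col2'] = ''
print(blanked)
```

Because the mask is positional rather than label-based, this version should also behave sensibly when the DataFrame index contains duplicate labels, which the label-slicing approach does not handle.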

You can use:

df['col2'] = df['col2'].str.split(expand=True)[0]

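By default, split() splits on whitespace. With expand=True it returns a DataFrame with one column per piece instead of a Series of lists, so [0] selects the first piece of each string.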
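
To see what the split does, here is a small sketch. The sample strings are made up for illustration and assume the repetition happens inside a single cell, separated by whitespace.

```python
import pandas as pd

# Hypothetical sample data: each cell repeats the same word.
s = pd.Series(['hello hello', 'morning morning morning', 'bye'])

# expand=True spreads the pieces over columns 0, 1, 2, ...;
# shorter rows are padded with missing values.
parts = s.str.split(expand=True)
print(parts)

# Column 0 holds the first piece of every string.
first = parts[0]
print(first)
```

Note that this keeps only the first whitespace-separated word, so it would also truncate a cell such as "good morning" to "good".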
