How to replace a string that ends with . (period)

I am trying to remove the substring rs. from the strings in a column.

df['Purpose'] = df['Purpose'].str.replace('rs.','')
+-------+----------+--------+
| Input | Expected | Output |
+-------+----------+--------+
| rs.22 | 22       | 22     |
+-------+----------+--------+
| rs32  | rs32     | 2      |
+-------+----------+--------+

Test code:

import pandas as pd

x = pd.DataFrame(['rs.22', 'rs32'], columns=['Purpose'])
x['Purpose'] = x['Purpose'].str.replace('rs.','')
print('x mod', x)

This produces the following output:

x mod   Purpose
0      22
1       2

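PS: Extracting only the number with the regular expression [-+]?[.]?[\d]+(?:,\d\d\d)*[.]?\d*(?:[eE][-+]?\d+)? does not work either, because it cannot tell rs.3.5 apart from 3.5: for rs.3.5 it returns .3.5 instead of 3.5.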
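To see what that number-extraction pattern does on the two inputs, here is a small check using only Python's re module. The pattern is the one quoted in the PS with its \d escapes written out; the names NUMBER, with_prefix and without_prefix are made up for this illustration.

```python
import re

# Number-extraction pattern from the PS, with the \d escapes written out.
NUMBER = re.compile(r'[-+]?[.]?[\d]+(?:,\d\d\d)*[.]?\d*(?:[eE][-+]?\d+)?')

# The optional leading [.]? lets the match start at the period of "rs.",
# so the prefix's period ends up in the extracted text.
with_prefix = NUMBER.search('rs.3.5').group()
without_prefix = NUMBER.search('3.5').group()

print(repr(with_prefix), repr(without_prefix))
```

The two results differ only by the leading period, which is why this approach cannot be used to strip the rs. prefix cleanly.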

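By default, str.replace treats the pattern as a regular expression (this was the default in pandas before 2.0). You have two simple options to get around that. The preferred option, suggested by @101, is to turn regex off: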

df['Purpose'] = df['Purpose'].str.replace('rs.', '', regex=False)

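The other option is to escape the period so that it matches an actual period rather than any character. This is the option you had to use in pandas versions before 0.23.0, which is when the regex parameter was introduced:

df['Purpose'] = df['Purpose'].str.replace(r'rs\.', '')

Regex matching is generally slower than plain string comparison, so the first option should perform better.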
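As a quick sanity check, the two options can be run side by side on the sample data from the question. This is only a sketch: the names literal and escaped are illustrative, and regex=True is passed explicitly in the second call so the result does not depend on which default the installed pandas version uses.

```python
import pandas as pd

x = pd.DataFrame(['rs.22', 'rs32'], columns=['Purpose'])

# Option 1: literal matching, the pattern is not interpreted as a regex.
literal = x['Purpose'].str.replace('rs.', '', regex=False)

# Option 2: regex matching with the period escaped so it matches only '.'.
escaped = x['Purpose'].str.replace(r'rs\.', '', regex=True)

print(literal.tolist())
print(escaped.tolist())
```

Both calls strip the prefix from rs.22 and leave rs32 alone, which matches the Expected column in the question.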

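In a regex, the period '.' matches almost any character. To match a literal period, escape it with a preceding backslash:

x['Purpose'] = x['Purpose'].str.replace(r'rs\.','')

See the Regular Expression HOWTO: https://docs.python.org/3/howto/regex.html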
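The difference between the escaped and unescaped period can be seen without pandas at all. The following sketch uses only the standard re module; the variable names are chosen for this example.

```python
import re

# Unescaped: '.' matches any character except a newline, so 'rs3' is removed.
unescaped_result = re.sub('rs.', '', 'rs32')

# Escaped: '\.' matches only a literal period, so 'rs32' has nothing to remove.
escaped_no_match = re.sub(r'rs\.', '', 'rs32')

# Escaped pattern on the input that really contains 'rs.'.
escaped_match = re.sub(r'rs\.', '', 'rs.22')

print(unescaped_result, escaped_no_match, escaped_match)
```

The first result reproduces the unexpected 2 from the Output column in the question.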

Here is a version that works. Instead of str.replace, you can use pandas' own replace function:

>>> df
Input
0  rs.22
1  rs321
>>> df['Input'].replace(r"rs\.","",regex=True)
0       22
1    rs321
Name: Input, dtype: object
>>>
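One thing to keep in mind with Series.replace is that, unlike Series.str.replace, it only substitutes inside strings when regex=True is given; otherwise it replaces values that match the whole cell. A minimal sketch, with s, whole_value and substring as illustrative names:

```python
import pandas as pd

s = pd.Series(['rs.22', 'rs321'], name='Input')

# Without regex=True, Series.replace only swaps cells equal to 'rs.' exactly,
# so neither element changes here.
whole_value = s.replace('rs.', '')

# With regex=True and an escaped period, it substitutes inside each string.
substring = s.replace(r'rs\.', '', regex=True)

print(whole_value.tolist())
print(substring.tolist())
```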

Basically, the problem is that pandas.Series.str.replace() has regex=True by default, so it assumes the pattern passed in is a regular expression.

You can use:

x['Purpose'] = x['Purpose'].str.replace('rs.', '', regex=False)
