
Unable to remove accented special characters in a string despite using regex

Keywords: regex, python, python-3.x, string, special-characters
Updated: 2023-09-18

I have the following code:

import re
oldstr="HRÂ Director,Â LearningÂ"
newstr = re.sub(r'[-()"#/@;:<>{}`+=&~|.!?,^]', " ", oldstr)
print(newstr)

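The code above does not work: the accented characters are still there.

Current result: "HRÂ Director Â LearningÂ"

Expected result: "HR Director Learning"

How can I achieve this?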

Converting my comment to an answer so that future visitors can find the solution easily.

You can use:

import re
oldstr="HRÂ Director,Â LearningÂ"
newstr = re.sub(r'[^\x00-\x7f]+|[-()"#/@;:<>{}`+=&~|.!?,^]+', "", oldstr)
print(newstr)

Output:

HR Director Learning

[^\x00-\x7f] matches any non-ASCII character, that is, any character whose code point lies outside the range 0x00 to 0x7f.
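To see what that character class does on its own, here is a minimal sketch that applies it without the punctuation alternative. The names NON_ASCII, sample and ascii_only are only illustrative and do not appear in the question.

```python
import re

# One or more consecutive characters outside the ASCII range (code points 0-127).
NON_ASCII = re.compile(r"[^\x00-\x7f]+")

sample = "HRÂ Director,Â LearningÂ"

# List the non-ASCII runs that the class picks up.
matches = NON_ASCII.findall(sample)
print(matches)

# Remove them; ASCII punctuation such as the comma is left untouched.
ascii_only = NON_ASCII.sub("", sample)
print(ascii_only)
```

Because the comma is plain ASCII, it survives this step, which is why the full pattern above also includes the punctuation class.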

You can also use this approach:

def _removeNonAscii(s):
    return "".join(i for i in s if ord(i) < 128)

Here is how my code produces the output:

s = "HRÂ Director,Â LearningÂ"
def _removeNonAscii(s):
    return "".join(i for i in s if ord(i) < 128)
print(_removeNonAscii(s))

Output:

HR Director, Learning
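Note that this function only drops non-ASCII characters, so the comma is kept. If the punctuation should go as well, one possible sketch is to chain the filter with the punctuation class from the question. The helper name clean_text and the whitespace-collapsing step are my own assumptions, not part of the original code.

```python
import re

# Punctuation class copied from the question's pattern.
PUNCTUATION = re.compile(r'[-()"#/@;:<>{}`+=&~|.!?,^]+')

def remove_non_ascii(s):
    """Keep only characters whose code point is below 128."""
    return "".join(ch for ch in s if ord(ch) < 128)

def clean_text(s):  # hypothetical helper, for illustration only
    no_accents = remove_non_ascii(s)
    no_punct = PUNCTUATION.sub("", no_accents)
    # Collapse any runs of whitespace left behind and trim both ends.
    return " ".join(no_punct.split())

result = clean_text("HRÂ Director,Â LearningÂ")
print(result)
```

Splitting and re-joining on whitespace is optional; it just guards against double spaces when a removed character sat between two spaces.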
