Grouping in regular expressions with Python



I have a pandas Series that looks like this:

import pandas as pd

m = pd.Series([
    'expected != is --> found missing lices ## expected: 2.25 || is: 4.5 || expected: 3 || is: 2 ##',
    'expected != is --> found missing lices ## expected: 3.35 || is: 5.5 || expected: 3 || is: 3 ##',
    'expected != is --> found missing lices ## expected: 2.25 || is: 4.5 || expected: 3 || is: 2 ##',
])

What I want to do is replace each element of this Series with:

'expected != is --> found missing lices'

I am using:

m = m.replace('expected != is --> found missing lices ## expected: {[0-9]\d*(\.\d+)?} || is: {[0-9]\d*(\.\d+)?} || expected: {[0-9]\d*} || is: {[0-9]\d*} ##','expected != is --> found missing lices')

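However, I do not get the correct result. I am new to regular expressions, and I would be glad if someone could explain which part I defined incorrectly.

You can use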

m = m.replace(r'expected != is --> found missing lices ## expected: \d+(?:\.\d+)? \|\| is: \d+(?:\.\d+)? \|\| expected: \d+ \|\| is: \d+ ##', 'expected != is --> found missing lices', regex=True)

See the regex demo.
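As a quick sanity check, the same pattern can be tried with the standard-library re module on two of the sample strings. This is only a sketch: the names samples, PREFIX, NUM, PATTERN and cleaned are made up for illustration, and re.sub stands in for the pandas call, which applies the pattern in the same way when regex=True is passed.

```python
import re

# Two of the sample strings from the question.
samples = [
    'expected != is --> found missing lices ## expected: 2.25 || is: 4.5 || expected: 3 || is: 2 ##',
    'expected != is --> found missing lices ## expected: 3.35 || is: 5.5 || expected: 3 || is: 3 ##',
]

PREFIX = 'expected != is --> found missing lices'
NUM = r'\d+(?:\.\d+)?'  # digits with an optional decimal part

# Build the pattern piece by piece; every literal pipe is escaped as \|.
PATTERN = (
    re.escape(PREFIX)
    + r' ## expected: ' + NUM
    + r' \|\| is: ' + NUM
    + r' \|\| expected: \d+ \|\| is: \d+ ##'
)

# Replace the whole matched text with the bare prefix.
cleaned = [re.sub(PATTERN, PREFIX, s) for s in samples]
print(cleaned)
```

Building the pattern from named pieces is a design choice for readability; the single long raw string used in the answer is equivalent.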

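Notes:

  • Series.replace treats the pattern as a literal string unless regex=True is passed, so the original call never ran as a regex.
  • {...} is not a grouping construct in regex. Use (...) to group and capture, or (?:...) to group without capturing; in your case you simply do not need a group around each number.
  • The | character is special (it is the alternation operator), so it must be escaped as \| to match a literal pipe.
  • [0-9]\d* is basically \d+, one or more digits.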
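The points about braces, pipes and digit classes can be checked in isolation with Python's re module. The snippet below is a small illustrative sketch; the variable names are hypothetical and the test strings are invented for the demonstration.

```python
import re

# Braces: {3} is a quantifier, while braces that do not form a valid
# quantifier are matched as literal characters, never as a group.
quantifier_ok = re.fullmatch(r'a{3}', 'aaa') is not None
literal_braces = re.fullmatch(r'{\d+}', '{42}') is not None
braces_do_not_group = re.fullmatch(r'{\d+}', '42') is None

# Grouping: (...) captures, (?:...) groups without capturing.
captured = re.fullmatch(r'(\d+)(?:\.\d+)?', '2.25').groups()

# Pipe: unescaped, | splits the pattern into alternatives, so the
# literal text "a || b" is only matched once each pipe is escaped.
unescaped_pipe = re.fullmatch(r'a || b', 'a || b') is not None
escaped_pipe = re.fullmatch(r'a \|\| b', 'a || b') is not None

# Digits: [0-9]\d* and \d+ agree on these ASCII inputs.
digit_inputs = ['7', '2024', '', '3a']
same_result = all(
    bool(re.fullmatch(r'[0-9]\d*', s)) == bool(re.fullmatch(r'\d+', s))
    for s in digit_inputs
)

print(quantifier_ok, literal_braces, braces_do_not_group)
print(captured, unescaped_pipe, escaped_pipe, same_result)
```

One caveat on the last point: for Python str patterns, \d also matches non-ASCII Unicode digits while [0-9] does not, so the two forms are equivalent only for ASCII input such as the data in the question.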
