Replacing a variable number of groups

  • Keywords: variable, replace, python, regex


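I have the string "The dog has 12.345 bones". I want to match 12.345 and replace its . with XYZ, so that the string becomes "The dog has 12XYZ345 bones". The number can be any valid number that uses dots as thousands separators, such as 1, 456, 1.000 or 34.234.233. By contrast, 100.00 is invalid. How can I do this?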

For internet addresses I use

address_pattern = r"(www)\.([A-Za-z0-9]*)\.(de|com|org)"
re.sub(address_pattern, r"\1XYZ\2XYZ\3", text)

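But the problem is that the number can be arbitrarily long, so there is no fixed number of groups for me to reference in the replacement.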

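Use

import re
# 1 to 3 digits, then any number of ".ddd" groups, delimited by whitespace or the string boundaries
regex = r"(?<!\S)\d{1,3}(?:\.\d{3})*(?!\S)"
test_str = "The dog has 12.345 bones"
# The callable receives each match object and returns the match with its dots replaced
print(re.sub(regex, lambda m: m.group().replace('.', 'XYZ'), test_str))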

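Result: The dog has 12XYZ345 bones

See the Python proof. The dots inside each matched number are replaced by the callable lambda m: m.group().replace('.','XYZ'), which re.sub invokes once per match. Because the replacement operates on the whole match, no fixed number of capture groups is needed.

Explanation of the expression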

--------------------------------------------------------------------------------
(?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
  \S                       non-whitespace (all but \n, \r, \t, \f,
                           and " ")
--------------------------------------------------------------------------------
)                        end of look-behind
--------------------------------------------------------------------------------
\d{1,3}                  digits (0-9) (between 1 and 3 times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
(?:                      group, but do not capture (0 or more times
                         (matching the most amount possible)):
--------------------------------------------------------------------------------
  \.                       '.'
--------------------------------------------------------------------------------
  \d{3}                    digits (0-9) (3 times)
--------------------------------------------------------------------------------
)*                       end of grouping
--------------------------------------------------------------------------------
(?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
  \S                       non-whitespace (all but \n, \r, \t, \f,
                           and " ")
--------------------------------------------------------------------------------
)                        end of look-ahead
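
The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE. The following is only a sketch of that idea: the names NUMBER and replace_dots are hypothetical and do not appear in the answer above, but the pattern is the one just explained.

```python
import re

# Same pattern as explained above, written with re.VERBOSE so that
# each part carries its own comment. NUMBER is a hypothetical name.
NUMBER = re.compile(r"""
    (?<!\S)         # not preceded by a non-whitespace character
    \d{1,3}         # one to three leading digits
    (?:\.\d{3})*    # zero or more groups of a dot plus exactly three digits
    (?!\S)          # not followed by a non-whitespace character
""", re.VERBOSE)

def replace_dots(text, replacement="XYZ"):
    """Replace every dot inside a matched number with `replacement`."""
    return NUMBER.sub(lambda m: m.group().replace(".", replacement), text)

print(replace_dots("The dog has 12.345 bones"))
print(replace_dots("1 456 1.000 34.234.233 100.00"))
```

Since the dots are replaced with str.replace inside the callable, the replacement text is never interpreted as a regex template, so it may safely contain backslashes or other special characters.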

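If you want to replace . only where it is actually used as a thousands separator, you can use:

(?:(?<=\D)|^)\d{1,3}(?:\.\d{3})+(?=[^\d.]|$)

Python demo:

import re
txt='''
1
456
1.000
34.234.233
100.00
'''
# The number must start after a non-digit (or at line start), contain at least
# one ".ddd" group, and be followed by something other than a digit or dot
print(
re.sub(r'(?:(?<=\D)|^)\d{1,3}(?:\.\d{3})+(?=[^\d.]|$)', 
lambda m: m.group(0).replace('.', 'XYZ'), 
txt, flags=re.M)
)

Prints: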

1
456
1XYZ000
34XYZ234XYZ233
100.00
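
To make the difference between the two patterns concrete, the sketch below runs both on a sentence where the numbers are followed by punctuation rather than whitespace. The names STRICT, LOOSE and swap_dots and the sample sentence are assumptions made up for illustration, and the second pattern is written with the lookbehind form (?:(?<=\D)|^) used in the demo above.

```python
import re

# Pattern from the first answer: numbers must be delimited by whitespace.
STRICT = r"(?<!\S)\d{1,3}(?:\.\d{3})*(?!\S)"
# Pattern from the second answer: at least one ".ddd" group is required,
# and the number may be followed by any character except a digit or a dot.
LOOSE = r"(?:(?<=\D)|^)\d{1,3}(?:\.\d{3})+(?=[^\d.]|$)"

def swap_dots(pattern, text):
    """Replace the dots inside every match of `pattern` with XYZ."""
    return re.sub(pattern, lambda m: m.group(0).replace(".", "XYZ"),
                  text, flags=re.M)

sample = "Price: 1.000, total 34.234.233; ratio 100.00"

print(swap_dots(STRICT, sample))  # numbers touch punctuation, so nothing changes
print(swap_dots(LOOSE, sample))   # Price: 1XYZ000, total 34XYZ234XYZ233; ratio 100.00
```

Which pattern fits better depends on the input: the whitespace-delimited one is stricter about what surrounds a number, while the second tolerates adjacent punctuation such as commas and semicolons.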
