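I have a string that looks like this: "GAP-88 (R 07/17) (STOCK REORDER NUMBER)"
I want to extract everything from the "G" up to the final "7".
I have to do this in Python. The desired result should look like this:
"GAP88R0717"
The ending character can be 7, 9, or 4.
Using a regular expression
Code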
import re

def extract(text):
    '''
    Extract the substring from the leading G through the last 7, 9, or 4,
    then strip spaces, '(', '/', and '-'.
    '''
    pattern = r'^G.*[794]'  # starts with G, ends at the last 7, 9, or 4
    m = re.match(pattern, text)  # match from the start of the string
    if m:
        # Found the substring: remove spaces, '(', '/', and '-'
        # by replacing them with the empty string
        return re.sub(r'[ (/-]', '', m.group())
    return None  # no match
Usage
s = "GAP-88 (R 07/17) (STOCK REORDER NUMBER)"
print(extract(s)) # Output: GAP88R0717
Explanation
r'^G.*[794]' - the regex pattern
^G - matches strings that begin with G
.*[794] - greedily matches all characters up to the last 7, 9, or 4 in the string
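To illustrate what "greedily" means in the explanation above, here is a small sketch comparing the greedy pattern with a lazy variant. The lazy pattern `.*?` is not part of the original answer; it is shown only for contrast, on the same sample string.

```python
import re

s = "GAP-88 (R 07/17) (STOCK REORDER NUMBER)"

# Greedy: .* consumes as much as possible, then backtracks
# to the LAST 7, 9, or 4 in the string.
greedy = re.match(r'^G.*[794]', s).group()

# Lazy (for contrast only): .*? consumes as little as possible,
# so the match stops at the FIRST 7, 9, or 4.
lazy = re.match(r'^G.*?[794]', s).group()

print(greedy)  # GAP-88 (R 07/17
print(lazy)    # GAP-88 (R 07
```

The greedy form is the one that produces the desired "GAP88R0717" once the separators are removed.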
Follow-up question
Detecting the word that follows the text
Code
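def postfix(text):
    '''
    Extract the word after (STOCK REORDER NUMBER)
    '''
    # The parentheses are escaped so they match literally; \s matches one
    # whitespace character and (\w+) captures the word that follows
    pattern = r'\(STOCK REORDER NUMBER\)\s(\w+)'
    m = re.search(pattern, text)
    if m:
        return m.group(1)
    else:
        return ''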
Usage
s = "GAP-88 (R 07/17) (STOCK REORDER NUMBER) abc123"
print(postfix(s)) # Output: abc123
print(extract(s) + ' ' + postfix(s))
# Output: GAP88R0717 abc123
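If the marker text is not always "(STOCK REORDER NUMBER)", one possible variation is to let `re.escape` handle the parentheses. This is only a sketch: `word_after` and its `marker` parameter are hypothetical names, not part of the answer above, and `\s+` is used here to tolerate more than one space.

```python
import re

def word_after(text, marker):
    """Return the first word following `marker`, or '' if it is absent.

    Hypothetical helper for illustration only.
    """
    # re.escape makes the parentheses in the marker match literally
    # instead of being interpreted as a capture group.
    pattern = re.escape(marker) + r'\s+(\w+)'
    m = re.search(pattern, text)
    return m.group(1) if m else ''

s = "GAP-88 (R 07/17) (STOCK REORDER NUMBER) abc123"
print(word_after(s, "(STOCK REORDER NUMBER)"))  # abc123
```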
Regex is a good fit for this. Here is the code:
import re

def get_alnum(string, start_index, end_index):
    # Collect the alphanumeric characters between start_index and end_index, inclusive
    alnums = re.findall(r'[A-Za-z0-9]', string[start_index:end_index+1])
    return "".join(alnums)
Usage:
st = 'GAP-88 (R 07/17) (STOCK REORDER NUMBER)'
print(get_alnum(st, 0, 14))  # 0 is the index of 'G' and 14 is the index of the last '7' in the string
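The indices 0 and 14 are hard-coded for this particular string. A minimal sketch of how they might be located automatically, assuming the substring starts at the first "G" and ends at the last 7, 9, or 4 as the question states; `get_alnum_auto` and `end_chars` are hypothetical names, not part of the answer above.

```python
import re

def get_alnum_auto(string, end_chars="794"):
    """Hypothetical variant that finds the start and end indices itself."""
    start = string.find("G")
    # Last position of any allowed end character; rfind returns -1 if absent.
    end = max(string.rfind(c) for c in end_chars)
    if start == -1 or end < start:
        return ""
    # Keep only letters and digits from the selected slice.
    return "".join(re.findall(r'[A-Za-z0-9]', string[start:end + 1]))

st = 'GAP-88 (R 07/17) (STOCK REORDER NUMBER)'
print(get_alnum_auto(st))  # GAP88R0717
```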