How to extract an alphanumeric string from a sentence, given the start and end characters



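I have a string like this: "GAP-88 (R 07/17) (STOCK REORDER NUMBER)"

I want to extract everything from "G" up to "7".

I have to do this in Python. The required result should look like this:

"GAP88R0717"

The end character can be 7, 9, or 4.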

Using a regular expression

Code

import re

def extract(text):
    '''
    Extract the substring that starts with G and ends at the last 7, 9, or 4,
    then strip the separators from it.
    '''
    pattern = r'^G.*[794]'  # starts with G, ends in 7, 9, or 4

    # Match the pattern at the start of the text
    m = re.match(pattern, text)
    if m:
        # Found the substring:
        # remove spaces, '(', '/', and '-' by replacing them with an empty string
        m = re.sub(r'[ (/-]', '', m.group())
    return m  # None when the pattern does not match

Usage

s = "GAP-88 (R 07/17) (STOCK REORDER NUMBER)"
print(extract(s))   # Output: GAP88R0717

Explanation

r'^G.*[794]'   - the regex pattern to match
^G             - matches strings that begin with G
.*[794]        - greedily matches all characters up to the last 7, 9, or 4 in the string
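
To make the effect of the greedy `.*` concrete, here is a small sketch that compares it with the lazy form `.*?` on the same input. The variable names `greedy` and `lazy` are just for illustration.

```python
import re

s = "GAP-88 (R 07/17) (STOCK REORDER NUMBER)"

# Greedy: .* runs as far as possible, so the match ends at the LAST 7, 9, or 4
greedy = re.match(r'^G.*[794]', s).group()

# Lazy: .*? stops as early as possible, so the match ends at the FIRST 7, 9, or 4
lazy = re.match(r'^G.*?[794]', s).group()

print(greedy)  # GAP-88 (R 07/17
print(lazy)    # GAP-88 (R 07
```

The greedy form is what yields the full "07/17" here. It also means that a 7, 9, or 4 appearing later in the string would extend the match.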

Follow-up question

Detecting the word that follows the text

Code

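def postfix(text):
    '''
    Extract the word that follows (STOCK REORDER NUMBER)
    '''
    # Parentheses are escaped so they match literally;
    # \s matches the separating whitespace and (\w+) captures the word
    pattern = r'\(STOCK REORDER NUMBER\)\s(\w+)'
    m = re.search(pattern, text)
    if m:
        return m.group(1)
    else:
        return ''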

Usage

s = "GAP-88 (R 07/17) (STOCK REORDER NUMBER) abc123"
print(postfix(s)) # Output: abc123
print(extract(s) + ' ' + postfix(s))
# Output: GAP88R0717 abc123
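
The parentheses around STOCK REORDER NUMBER are regex metacharacters, so matching them literally requires escaping. As a rough sketch, `re.escape` can do the escaping when the marker text is passed in as a plain string. The names `word_after` and `MARKER` are hypothetical, not part of the answer above.

```python
import re

MARKER = "(STOCK REORDER NUMBER)"  # assumed marker text, taken from the example string

def word_after(text, marker=MARKER):
    """Return the first word following `marker`, or '' if there is none."""
    # re.escape turns the literal marker into a safe regex fragment
    pattern = re.escape(marker) + r'\s+(\w+)'
    m = re.search(pattern, text)
    return m.group(1) if m else ''

print(word_after("GAP-88 (R 07/17) (STOCK REORDER NUMBER) abc123"))  # abc123
```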

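Regex is a good fit for this. Here is the code:

import re

def get_alnum(string, start_index, end_index):
    # Collect every ASCII letter or digit in the slice (end index inclusive)
    alnums = re.findall(r'[A-Za-z0-9]', string[start_index:end_index+1])
    return "".join(alnums)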

Usage:

st = 'GAP-88 (R 07/17) (STOCK REORDER NUMBER)'
print(get_alnum(st, 0, 14))  # 0 is the index of 'G' and 14 is the index of the last '7' in the string
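
If the indices are not known in advance, they can be looked up from the start and end characters instead of being hard-coded. The following is only a sketch under that assumption; `extract_between` and its parameters are hypothetical names.

```python
import re

def extract_between(text, start_char="G", end_chars="794"):
    """Keep the letters and digits between the first `start_char`
    and the last occurrence of any character in `end_chars`."""
    start = text.find(start_char)
    if start == -1:
        return ""
    # rfind returns -1 for a missing character, so max picks the right-most hit
    end = max(text.rfind(c) for c in end_chars)
    if end < start:
        return ""
    # Drop everything that is not an ASCII letter or digit
    return re.sub(r'[^A-Za-z0-9]', '', text[start:end + 1])

st = 'GAP-88 (R 07/17) (STOCK REORDER NUMBER)'
print(extract_between(st))  # GAP88R0717
```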
