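I'm not good with regular expressions, so please help me define this method. I have a method that takes a string as its argument and returns a list: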
1. get_values("IN BETWEEN 30 AND 35") => [30,31,32,33,34,35]
2. get_values("(in between 35 and 40) and (in [56,57,58])") => [35,36,37,38,39,40,56,57,58]
3. get_values("(in between 30 and 35) and (IN BETWEEN 40 AND 45)") =>[30,31,32,33,34,35,40,41,42,43,44,45]
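These, and combinations of them, are the possible cases.
Here is one possible solution, although this should really be handled by parsing the SQL syntax properly with some kind of parser: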
    import re

    def get_values(sql):
        sql = sql.upper()
        # "BETWEEN a AND b": capture both bounds
        between_regex = r'(\d+)\s+AND\s+(\d+)'
        # + 1 so that the upper bound is included, as the question expects
        ranges = [range(int(a), int(b) + 1) for a, b in re.findall(between_regex, sql)]
        # "IN [x,y,z]": capture the comma-separated list inside the brackets
        in_regex = r'\[([^\]]*)\]'
        ranges += [[int(y) for y in x.split(',')] for x in re.findall(in_regex, sql)]
        # flatten the ranges and lists into a single list
        return [x for r in ranges for x in r]

    print(get_values("IN BETWEEN 30 AND 35"))
    print(get_values("(in between 35 and 40) and (in [56,57,58])"))
    print(get_values("(in between 30 and 35) and (IN BETWEEN 40 AND 45)"))
    # [30, 31, 32, 33, 34, 35]
    # [35, 36, 37, 38, 39, 40, 56, 57, 58]
    # [30, 31, 32, 33, 34, 35, 40, 41, 42, 43, 44, 45]
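The answer above suggests that a real parser would be the more robust route. As a rough illustration of that idea, here is a minimal sketch that tokenizes the string and walks the tokens instead of pattern-matching whole clauses. The names `TOKEN_RE` and `parse_values` are made up for this example, and it assumes non-negative integers and only the `BETWEEN ... AND ...` and `IN [...]` forms shown in the question; it is not a full SQL parser.

```python
import re

# Hypothetical token pattern: integers, the three keywords, and punctuation.
TOKEN_RE = re.compile(r"\d+|\b(?:between|and|in)\b|[\[\](),]", re.IGNORECASE)

def parse_values(text):
    """Walk the tokens left to right and expand each clause as it is seen."""
    tokens = [t.upper() for t in TOKEN_RE.findall(text)]
    values = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "BETWEEN":
            # Expect exactly: BETWEEN <low> AND <high>
            if i + 3 >= len(tokens) or tokens[i + 2] != "AND":
                raise ValueError("malformed BETWEEN clause")
            low, high = int(tokens[i + 1]), int(tokens[i + 3])
            values.extend(range(low, high + 1))  # upper bound included
            i += 4
        elif tokens[i] == "[":
            # Collect integers until the closing bracket.
            i += 1
            while i < len(tokens) and tokens[i] != "]":
                if tokens[i].isdigit():
                    values.append(int(tokens[i]))
                i += 1
            i += 1  # skip the closing "]"
        else:
            i += 1  # parentheses, IN, and connecting ANDs carry no values
    return values
```

Unlike the regex version, this raises an error on a malformed `BETWEEN` clause instead of silently skipping it, and it returns values in the order the clauses appear.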
Try this approach:
    >>> import re
    >>> s = 'in between 30 and 35'
    >>> p = re.findall(r'\d+', s)
    >>> p
    ['30', '35']
    >>> list(range(int(p[0]), int(p[1]) + 1))  # + 1 to include the upper bound
    [30, 31, 32, 33, 34, 35]
And so on...
Ask me if you need more explanation.
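You can use a regular expression to search the string for the pattern, where the lower and upper bounds are the first and second regex groups. After that, you just return the range. If you also want to include the upper bound, add 1 to upper.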
    import re

    def get_values(myString):
        res = []
        m = re.search(r'in between\s+(\d+)\s+and\s+(\d+)', myString, re.IGNORECASE)
        p = re.search(r'in \[([\d, ]*)\]', myString, re.IGNORECASE)
        if m is not None:
            lower = int(m.group(1))
            upper = int(m.group(2))  # add 1 here to include upper limit
            res.extend(range(lower, upper))
        if p is not None:
            res.extend([int(x) for x in p.group(1).split(',')])
        # using set here to kill duplicates (sorted for a stable order);
        # you can remove it if you want to preserve them
        return sorted(set(res))
Usage example:
    print(get_values("in between 10 and 20 and in [1, 2,3]"))
Returns:
    [1, 2, 3, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
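Note that `re.search` only finds the first match, so a string with two `between` clauses (the third case in the question) would lose the second range. One possible variation, sketched here under the assumption that values should come back in the order the clauses appear and with the upper bound included, is to scan for every clause with `re.finditer`. The names `CLAUSE_RE` and `get_all_values` are hypothetical and not part of the answer above.

```python
import re

# One pattern with two alternatives; the named groups show which clause matched.
CLAUSE_RE = re.compile(
    r"between\s+(?P<low>\d+)\s+and\s+(?P<high>\d+)"
    r"|in\s*\[(?P<items>[\d,\s]*)\]",
    re.IGNORECASE,
)

def get_all_values(text):
    result = []
    for match in CLAUSE_RE.finditer(text):
        items = match.group("items")
        if items is not None:
            # "in [a, b, c]": skip empty pieces so "in []" yields nothing
            result.extend(int(x) for x in items.split(",") if x.strip())
        else:
            low, high = int(match.group("low")), int(match.group("high"))
            result.extend(range(low, high + 1))  # upper bound included
    return result

print(get_all_values("(in between 30 and 35) and (IN BETWEEN 40 AND 45)"))
```

Duplicates are kept here; wrap the result in `sorted(set(...))` if you want the de-duplicated behaviour of the function above.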