I have this multi-line text:
1. fef w fwe fwe
fewfa 2. fwa f
fwefwfw gw
2 2f 23. f
g gegwg
32. gre34 g3 1. gr
egsg
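I want to use the number at the beginning of a line as the key (with . or a space as the separating character).
The result must therefore be: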
{
"1": "fef w fwe fwe fewfa 2. fwa f fwefwfw gw",
"2": "2f 23. f g gegwg",
"32": "gre34 g3 1. gr egsg"
}
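You can use this regex:
/^(\d+)\.?\s+(.*?)(?=(?:^\d+\.?)|\Z)/gms
^                   asserts the start of a line
(\d+)               captures one or more digits (the key)
\.?                 optional literal .
\s+                 one or more whitespace characters
(.*?)               lazily captures every character, including \n
(?=(?:^\d+\.?)|\Z)  lookahead for the next block start or the end of the input
gms                 flags: g for global, m for multiline, s so that the dot matches newlines too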
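The breakdown above can also be written into the pattern itself. The following is a minimal sketch using Python's re.VERBOSE flag; the name BLOCK_RE and the short sample string are made up for illustration and are not part of the original answer.

```python
import re

# The same pattern, spelled out with re.VERBOSE so each part carries its comment.
# BLOCK_RE is a hypothetical name chosen for this sketch.
BLOCK_RE = re.compile(
    r"""
    ^(\d+)            # start of a line, then the key: one or more digits (group 1)
    \.?               # optional literal dot after the key
    \s+               # one or more whitespace characters
    (.*?)             # block body, matched lazily; re.S lets the dot match newlines (group 2)
    (?=               # stop right before ...
        (?:^\d+\.?)   #   the next line that starts with a number
        | \Z          #   or the end of the string
    )
    """,
    re.M | re.S | re.X,
)

# Hypothetical two-block sample: "2. d" is not at a line start, so it stays in block 1.
sample = "1. a b\nc 2. d\n2 e f\n"
pairs = BLOCK_RE.findall(sample)
print(pairs)
```

Because the body is matched lazily, each block ends at the first following line that begins with digits, so a number in the middle of a line does not start a new block.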
Then you can create the dict like this (with the text stored in s):
>>> import re
>>> dict(re.findall(r'^(\d+)\.?\s+(.*?)(?=(?:^\d+\.?)|\Z)', s, re.M|re.S))
{'1': 'fef w fwe fwe\nfewfa 2. fwa f\nfwefwfw gw\n', '32': 'gre34 g3 1. gr\negsg', '2': '2f 23. f\ng gegwg\n'}
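The values returned by findall still contain the original line breaks, while the requested result joins each block with single spaces. One possible way to close that gap, sketched here as an assumption rather than as part of the original answer, is to collapse whitespace in each captured body; the variable names text, pattern and result are illustrative.

```python
import re

# The sample text from the question, without a trailing newline.
text = """1. fef w fwe fwe
fewfa 2. fwa f
fwefwfw gw
2 2f 23. f
g gegwg
32. gre34 g3 1. gr
egsg"""

pattern = r'^(\d+)\.?\s+(.*?)(?=(?:^\d+\.?)|\Z)'

# body.split() splits on any whitespace run (newlines included) and drops
# leading/trailing whitespace; " ".join(...) puts single spaces back.
result = {
    key: " ".join(body.split())
    for key, body in re.findall(pattern, text, re.M | re.S)
}
print(result)
```

Note that this also collapses any run of several spaces inside a line into one space, which may or may not be wanted for other inputs.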