Python - How to split on spaces when a string element itself contains spaces?



I have lines like these:

/home/Plugins/file1 e:222 k:dir (327/1)
/home/Plugins/file2 e:100 k:dir (326/1)

I want to extract the path and the element id from each line. That is easy:

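import logging

with open('output_file.txt', 'r') as output_file:
    for line in output_file:
        file_path = line.split()[0]
        eId = line.split()[1].split(":")[1]
        logging.info("file path:" + file_path)
        logging.info("eId:" + eId)

The problem is that the path (the first element) can itself contain spaces, because folder or file names on disk are often created with whitespace in them (a common case). So I also have lines like these:
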
/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)
/home/account validator e:227 k:dir (247/1)

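So the path is always the first element, but it sometimes contains spaces, and my script above fails on these examples. In the examples given:

AMS Provider (a subfolder name)

account validator (a file name at the end of the path)

Since the path can contain spaces, both in a subfolder name and in the file name at the end of the path, how can I still retrieve the file's path? How should I split the line?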

Note: unfortunately, I am restricted to Python 2.7 on the server. Thanks!

I would use a regex:

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility
import re

# Group 1: path (may contain spaces), group 2: e:<id>, group 3: k:<kind>,
# group 4: the counter inside the literal parentheses.
regex = r"(.*) (e:\d*) (k:.*) \((\d{3}/\d)\)$"

test_str = ("/home/Plugins/file1 e:222 k:dir (327/1)\n"
            "/home/Plugins/file2 e:100 k:dir (326/1)\n"
            "/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)\n"
            "/home/account validator e:227 k:dir (247/1)")

# re.MULTILINE makes $ match at the end of every line, not only at the end of the string
matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    print("Match {matchNum} was found at {start}-{end}: {match}".format(
        matchNum=matchNum, start=match.start(),
        end=match.end(), match=match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print("Group {groupNum} found at {start}-{end}: {group}".format(
            groupNum=groupNum, start=match.start(groupNum),
            end=match.end(groupNum), group=match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex
#       and u"" to prefix the test string and substitution.

Output:

Match 1 was found at 0-39: /home/Plugins/file1 e:222 k:dir (327/1)
Group 1 found at 0-19: /home/Plugins/file1
Group 2 found at 20-25: e:222
Group 3 found at 26-31: k:dir
Group 4 found at 33-38: 327/1
Match 2 was found at 40-79: /home/Plugins/file2 e:100 k:dir (326/1)
Group 1 found at 40-59: /home/Plugins/file2
Group 2 found at 60-65: e:100
Group 3 found at 66-71: k:dir
Group 4 found at 73-78: 326/1
Match 3 was found at 80-134: /home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)
Group 1 found at 80-114: /home/tools/AMS Provider/file3.txt
Group 2 found at 115-120: e:224
Group 3 found at 121-126: k:dir
Group 4 found at 128-133: 127/1
Match 4 was found at 135-178: /home/account validator e:227 k:dir (247/1)
Group 1 found at 135-158: /home/account validator
Group 2 found at 159-164: e:227
Group 3 found at 165-170: k:dir
Group 4 found at 172-177: 247/1

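To connect the regex back to the original task (one path and one element id per line), here is a minimal sketch using named groups. The names `LINE_RE` and `parse_line` are hypothetical, and the pattern assumes every line follows the `<path> e:<id> k:<kind> (<n>/<m>)` layout seen in the samples. It uses only syntax that is valid on both Python 2.7 and Python 3.

```python
import re

# Hypothetical pattern: assumes the layout "<path> e:<id> k:<kind> (<n>/<m>)"
# taken from the sample lines in the question.
LINE_RE = re.compile(
    r"^(?P<path>.+) e:(?P<eid>\d+) k:(?P<kind>\S+) \((?P<pos>\d+/\d+)\)\s*$"
)


def parse_line(line):
    """Return (path, element id) for one line, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group("path"), match.group("eid")


sample_lines = [
    "/home/Plugins/file1 e:222 k:dir (327/1)\n",
    "/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)\n",
    "/home/account validator e:227 k:dir (247/1)\n",
]

parsed = [parse_line(line) for line in sample_lines]
for item in parsed:
    print(item)
```

In the original loop, `parse_line(line)` would replace the two `line.split()` calls. Returning `None` for lines that do not match lets the caller decide whether to skip or log them.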

Since the file path is always followed by the element id, which starts with e:, we can use that marker to split the line, as shown below:

paths = ['/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)',
         '/home/account validator e:227 k:dir (247/1)',
         '/home/Plugins/file1 e:222 k:dir (327/1)',
         '/home/Plugins/file2 e:100 k:dir (326/1)']
for path in paths:
    # Everything before ' e:' is the file path; the id follows the marker
    filePath, eId = path.split(' e:')
    eId = eId.split(' ')[0]
    filePath = filePath.strip()
    print([filePath, eId])
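One caveat with splitting on `' e:'`: if a directory or file name itself happens to contain the text ` e:`, a plain `split` returns more than two pieces and the two-name unpacking raises `ValueError`. A possible variation, sketched here with a hypothetical helper `split_on_marker` and a made-up path, is to split only on the last occurrence with `rsplit`. It assumes the fields after the element id never contain ` e:`.

```python
def split_on_marker(line, marker=" e:"):
    """Split one line into (path, element id) at the LAST occurrence of the marker."""
    # maxsplit=1 from the right keeps any earlier " e:" inside the path intact
    path, rest = line.rsplit(marker, 1)
    eid = rest.split(" ", 1)[0]
    return path.strip(), eid


# Made-up example whose path itself contains " e:"
tricky = "/home/my e:drive/file e:9 k:dir (1/1)"
print(split_on_marker(tricky))
```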

Python 3:

txt = "/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)"
x = txt.rsplit(" ", 3)
print(x[0])

Result: /home/tools/AMS Provider/file3.txt

Python 2:

txt_list = ["/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)", "/home/account validator e:227 k:dir (247/1)"]
for x in txt_list:
    result = x.rsplit(" ", 3)
    print result[0]

Result:

/home/tools/AMS Provider/file3.txt
/home/account validator
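The same `rsplit` call can also yield the element id, since the three trailing fields come back as separate items. The sketch below uses a hypothetical helper `parse_fields` and assumes exactly three space-separated fields follow the path, as in the samples. It strips the trailing newline first, which matters when the lines come from iterating over a file, and it runs unchanged on Python 2.7 and Python 3.

```python
def parse_fields(line):
    """Return (path, element id), assuming three space-separated fields after the path."""
    # Splitting from the right at most 3 times leaves any spaces in the path untouched
    path, e_field, k_field, counter = line.rstrip("\n").rsplit(" ", 3)
    # e_field looks like "e:224"; keep only the part after the colon
    return path, e_field.split(":", 1)[1]


lines = [
    "/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)\n",
    "/home/account validator e:227 k:dir (247/1)\n",
]
results = [parse_fields(line) for line in lines]
for path, eid in results:
    print(path + " -> " + eid)
```

Compared with the regex, this does no validation: a line with fewer than three spaces makes the four-name unpacking raise `ValueError`, which the caller may want to catch.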
