Python split() does not work on a list to produce a list of lists

I am learning Python and tried the following experiment.

text = "this is line one . this is line two . this is line three ."

tokens = text.split(" ")            # split text into tokens on the "space" separator
lioftokens = tokens.split(".")      # split tokens into a list of token lists on the "dot" separator

print(tokens)                       # output = ['this', 'is', 'line', 'one', '.', 'this', 'is', 'line', 'two', '.', 'this', 'is', 'line', 'three', '.']
print(lioftokens)                   # expected output = [['this', 'is', 'line', 'one', '.'],
#                    ['this', 'is', 'line', 'two', '.'],
#                    ['this', 'is', 'line', 'three', '.']]

This raises an error instead of producing the expected output.

split() is a string method, not a list method. How can I solve this?

# IamNewToPython

Try a list comprehension:

text = "this is line one . this is line two . this is line three ."
print([line.rstrip().split() for line in text.split('.') if line])

Output:

[['this', 'is', 'line', 'one'], ['this', 'is', 'line', 'two'], ['this', 'is', 'line', 'three']]

If you want to keep the separator, try:

import re
text = "this is line one . this is line two . this is line three ."
# the dot is escaped so it matches a literal '.' rather than any character
print([line.rstrip().split() for line in re.split(r'([^.]*\.)', text) if line])

Output:

[['this', 'is', 'line', 'one', '.'], ['this', 'is', 'line', 'two', '.'], ['this', 'is', 'line', 'three', '.']]
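
As an alternative to re.split with a capturing group, re.findall can collect each sentence together with its closing dot directly. This is only a sketch, and the variable names `chunks` and `sentences` are illustrative rather than taken from the answer above:

```python
import re

text = "this is line one . this is line two . this is line three ."

# each match is a run of non-dot characters followed by one literal dot
chunks = re.findall(r'[^.]+\.', text)

# str.split() with no argument splits on whitespace and ignores leading/trailing spaces
sentences = [chunk.split() for chunk in chunks]
print(sentences)
```

Because findall returns only the matches, there are no empty strings to filter out afterwards.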

Edit:

If you want to split the list itself, try:

l = ['this', 'is', 'line', 'one', '.', 'this', 'is', 'line', 'two', '.', 'this', 'is', 'line', 'three', '.']
newl = [[]]
for i in l:
    newl[-1].append(i)
    if i == '.':
        newl.append([])
print(newl)

Output:

[['this', 'is', 'line', 'one', '.'], ['this', 'is', 'line', 'two', '.'], ['this', 'is', 'line', 'three', '.'], []]
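
The loop above leaves an empty list at the end because the input finishes with a separator. Here is a minimal sketch of one way to avoid that. It assumes a hypothetical helper called `split_tokens`, which is not part of the original answer:

```python
def split_tokens(tokens, sep="."):
    """Group tokens into sublists, closing a group after each `sep` token."""
    groups = []
    current = []
    for tok in tokens:
        current.append(tok)
        if tok == sep:
            # the separator ends the current group
            groups.append(current)
            current = []
    if current:
        # keep trailing tokens that were not followed by a separator
        groups.append(current)
    return groups

tokens = "this is line one . this is line two . this is line three .".split()
result = split_tokens(tokens)
print(result)
```

A new group is only appended once it has content, so no trailing empty list is produced.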

This works:

>>> text = "this is line one . this is line two . this is line three ."
>>> list(filter(None, map(str.split, text.split("."))))
[['this', 'is', 'line', 'one'],
['this', 'is', 'line', 'two'],
['this', 'is', 'line', 'three']]

You can simply split the text on . first, and then split each individual string in the resulting list with map and str.split.
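
To make the one-liner easier to follow, here is the same idea broken into steps. The intermediate names `pieces`, `word_lists` and `sentences` are illustrative only:

```python
text = "this is line one . this is line two . this is line three ."

# step 1: split on the dot; the last piece is '' because the text ends with '.'
pieces = text.split(".")

# step 2: str.split with no argument splits each piece on whitespace
word_lists = map(str.split, pieces)

# step 3: filter(None, ...) drops falsy items, here the empty list produced from ''
sentences = list(filter(None, word_lists))
print(sentences)
```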

text = "this is line one . this is line two . this is line three ."
# first split on the periods
sentences = text.split('.')
for s in sentences:
    # chop off trailing whitespace and then split on spaces
    print(s.rstrip().split())

Using the str.split() method:

text = "this is line one . this is line two . this is line three ."
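tokens = text.split()
# take fixed-size chunks of 5 tokens; this works only because every sentence here has exactly 5 tokens
print([tokens[i:i+5] for i in range(0, len(tokens), 5)])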
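
The slicing approach above relies on every sentence having exactly five tokens. When sentence lengths vary, one possible alternative is itertools.groupby. This is a sketch under that assumption, with a sample `text` made up for illustration. Note that it drops the "." tokens instead of keeping them:

```python
from itertools import groupby

# sample input with sentences of different lengths (illustrative only)
text = "this is one . this is line two . three ."
tokens = text.split()

# groupby yields consecutive runs of tokens sharing the same key;
# the key is True for "." tokens, and those runs are skipped
sentences = [
    list(group)
    for is_sep, group in groupby(tokens, key=lambda tok: tok == ".")
    if not is_sep
]
print(sentences)
```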
