How to split a nested list into chunks of n sublists



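I am trying to take chunks of 100 lists from a nested list. I have looked at several examples on Stack Overflow, but I still can't get it to work.

My main list is named data_to_insert and it contains other lists. I would like to pull out (chunk) 100 lists at a time from the main nested list.

How can I do this?

This is my current code, which does not work as needed.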

def divide_chunks(l, n):
    for i in range(0, len(l), n):
        yield l[i:i + n]

n = 100
x = list(divide_chunks(data_to_insert, n))

Example of the nested list:

data_to_insert = [['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'],
...
[thousands of other lists go here]]

The desired output is another list (sliced_data) that contains 100 of the lists from the nested list (data_to_insert).

sliced_data = [['item1','item2','item3','item4','item5','item6'],
['item1','item2','item3','item4','item5','item6'], 
...
[98 more lists go here]]

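I need to loop through the nested list data_to_insert until it is empty.

You can use random to select 100 random nested lists from the given list.

This will output 3 random nested lists from the original list: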

import random
l = [[1,2], [3,4], [1,1], [2,3], [3,5], [0,0]]
print(random.sample(l, 3))

# output,
[[3, 4], [1, 2], [2, 3]]

If you don't need the output as a list, replace print(random.sample(l, 3)) with print(*random.sample(l, 3))

# output,
[1, 2] [2, 3] [1, 1]

If you only want the first 100 nested lists, just do this:

print(l[:100])
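As a small sketch of how that slice behaves (the sample data here is made up for illustration), note that slicing never raises when the list holds fewer items than requested, so the last, shorter chunk needs no special handling:

```python
# Hypothetical sample data: 250 small sublists.
data = [[i, i + 1] for i in range(250)]

# The first 100 sublists.
first = data[:100]
print(len(first))          # 100

# Slicing past the end is safe: only the 50 remaining sublists come back.
tail = data[200:300]
print(len(tail))           # 50
```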

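If I understand your question correctly, you first need to flatten your list and then split it into chunks. Here is an example using chain.from_iterable from the itertools module together with the code you used to create the chunks: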

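from itertools import chain

def chunks(elm, length):
    # Yield consecutive slices of `elm`, each at most `length` items long
    for k in range(0, len(elm), length):
        yield elm[k: k + length]

my_list = [['item{}'.format(j) for j in range(7)]] * 1000
# Flatten the 1000 sublists of 7 items into one list of 7000 items
flattened = list(chain.from_iterable(my_list))
# Use a separate name so the chunks() function is not overwritten
chunked = list(chunks(flattened, 100))
print(len(chunked[10]))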

Output:

100
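Note that flattening produces chunks of 100 individual items, not 100 sublists. If the goal is to keep each sublist intact, as the question's desired output suggests, the same generator can be applied to the nested list directly. A minimal sketch, assuming made-up sample data of 250 sublists:

```python
def chunks(seq, size):
    # Yield consecutive slices of `seq`, each holding at most `size` elements.
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

# Hypothetical sample data: 250 sublists of 6 items each.
data = [['item{}'.format(j) for j in range(1, 7)] for _ in range(250)]

batches = list(chunks(data, 100))
print([len(b) for b in batches])  # [100, 100, 50]
print(batches[0][0])              # ['item1', 'item2', 'item3', 'item4', 'item5', 'item6']
```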

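After some time-consuming research, I developed a solution that works. The solution below loops through the list and extracts 100 sublists at a time.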

import gc

# Verifies that the list data_to_insert isn't empty
if len(data_to_insert) > 0:
    # Obtains the length of the data to insert.
    # The length is the number of sublists
    # contained in the main nested list.
    data_length = len(data_to_insert)
    # A loop counter used in the
    # data insert process.
    i = 0
    # The number of sublists to slice
    # from the main nested list in
    # each loop.
    n = 100
    # This loop executes a set of statements
    # as long as the condition below is true
    while i < data_length:
        # Increments the loop counter
        if len(data_to_insert) < n:
            i += len(data_to_insert)
        else:
            i += n
        # Slices n sublists from the main nested list.
        sliced_data = data_to_insert[:n]
        # Verifies that the list sliced_data isn't empty
        if len(sliced_data) > 0:
            # Removes n sublists from the main nested list.
            data_to_insert = data_to_insert[n:]
            ##################################
            # do something with the sliced_data
            ##################################
            # Clears the list used to store the
            # sliced_data in the insertion loop.
            sliced_data.clear()
            gc.collect()
    # Clears the list used to store the
    # data elements inserted into the
    # database.
    data_to_insert.clear()
    gc.collect()
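For comparison, here is a shorter sketch of the same "take 100, process, repeat until nothing is left" idea using itertools.islice. It reads from an iterator instead of re-slicing the remainder of the list on every pass. The names iter_batches and process are hypothetical placeholders, and the sample data is made up:

```python
from itertools import islice

def iter_batches(iterable, size):
    # Pull up to `size` elements at a time until the iterator is exhausted.
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def process(batch):
    # Hypothetical stand-in for "do something with the sliced_data".
    return len(batch)

# Hypothetical sample data: 250 sublists of 6 items each.
data_to_insert = [['item{}'.format(j) for j in range(1, 7)] for _ in range(250)]

sizes = [process(batch) for batch in iter_batches(data_to_insert, 100)]
print(sizes)  # [100, 100, 50]
```

Unlike the loop above, this leaves data_to_insert unchanged; call data_to_insert.clear() afterwards if the list really has to end up empty.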

I developed a second way to achieve my goal, based on Sufiyan Ghori's suggestion to use random.

import random

if len(my_nestled_list) > 0:
    # Obtains the length of the data to insert.
    # The length is the number of sublists
    # contained in the main nested list.
    data_length = len(my_nestled_list)
    # A loop counter used in the
    # data insert process.
    i = 0
    # The number of sublists to slice
    # from the main nested list in
    # each loop.
    n = 100
    # This loop executes a set of statements
    # as long as the condition below is true
    while i < data_length:
        # Increments the loop counter
        if len(my_nestled_list) < n:
            i += len(my_nestled_list)
        else:
            i += n
        # Uses a list comprehension to randomly select up to n lists
        # from the nested list, keeping them in their original order.
        # min() avoids a ValueError when fewer than n lists are available.
        sample_size = min(n, len(my_nestled_list))
        random_sample_of_100 = [my_nestled_list[idx] for idx in sorted(random.sample(range(len(my_nestled_list)), sample_size))]
        print(random_sample_of_100)
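One thing to be aware of: the loop above draws each sample from the full list, so the same sublist can appear in more than one sample. If every sublist should be handled exactly once, in random order, one option is to shuffle a copy and then chunk it. This is only a sketch; random_batches and its seed parameter are hypothetical names, and the sample data is made up:

```python
import random

def random_batches(items, size, seed=None):
    # Shuffle a copy so the caller's list is left untouched,
    # then yield consecutive slices of the shuffled copy.
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), size):
        yield shuffled[start:start + size]

# Hypothetical sample data: 250 distinct sublists.
data = [[i, i * 2] for i in range(250)]

batches = list(random_batches(data, 100, seed=0))
print([len(b) for b in batches])  # [100, 100, 50]
```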
