Finding overlapping elements in a Python list and shuffling them



I have the following list of tuples in Python 3.x, where each tuple consists of two integers in the format (start, end):

list_tuple = [(20, 35), (125, 145), (156, 178), (211, 233), (220, 321), 
                              (227, 234), (230, 231), (472, 498), (4765, 8971)] 
 ## list already sorted except for last tuple

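These tuples represent intervals on the real line, e.g. (1, 10) is the interval from 1 to 10.

I can sort this list of tuples in three ways: by the first element only, by the second element only, or by both the first and second elements.

Sorting by the first element only: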

sorted_by_first = sorted(list_tuple, key=lambda element: (element[0]) )  ## (first_element, second_element)

Output

print(sorted_by_first)
[(20, 35), (125, 145), (156, 178), (211, 233), (220, 321), (227, 234), (230, 231), (472, 498), (4765, 8971)]

And sorting by the second element:

sorted_by_second = sorted(list_tuple, key=lambda element: (element[1]) )

Output

print(sorted_by_second)
[(20, 35), (125, 145), (156, 178), (230, 231), (211, 233), (227, 234), (220, 321), (472, 498), (4765, 8971)]

And by both:

sorted_by_both = sorted(list_tuple, key=lambda element: (element[0], element[1]) )

Output

print(sorted_by_both)
[(20, 35), (125, 145), (156, 178), (211, 233), (220, 321), (227, 234), (230, 231), (472, 498), (4765, 8971), ...]

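Note that sorting by the second element gives a different order from the other two. The tuples whose positions differ are the "overlapping intervals"; for example, (230, 231) could be placed either before or after (227, 234), because these intervals overlap.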
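To see exactly which positions differ between two of the sort orders, one could compare them element by element. This is only a small sketch using the list from the question; the name `differing` is illustrative:

```python
list_tuple = [(20, 35), (125, 145), (156, 178), (211, 233), (220, 321),
              (227, 234), (230, 231), (472, 498), (4765, 8971)]

sorted_by_first = sorted(list_tuple, key=lambda t: t[0])
sorted_by_second = sorted(list_tuple, key=lambda t: t[1])

# Pairs (from first-element order, from second-element order) that disagree
differing = [(a, b) for a, b in zip(sorted_by_first, sorted_by_second) if a != b]
print(differing)
```

With this data, (227, 234) keeps the same position in both orderings even though it overlaps its neighbours, so a position-by-position comparison only hints at overlaps and is not a reliable way to detect them.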

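My goal is to create a function that (1) searches the sorted output for "overlapping intervals" and then (2) randomly shuffles them among one another.

I can think of a function that outputs all tuples overlapping a given tuple, e.g.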

def find_overlaps(input_tuple_list, search_interval):
    results = []
    for tup in input_tuple_list:
        if ((tup[0] >= search_interval[0] and tup[0] <= search_interval[1]) or (tup[1] >= search_interval[0] and tup[1] <= search_interval[1])):
            results.append(tup)
    return results

which works as follows

foo = (130, 150)
overlapping_foo = find_overlaps(list_tuple, foo)
print(overlapping_foo)
[(125, 145)]
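One caveat with the check above: it only tests whether an endpoint of `tup` falls inside `search_interval`, so a tuple that fully contains the search interval is missed. Below is a minimal sketch of a symmetric test, assuming closed intervals with start <= end. The names `overlaps` and `find_overlaps_symmetric` are illustrative and not from the original post:

```python
def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def find_overlaps_symmetric(input_tuple_list, search_interval):
    """Return every tuple in the list that overlaps search_interval."""
    return [tup for tup in input_tuple_list if overlaps(tup, search_interval)]

list_tuple = [(20, 35), (125, 145), (156, 178), (211, 233), (220, 321),
              (227, 234), (230, 231), (472, 498), (4765, 8971)]

print(find_overlaps_symmetric(list_tuple, (230, 231)))
# [(211, 233), (220, 321), (227, 234), (230, 231)]
```

For the same search interval (230, 231), the endpoint-based version would return only (230, 231) itself, because the other three tuples enclose it.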

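However, to achieve goal (1), I need to write a function that finds all overlapping tuples within list_tuple.

What I tried: I initially thought I could search the original list against itself, e.g.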

total_overlaps = []
for tupp in list_tuple:
    total_overlaps.append(find_overlaps(list_tuple, tupp))

This is obviously wrong, because every tuple matches itself, so the output contains the original tuples themselves.
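One way to keep each tuple from matching itself is to compare by index and skip the tuple's own position. This is a rough sketch, assuming closed intervals; the `overlaps` helper and the name `overlapping_members` are hypothetical, and the quadratic loop is only reasonable for short lists:

```python
def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def overlapping_members(intervals):
    """Return the intervals that overlap at least one *other* interval."""
    return [
        a for i, a in enumerate(intervals)
        if any(overlaps(a, b) for j, b in enumerate(intervals) if i != j)
    ]

list_tuple = [(20, 35), (125, 145), (156, 178), (211, 233), (220, 321),
              (227, 234), (230, 231), (472, 498), (4765, 8971)]

print(overlapping_members(list_tuple))
# [(211, 233), (220, 321), (227, 234), (230, 231)]
```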

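The bigger problem is that I can't see how to accomplish goal (2). I only want to shuffle/reorder tuples that overlap one another. Suppose I have a list of overlapping elements found in step (1):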

overlap_list = [(211, 233), (220, 321), (227, 234), (230, 231), (6491, 7000), (6800, 7200)]

The following list comprehension fails

from random import shuffle
reordered = [shuffle(tupp) for tupp in overlap_list]

giving

TypeError: 'tuple' object does not support item assignment
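The error arises because `random.shuffle` rearranges its argument in place (and returns `None`), and here it is handed each individual tuple, which is immutable. Shuffling the list that holds the tuples does work. A brief sketch, using only one group of mutually overlapping tuples:

```python
import random

overlap_group = [(211, 233), (220, 321), (227, 234), (230, 231)]

shuffled = overlap_group[:]   # copy, so the original order is preserved
random.shuffle(shuffled)      # shuffles the list in place, returns None

# Alternative that returns a new list without mutating its input
also_shuffled = random.sample(overlap_group, k=len(overlap_group))

print(shuffled)
print(also_shuffled)
```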

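It is also important that I don't intermix (211, 233) with (6491, 7000), since these are unrelated.

How do I find the overlapping intervals in a list of tuples, and then shuffle only those tuples that overlap one another?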

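I'm not sure I fully understand what you're asking for, but you can use the itertools recipe pairwise to pair up the elements and then use itertools.groupby() to group consecutive overlaps, i.e. to separate (211, 233) from (6491, 7000):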

import itertools as it
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = it.tee(iterable)
    next(b, None)
    return zip(a, b)
>>> overlap_list = [(211, 233), (220, 321), (227, 234), (230, 231), (6491, 7000), (6800, 7200)]
>>> [list(p) for k, p in it.groupby(pairwise(overlap_list), lambda x: x[0][0] < x[1][0] < x[0][1]) if k]
[[((211, 233), (220, 321)), ((220, 321), (227, 234)), ((227, 234), (230, 231))],
 [((6491, 7000), (6800, 7200))]]

You can then unpairwise these lists with:

def unpairwise(iterable):
    a, b = zip(*iterable)
    yield a[0]
    yield from b

So:

>>> [list(unpairwise(p)) for k, p in it.groupby(pairwise(overlap_list), lambda x: x[0][0] < x[1][0] < x[0][1]) if k]
[[(211, 233), (220, 321), (227, 234), (230, 231)], [(6491, 7000), (6800, 7200)]]
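One thing to keep in mind: the predicate above compares each interval only with its immediate predecessor. If a long early interval overlaps a later one while the interval between them is short, comparing adjacent pairs can split the group. A possible alternative, offered as a sketch and not as a replacement for the approach above, sorts by start and tracks the furthest end seen so far. `group_overlapping` is an illustrative name, closed intervals are assumed, and isolated intervals come back as single-element groups:

```python
def group_overlapping(intervals):
    """Group intervals into clusters of transitively overlapping intervals."""
    groups = []
    current_end = None  # furthest end point seen in the current group
    for iv in sorted(intervals):
        if groups and iv[0] <= current_end:
            # Starts before the current group ends: same cluster
            groups[-1].append(iv)
            current_end = max(current_end, iv[1])
        else:
            # Gap found: start a new cluster
            groups.append([iv])
            current_end = iv[1]
    return groups

overlap_list = [(211, 233), (220, 321), (227, 234), (230, 231), (6491, 7000), (6800, 7200)]
print(group_overlapping(overlap_list))
# [[(211, 233), (220, 321), (227, 234), (230, 231)], [(6491, 7000), (6800, 7200)]]
```

For instance, (300, 310) following (230, 231) would still land in the first group here, because it starts before 321, the furthest end seen so far.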

Extending @achampion's answer, it should be easy to shuffle the lists of overlapping elements to get what you want:

>>> import random
>>> overlaps = [[(211, 233), (220, 321), (227, 234), (230, 231)], [(6491, 7000), (6800, 7200)]]
>>> for x in overlaps: 
...     random.shuffle(x)
...
>>> overlaps
[[(227, 234), (230, 231), (220, 321), (211, 233)], [(6491, 7000), (6800, 7200)]]
>>> for x in overlaps:
...     random.shuffle(x)
...
>>> overlaps
[[(220, 321), (227, 234), (230, 231), (211, 233)], [(6491, 7000), (6800, 7200)]]
>>> for x in overlaps:
...     random.shuffle(x)
...
>>> overlaps
[[(227, 234), (211, 233), (220, 321), (230, 231)], [(6800, 7200), (6491, 7000)]]

Note that random.shuffle works in place.
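If a single flat list is wanted at the end, one option is to shuffle each group and then concatenate the groups in their original order, so that intervals from unrelated groups never swap places. This is a sketch only; the function name `shuffle_within_groups` and the optional `rng` parameter are assumptions made for illustration:

```python
import random

def shuffle_within_groups(groups, rng=random):
    """Return a flat list with each group's members shuffled, group order kept."""
    result = []
    for group in groups:
        # sample() returns a new shuffled list and leaves `group` untouched
        result.extend(rng.sample(group, k=len(group)))
    return result

overlaps = [[(211, 233), (220, 321), (227, 234), (230, 231)],
            [(6491, 7000), (6800, 7200)]]

# Passing a seeded random.Random instance makes the result reproducible
print(shuffle_within_groups(overlaps, rng=random.Random(0)))
```

Using `sample` here instead of `shuffle` keeps the input groups unchanged, which can be convenient when the same groups are shuffled repeatedly.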
