Numpy 连接 + 合并 1D 数组



我需要连接数组,但如果它们重叠,还需要将 A 的结尾与 B 的开头合并。

[1, 2, 4] + [2, 4, 5] -> [1, 2, 4, 5]
[1, 2, 4] + [2, 5, 4] -> [1, 2, 4, 2, 5, 4]
[1, 2, 4] + [1, 2, 4, 5] -> [1, 2, 4, 5]

注意:必须保留元素的顺序,[4, 5] 与 [5, 4] 不同。

注2:这个问题也可以这样理解:我们需要A的最短扩展,以便输出以B结尾。

当然,我可以遍历第二个数组并逐个元素进行比较,但我正在寻找一个不错的 Numpy 解决方案。

最初误解了这个问题。问题,来自我的理解:

Two item suffix of A matches 2 item prefix of B:
[1, 2, 4] +
[2, 4, 5] =>
[1, 2, 4, 5]
No suffix of A matches a prefix of B:
[1, 2, 4] + 
[2, 5, 4] -> 
[1, 2, 4, 2, 5, 4]

然后我们可以使用这个非常低效的函数:

def merge(A,B):
i = 0
m = 0
# Find largest suffix of A that matches the prefix of B with the same length
while i <= len(A):
if A[-i:] == B[:i] and i > m:
m = i
i += 1
return A + B[m:]

下面是使用NumPy的解决方案。这并不理想,因为它需要(可能不需要的(排序和迭代。排序和迭代都应该在一个相对较小的数组(甚至单个元素(上进行。

import numpy as np
def merge(left, right):
"""Concatenating two arrays, merging the overlapping end and start of
the left and right array"""
# We can limit the search to the maximum possible overlap between
# the arrays, which is the minimum of the two lengths
l = min(len(left), len(right))
# Find all indices in `right` where the element matches the last element of `left`.
# Need to sort, since the `nonzero` documentation doesn't
# explicitly state whether the returned indices follow the order
# as in `right`
# As long as there are few matches, sorting will not be a showstopper
# Need to reverse the sorted array, to start from the back of the
# right array, work towards the front, until there is a proper match
for i in np.sort(np.nonzero(right[:l] == left[-1])[0])[::-1]:
# Check if the subarrays are equal
if np.all(left[-i-1:] == right[:i+1]):
return np.concatenate([left, right[i+1:]])
# No match
return np.concatenate([left, right])

a = np.array([1, 2, 4])
b = np.array([2, 4, 5])
c = np.array([2, 5, 4])
d = np.array([1, 2, 4, 5])
e = np.array([1, 2, 4, 2])
f = np.array([2, 4, 2, 5])
print(merge(a, b))
print(merge(a, c))
print(merge(a, d))
print(merge(e, b))
print(merge(e, f))

这产生

[1 2 4 5]
[1 2 4 2 5 4]
[1 2 4 5]
[1 2 4 2 4 5]
[1 2 4 2 5]

我有一个O(n)的解决方案,尽管没有 Numpy:

def merge(a, b):
n_a = len(a)
n = min(n_a, len(b))
m = 0
for i in range(1, n + 1):
if b[n - i] == a[n_a - 1 - m]:
m += 1
else:
m = 0
return a + b[m:]

你可以这样做。

def concatenate(a,b):
ret = a.copy()
for element in b:
if not element in ret:
ret.append(element)
return ret

这使订单保持在 a + b 形式。

最新更新