我需要连接数组,但如果它们重叠,还需要将 A 的结尾与 B 的开头合并。
[1, 2, 4] + [2, 4, 5] -> [1, 2, 4, 5]
[1, 2, 4] + [2, 5, 4] -> [1, 2, 4, 2, 5, 4]
[1, 2, 4] + [1, 2, 4, 5] -> [1, 2, 4, 5]
注意:必须保留元素的顺序,[4, 5] 与 [5, 4] 不同。
注2:这个问题也可以这样理解:我们需要A的最短扩展,以便输出以B结尾。
当然,我可以遍历第二个数组并逐个元素进行比较,但我正在寻找一个不错的 Numpy 解决方案。
最初误解了这个问题。问题,来自我的理解:
Two item suffix of A matches 2 item prefix of B:
[1, 2, 4] +
[2, 4, 5] =>
[1, 2, 4, 5]
No suffix of A matches a prefix of B:
[1, 2, 4] +
[2, 5, 4] ->
[1, 2, 4, 2, 5, 4]
然后我们可以使用这个非常低效的函数:
def merge(A,B):
i = 0
m = 0
# Find largest suffix of A that matches the prefix of B with the same length
while i <= len(A):
if A[-i:] == B[:i] and i > m:
m = i
i += 1
return A + B[m:]
下面是使用NumPy的解决方案。这并不理想,因为它需要(可能不需要的(排序和迭代。排序和迭代都应该在一个相对较小的数组(甚至单个元素(上进行。
import numpy as np
def merge(left, right):
"""Concatenating two arrays, merging the overlapping end and start of
the left and right array"""
# We can limit the search to the maximum possible overlap between
# the arrays, which is the minimum of the two lengths
l = min(len(left), len(right))
# Find all indices in `right` where the element matches the last element of `left`.
# Need to sort, since the `nonzero` documentation doesn't
# explicitly state whether the returned indices follow the order
# as in `right`
# As long as there are few matches, sorting will not be a showstopper
# Need to reverse the sorted array, to start from the back of the
# right array, work towards the front, until there is a proper match
for i in np.sort(np.nonzero(right[:l] == left[-1])[0])[::-1]:
# Check if the subarrays are equal
if np.all(left[-i-1:] == right[:i+1]):
return np.concatenate([left, right[i+1:]])
# No match
return np.concatenate([left, right])
a = np.array([1, 2, 4])
b = np.array([2, 4, 5])
c = np.array([2, 5, 4])
d = np.array([1, 2, 4, 5])
e = np.array([1, 2, 4, 2])
f = np.array([2, 4, 2, 5])
print(merge(a, b))
print(merge(a, c))
print(merge(a, d))
print(merge(e, b))
print(merge(e, f))
这产生
[1 2 4 5]
[1 2 4 2 5 4]
[1 2 4 5]
[1 2 4 2 4 5]
[1 2 4 2 5]
我有一个O(n)
的解决方案,尽管没有 Numpy:
def merge(a, b):
n_a = len(a)
n = min(n_a, len(b))
m = 0
for i in range(1, n + 1):
if b[n - i] == a[n_a - 1 - m]:
m += 1
else:
m = 0
return a + b[m:]
你可以这样做。
def concatenate(a,b):
ret = a.copy()
for element in b:
if not element in ret:
ret.append(element)
return ret
这使订单保持在 a + b 形式。