Getting the intersecting (common) indices of two 2D (or 3D) NumPy arrays

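I have data from two different catalogs, and I want to match the two catalogs by coordinates. Catalog 1 contains x1, y1, z1, a1, b1, c1, etc. (about 5 million elements), and catalog 2 contains x2, y2, z2, a2, e2, m2, n2, etc. (about a million elements). I will extend this to (x, y, z) if necessary; for now I compare the 2D arrays to find the elements they have in common.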

co1 = np.vstack((x1,y1)).T
co2 = np.vstack((x2,y2)).T
idx1 = np.in1d(co1,co2)   # not working for 2D arrays
idx2 = np.in1d(co2,co1)
np.savetxt('combined_data.txt',np.c_[x1[idx1],y1[idx1],a1[idx1],e2[idx2],n2[idx2]],fmt='%1.4f   %1.4f   %1.4f   %1.4f   %1.4f')

For example, I have the following data set:

x1 = np.array([1,2,3,4,5])
y1 = np.array([5,4,3,2,1])
x2 = np.array([1,4,6,2,6,4,8,9,3])
y2 = np.array([5,1,5,3,6,2,8,3,3])
(1,5), (3,3), and (4,2) are the coordinates common to both catalogs. Therefore,
idx1 = [True, False, True, True, False], idx2 = [True, False, False, False, False, True, False, False, True].

The problem is that np.in1d is a 1D routine, so it cannot be applied to 2D or 3D arrays. Does anyone know of a NumPy routine that can do this?

Convert the two sets of arrays to pandas DataFrames:

df1 = pd.DataFrame({"x" : x1, "y" : y1}).reset_index()
df2 = pd.DataFrame({"x" : x2, "y" : y2}).reset_index()

Merge them:

result = pd.merge(df1, df2, left_on=["x","y"], right_on=["x","y"])
#   index_x  x  y  index_y
#0        0  1  5        0
#1        2  3  3        8
#2        3  4  2        5

Get the indices:

result[["index_x","index_y"]]
#   index_x  index_y
#0        0        0
#1        2        8
#2        3        5
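
The question asks for boolean masks (idx1, idx2) rather than index pairs, so here is a minimal sketch of one way to get from the merge result to those masks, using the sample data from the question. It also includes a NumPy-only cross-check built on np.unique(axis=0) and np.isin. The helper name common_row_masks is hypothetical and not part of NumPy or pandas, and both approaches assume the coordinates match exactly (floating-point coordinates may need rounding first).

```python
import numpy as np
import pandas as pd

x1 = np.array([1, 2, 3, 4, 5])
y1 = np.array([5, 4, 3, 2, 1])
x2 = np.array([1, 4, 6, 2, 6, 4, 8, 9, 3])
y2 = np.array([5, 1, 5, 3, 6, 2, 8, 3, 3])

# Same merge as above; the "index" column holds each row's position
# in its own catalog and is suffixed _x / _y after the merge.
df1 = pd.DataFrame({"x": x1, "y": y1}).reset_index()
df2 = pd.DataFrame({"x": x2, "y": y2}).reset_index()
result = pd.merge(df1, df2, on=["x", "y"])

# Boolean masks: True where a row's position appears in the merge result.
idx1 = np.isin(np.arange(len(x1)), result["index_x"].to_numpy())
idx2 = np.isin(np.arange(len(x2)), result["index_y"].to_numpy())


def common_row_masks(a, b):
    """Hypothetical helper: masks of the rows of a and b found in the other array."""
    stacked = np.concatenate((a, b))
    # Give every distinct row an integer id; identical rows share an id.
    _, inverse = np.unique(stacked, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # guard against shape differences across NumPy versions
    ids_a, ids_b = inverse[:len(a)], inverse[len(a):]
    # The ids are plain 1D integers, so the 1D membership test applies.
    return np.isin(ids_a, ids_b), np.isin(ids_b, ids_a)


co1 = np.column_stack((x1, y1))
co2 = np.column_stack((x2, y2))
np_idx1, np_idx2 = common_row_masks(co1, co2)

print(idx1)  # [ True False  True  True False]
print(idx2)  # [ True False False False False  True False False  True]
```

The masks can then be used the way the question intends, for example x1[idx1] or a1[idx1]. Note that a mask keeps each catalog's own row order, so if the matched rows of the two catalogs must line up pair by pair, indexing with result["index_x"] and result["index_y"] is likely the safer choice.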
