Retrieving bin data from a 2D histogram in numpy



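I have managed to assign about 200 points to bins using numpy.histogram2d(). What I can't figure out, however, is how to access the values stored in each bin.

Any idea how to do this?

From the numpy documentation: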

import numpy as np
xedges = [0, 1, 1.5, 3, 5]
yedges = [0, 2, 3, 4, 6]
x = np.random.normal(3, 1, 100)
y = np.random.normal(1, 1, 100)
H, xedges, yedges = np.histogram2d(x, y, bins=(xedges, yedges))

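H contains the 2D histogram values. If xedges has length m and yedges has length n, H will have shape (m-1, n-1).

You can also specify the number of bins in each dimension: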

x = np.random.normal(3, 1, 100)
y = np.random.normal(1, 1, 100)
H, xedges, yedges = np.histogram2d(x, y, bins=(5, 6))

H will then have the same shape as the one you supplied in the bins keyword: (5, 6).
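As a quick check of the shapes described above, here is a minimal sketch. The fixed random seed and the variable names H_edges and H_counts are my own choices for illustration, not part of the numpy example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption) so the run is repeatable
x = rng.normal(3, 1, 100)
y = rng.normal(1, 1, 100)

# Explicit edges: 5 edges per axis give 4 bins per axis
xedges = [0, 1, 1.5, 3, 5]
yedges = [0, 2, 3, 4, 6]
H_edges, _, _ = np.histogram2d(x, y, bins=(xedges, yedges))
print(H_edges.shape)  # (4, 4)

# Bin counts: 5 bins along x and 6 along y, so 6 and 7 edges come back
H_counts, xe, ye = np.histogram2d(x, y, bins=(5, 6))
print(H_counts.shape, len(xe), len(ye))  # (5, 6) 6 7

# H holds only the number of points per bin, not the points themselves
print(int(H_counts.sum()))  # 100
```

Note that H stores counts only, which is why the original points have to be recovered separately.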

I am currently facing the same challenge, and I haven't found a solution online or in the documentation.

So here is what I came up with:

import numpy as np
import matplotlib.pyplot as plt

# Say you have the following coordinate points:
data = np.array([[-73.589,  45.490],
                 [-73.591,  45.497],
                 [-73.592,  45.502],
                 [-73.574,  45.531],
                 [-73.552,  45.534],
                 [-73.570,  45.512]])
# The following variables set the range we want for the bins. I use
# values a bit wider than my max and min values for x and y
extenti = (-73.600, -73.540)
extentj = (45.480, 45.540)
# Run numpy's histogram2d function to return two variables we'll be using
# later: hist and edges
hist, *edges = np.histogram2d(data[:,0], data[:,1], bins=4, range=(extenti, extentj))
# You can visualize the histogram using matplotlib's own 2D histogram
# (pass the same range so the plotted bins match the ones computed above):
plt.hist2d(data[:,0], data[:,1], bins=4, range=(extenti, extentj))
# We'll use numpy's digitize now. According to numpy's documentation, numpy.digitize
# returns the indices of the bins to which each value in the input array belongs. However,
# I haven't yet managed to make it work directly for our 2D histogram problem.
# You might manage to, but for now, the following has been working well for me:
# Run np.digitize once along the x axis of our data, using edges[0].
# edges[0] contains the x-axis edges of the numpy.histogram2d call we
# made earlier. This gives the x-axis indices of the bins containing data points.
hitx = np.digitize(data[:, 0], edges[0])
# Now run it along the y axis, using edges[1]
hity = np.digitize(data[:, 1], edges[1])
# Now we put those together.
hitbins = list(zip(hitx, hity))
# And now we can associate our data points with the coordinates of the bin where
# each belongs
data_and_bins = list(zip(data, hitbins))

From there, we can select a bin by its coordinates and find the data points associated with it!

You can do something like this:

[item[0] for item in data_and_bins if item[1] == (1, 2)]

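where (1, 2) are the coordinates of the bin you want to retrieve data from. In our case there are two data points in that bin, and the line above will list them.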
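The same lookup can also be sketched with a boolean mask instead of a list comprehension. This is only an alternative under my own assumptions: the names ix, iy and points_in_bin are hypothetical, and I subtract 1 from the np.digitize result so the indices line up with the rows and columns of hist.

```python
import numpy as np

data = np.array([[-73.589, 45.490],
                 [-73.591, 45.497],
                 [-73.592, 45.502],
                 [-73.574, 45.531],
                 [-73.552, 45.534],
                 [-73.570, 45.512]])
extenti = (-73.600, -73.540)
extentj = (45.480, 45.540)

hist, xedges, yedges = np.histogram2d(
    data[:, 0], data[:, 1], bins=4, range=(extenti, extentj))

# np.digitize counts bins from 1, so subtract 1 to get 0-based indices
# that can be used directly on hist
ix = np.digitize(data[:, 0], xedges) - 1
iy = np.digitize(data[:, 1], yedges) - 1

# 0-based equivalent of the bin called (1, 2) above
i, j = 0, 1
points_in_bin = data[(ix == i) & (iy == j)]

print(points_in_bin)    # the two points that fall in this bin
print(int(hist[i, j]))  # 2, the count numpy stored for the same bin
```

Comparing len(points_in_bin) with hist[i, j] is a cheap way to confirm that both approaches agree on which bin is which.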

Keep in mind that np.digitize(), as we use it here, returns 0 or len(bins) for out-of-bounds values, which means the first bin has coordinates (1, 1) rather than (0, 0).
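To make the out-of-bounds behaviour concrete, here is a small 1D sketch with made-up edges and values. It also shows one detail worth checking in your own data: a value exactly on the last edge is counted in the last bin by np.histogram, while np.digitize reports it as len(bins). The clipping step is just one possible way to reconcile the two, not the only one.

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0, 3.0])               # 3 bins
values = np.array([-0.5, 0.0, 0.5, 2.9, 3.0, 3.5])   # made-up sample values

idx = np.digitize(values, edges)
print(idx)  # [0 1 1 3 4 4]: 0 means below the first edge, 4 == len(edges) means at or above the last

counts, _ = np.histogram(values, bins=edges)
print(counts)  # [2 0 2]: the value 3.0 is counted in the last bin

# Keep only values inside the histogram range, shift to 0-based indices,
# and clip so a value on the last edge lands in the last bin
in_range = (values >= edges[0]) & (values <= edges[-1])
zero_based = np.clip(idx[in_range] - 1, 0, len(edges) - 2)
print(zero_based)  # [0 0 2 2]
```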

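Also keep in mind whether you and numpy agree on which bin is the "first" one. I believe it counts from the bottom left to the top right, but I may be wrong about that.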
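One way to settle the question of bin ordering is to histogram a few made-up points whose positions are obvious. In the sketch below (the points and ranges are my own), all points share the lowest x bin and differ only in y, so the result shows which array axis follows which coordinate.

```python
import numpy as np

# Three points with the same small x and increasing y
x = np.array([0.5, 0.5, 0.5])
y = np.array([0.5, 1.5, 2.5])

H, xedges, yedges = np.histogram2d(x, y, bins=(2, 3), range=((0, 2), (0, 3)))
print(H)
# [[1. 1. 1.]
#  [0. 0. 0.]]
```

So H[i, j] uses i for the x bin and j for the y bin, both counted from the lowest edge. Printed as a matrix, x therefore runs down the rows rather than across. As far as I can tell from the numpy documentation, this is why its plotting example transposes H and passes origin='lower' to imshow to get the usual bottom-left orientation.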

Hope this helps you or anyone else who runs into this challenge.

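I also looked into this problem a lot. In particular, I tried to collect the information from the image, which is one of the outputs of matplotlib's hist2d, but that always failed. So I wrote this nested loop instead. I know it is brute force and far from an elegant solution, but it may still make someone's life easier at some point. Here it is: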

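# Assumes nbins, xedges, Array_x_axis, Width_t (x values) and Pprom_t
# (values on the other axis) are already defined
for bin_fl in range(nbins):
    fl_elm = []     # indices of the elements that fall in this x bin
    Pprom_elm = []  # values on the other axis for those elements
    for elm in range(len(Array_x_axis)):
        # A bin runs from xedges[bin_fl] up to xedges[bin_fl + 1]; testing only
        # the upper edge would also pick up every element from the lower bins
        if xedges[bin_fl] <= Width_t[elm] < xedges[bin_fl + 1]:
            fl_elm.append(elm)
    fl_elm = np.array(fl_elm)
    for elem in fl_elm:
        Pprom_elm.append(Pprom_t[elem])
    Pprom_elm = np.array(Pprom_elm)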

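So I first get the indices of the elements that fall in a given x bin, and then use those indices to look up the corresponding values on the other axis. Enjoy!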
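For comparison, the same two steps (find the elements in each x bin, then look up their values on the other axis) can be sketched without explicit loops over elements. The arrays widths and proms are made-up stand-ins for the Width_t and Pprom_t of the snippet above, and the edges are hypothetical.

```python
import numpy as np

widths = np.array([0.2, 1.7, 0.9, 2.4, 1.1])      # stand-in for Width_t
proms = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # stand-in for Pprom_t
xedges = np.array([0.0, 1.0, 2.0, 3.0])           # hypothetical bin edges
nbins = len(xedges) - 1

# 0-based bin index of every element along x
bin_of = np.digitize(widths, xedges) - 1

# For each bin, select the values on the other axis with a boolean mask
proms_per_bin = [proms[bin_of == b] for b in range(nbins)]

for b, vals in enumerate(proms_per_bin):
    print(b, vals)
# 0 [10. 30.]
# 1 [20. 50.]
# 2 [40.]
```

Unlike the loop version, this keeps the result for every bin rather than overwriting it on each pass.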

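I just tried this example from the matplotlib manual.

Note the line hist, xedges, yedges = np.histogram2d(x, y, bins=4)

The method returns three values. Of these, hist is a 2D array holding the count in each bin; it is the same array that is passed to imshow to plot this histogram.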
