NumPy array (matrix) with one axis indexed by strings



In NumPy you can create a matrix and use convenient slice notation:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(arr[2, :])    # third row
print(arr[1:2, 2])  # rows 1 up to (not including) 2, column 2

This extends to N dimensions.

But what if I want the same kind of array where one axis is indexed by strings instead of numbers? Indexing an element should then look like this:

print(arr["cylinder", :, :]) #prints all cylinders
print(arr["sphere", 4, 100]) #prints sphere of 4 radius, 100 bar
print(arr[:, 4, 100]) #prints every shape with 4 radius 100 bar

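I could write a dedicated function for each "combination" (all shapes, specific radius, specific pressure; all shapes, all radii, specific pressure; specific shape, specific diameter, specific pressure), but that is not feasible. So how can I create this?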

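At the moment everything is stored as a dictionary of dictionaries (especially since only some radius and pressure values are used). If the underlying storage could stay a dictionary of dictionaries but gain slicing/indexing operators, that would be golden!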


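Current code (yes, I do plan to look into kwargs to make the current code base better suited to adding new points); it is included only to head off "NP" questions: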

class all_measurements(object):
    def __init__(self):
        self.measurements = {}
    def add_measurement(self, measurement):
        shape = measurement.shape
        size = measurement.size
        pressure = measurement.pressure
        fname = measurement.filename
        if shape in self.measurements:
            shape_dict = self.measurements[shape]
        else:
            shape_dict = {}
            self.measurements[shape] = shape_dict
        if size in shape_dict:
            size_dict = shape_dict[size]
        else:
            size_dict ={}
            shape_dict[size] = size_dict
        if pressure in size_dict:
            pressure_dict = size_dict[pressure]
        else:
            pressure_dict = {}
            size_dict[pressure] = pressure_dict
        if fname in pressure_dict:
            print("adding same file twice!")
        pressure_dict[fname] = measurement
    def get_measurements(self, shape = None, size = None, pressure = None, fname = None):
        current_dict = self.measurements
        if shape is None:
            return current_dict
        if shape in current_dict:
            current_dict = current_dict[shape]
        else:
            return None
        if size is None:
            return current_dict
        if size in current_dict:
            current_dict = current_dict[size]
        else:
            return None
        if pressure is None:
            return current_dict
        if pressure in current_dict:
            current_dict = current_dict[pressure]
        else:
            return None
        if fname is None:
            return current_dict
        if fname in current_dict:
            return current_dict[fname]
        else:
            return None

I think you are looking for structured arrays (see the NumPy documentation on structured arrays).

Example:

>>> import numpy as np
>>> a = np.zeros(10, dtype={'names':['a','b','c'],'formats':['f4','f4','f4']})
# write some data in a
>>> a['a'] = np.arange(10)
>>> a['b'] = np.arange(10,20)
>>> a['c'] = np.arange(20,30)
>>> a
array([(0.0, 10.0, 20.0), 
       (1.0, 11.0, 21.0), 
       (2.0, 12.0, 22.0),
       (3.0, 13.0, 23.0), 
       (4.0, 14.0, 24.0), 
       (5.0, 15.0, 25.0),
       (6.0, 16.0, 26.0), 
       (7.0, 17.0, 27.0), 
       (8.0, 18.0, 28.0),
       (9.0, 19.0, 29.0)], 
  dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4')])
>>> a['a'][2:6]
array([ 2.,  3.,  4.,  5.], dtype=float32)
>>> a[4:8]
array([(4.0, 14.0, 24.0), 
       (5.0, 15.0, 25.0), 
       (6.0, 16.0, 26.0),
       (7.0, 17.0, 27.0)], 
  dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4')])
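
To connect this back to the question, here is a minimal sketch of a structured array that stores the label itself as a field and selects rows with boolean masks. The field names (`shape`, `radius`, `pressure`) and the sample values are assumptions made up for illustration. Note that this is a table of records rather than a true N-dimensional array, so selection is by mask, not by `arr["sphere", 4, 100]`.

```python
import numpy as np

# Hypothetical record layout: one row per measurement.
dtype = [("shape", "U10"), ("radius", "f4"), ("pressure", "f4")]
records = np.array(
    [
        ("cylinder", 4.0, 100.0),
        ("sphere", 4.0, 100.0),
        ("sphere", 2.0, 50.0),
    ],
    dtype=dtype,
)

# Select by label with a boolean mask instead of a string index.
spheres = records[records["shape"] == "sphere"]

# Combine conditions: every shape with radius 4 at 100 bar.
mask = (records["radius"] == 4.0) & (records["pressure"] == 100.0)
matching = records[mask]

print(spheres["radius"])
print(matching["shape"])
```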

The repeated use of a pattern like

    if shape in self.measurements:
        shape_dict = self.measurements[shape]
    else:
        shape_dict = {}
        self.measurements[shape] = shape_dict

suggests that you could benefit from collections.defaultdict.
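
As a rough sketch of that idea, a recursively defined defaultdict creates each nesting level on first access, so the if/else blocks disappear. The helper name `nested_dict` and the string values standing in for measurement objects are placeholders, not part of the original code.

```python
from collections import defaultdict


def nested_dict():
    """Return a dict that creates deeper levels on first access."""
    return defaultdict(nested_dict)


measurements = nested_dict()

# Placeholder strings stand in for real measurement objects.
measurements["round"][10][20.0]["test0"] = "measurement-0"
measurements["round"][1][20.0]["test2"] = "measurement-2"
measurements["square"][10][30.0]["test1"] = "measurement-1"

print(sorted(measurements))
print(measurements["round"][10][20.0]["test0"])
```

One caveat with this approach: merely looking up a missing key also creates an empty entry, so membership should be tested with `in` before reading.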

When I populate your all_measurements object with some measurements (using my own simple class),

A = all_measurements()
A.add_measurement(measurement('round',10,20.0,'test0'))
A.add_measurement(measurement('square',10,30.0,'test1'))
A.add_measurement(measurement('round',1,20.0,'test2'))
print(A.measurements)

I get a dictionary that looks like:

{'square': {10: {30.0: {'test1': measurement: square,10,30.0,test1}}},
 'round': {1: {20.0: {'test2': measurement: round,1,20.0,test2}}, 
           10: {20.0: {'test0': measurement: round,10,20.0,test0}}}}

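I don't see anything here that looks like a 3d array.

I suppose if there were a standard set of shapes, sizes and pressures, e.g.

shapes = ['round', 'square', 'flat']
sizes = [1,3,10,20]
pressures = [10.0, 20.0, 30.0]

you could construct a 3d array, e.g.

np.empty((3,4,3))

along with dictionaries or lists of tuples that map labels to indices, e.g.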

sizemap={1:0, 3:1, 10:2, 20:3}
sizelist=[(1,0),(3,1)...]

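But what would the values of this array be? measurement objects? Object dtype ndarrays are possible, but they usually have no advantage over nested lists or dictionaries.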
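
If each cell could hold a single number, the array-plus-label-maps idea might look like the sketch below. The class name `LabeledArray` is hypothetical, the float cell values are an assumption (the original stores measurement objects), and only the full slice `:` is handled.

```python
import numpy as np


class LabeledArray:
    """Hypothetical wrapper: a 3d array whose axes are addressed by label."""

    def __init__(self, shapes, sizes, pressures):
        self.labels = [list(shapes), list(sizes), list(pressures)]
        # One label -> position dictionary per axis.
        self.maps = [{lab: i for i, lab in enumerate(axis)} for axis in self.labels]
        # NaN marks cells that have not been filled yet.
        self.data = np.full([len(axis) for axis in self.labels], np.nan)

    def _to_index(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        index = []
        for axis, k in enumerate(key):
            if isinstance(k, slice):
                if k != slice(None):
                    raise NotImplementedError("only the full slice ':' is handled")
                index.append(k)
            else:
                index.append(self.maps[axis][k])  # translate label to position
        return tuple(index)

    def __getitem__(self, key):
        return self.data[self._to_index(key)]

    def __setitem__(self, key, value):
        self.data[self._to_index(key)] = value


arr = LabeledArray(["round", "square", "flat"], [1, 3, 10, 20], [10.0, 20.0, 30.0])
arr["round", 10, 20.0] = 1.5
arr["square", 10, 20.0] = 2.5

print(arr["round", 10, 20.0])   # one cell
print(arr[:, 10, 20.0])         # every shape at size 10, pressure 20.0
print(arr["round", :, :].shape)
```

This only works when the set of labels is fixed in advance; sparse or irregular data still fits the nested-dictionary storage better.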


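I tested your get_measurements. As it is currently structured, you have to select a shape, then a size within that shape, and so on. It cannot return all shapes that have a specific size value.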

This method lets me use indexing syntax, including slices, to pass parameters to get_measurements:

from pprint import pprint

def __getitem__(self, key):  # define this inside all_measurements
    print(key)
    if not isinstance(key, tuple):
        key = (key,)  # a single index arrives as a bare value, not a tuple
    key = list(key)
    for i, k in enumerate(key):
        if isinstance(k, slice):
            # code to interpret a slice goes here
            key[i] = None  # fall back, do nothing
    return self.get_measurements(*key)

pprint(A['round',10])
pprint(A[:,10])
pprint(A['round':'square', 10:30:10])

produces

('round', 10)
{20.0: {'test0': measurement: round,10,20.0,test0}}
(slice(None, None, None), 10)
{'round': ...}
(slice('round', 'square', None), slice(10, 30, 10))
{'round': ...}

You have to decide what objects like

slice('round','square', None)
slice(10, 30, 10)

mean in the context of your attributes.
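
As one possible interpretation, and only a sketch: treat `:` as a wildcard, a bounded slice as the half-open range start <= value < stop, and anything else as an exact match. The class `MeasurementStore` below is hypothetical; it keeps a flat dictionary keyed by (shape, size, pressure) instead of nested dictionaries so that any axis can be left open, and it ignores the slice step.

```python
class MeasurementStore:
    """Hypothetical flat store keyed by (shape, size, pressure)."""

    def __init__(self):
        self.items = {}

    def add(self, shape, size, pressure, value):
        self.items[(shape, size, pressure)] = value

    @staticmethod
    def _matches(selector, value):
        if isinstance(selector, slice):
            # Assumed meaning: start <= value < stop; a missing bound is open.
            # The step is ignored in this sketch.
            if selector.start is not None and value < selector.start:
                return False
            if selector.stop is not None and value >= selector.stop:
                return False
            return True
        return selector == value

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        # Pad missing trailing axes with ':' so A['round'] means A['round', :, :].
        key = key + (slice(None),) * (3 - len(key))
        return {
            k: v
            for k, v in self.items.items()
            if all(self._matches(s, part) for s, part in zip(key, k))
        }


A = MeasurementStore()
A.add("round", 10, 20.0, "test0")
A.add("square", 10, 30.0, "test1")
A.add("round", 1, 20.0, "test2")

print(A[:, 10])           # all shapes with size 10
print(A["round", 5:20])   # round, with 5 <= size < 20
```

With this convention a string slice such as `'round':'square'` falls back to lexicographic comparison, which may or may not be what you want for shape names.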
