How to build a numpy object array (where the objects contain another array)



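In another language, I like to use an object array that holds every object of a class, where each object can be accessed very efficiently through that array. I am trying to do the same with Python and numpy. Each object has several members of different types, including numpy arrays. So in the end I need an object array of all the objects that gives efficient access to any member, most importantly the member arrays.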

I tried something like this:

import numpy as np

class TestClass():
    objectarray = np.empty([10, 1], dtype=object)  ## static array holding all class objects
    def __init__(self, name, position):
        self.name = name
        self.position = position
        self.intmember = 5
        self.floatmember = 3.4
        self.arraymember = np.zeros([5, 5])  ## another array which is a member of the class
        TestClass.objectarray[position] = self

Then:

testobj1 = TestClass('test1',5)  ## create a new object and add it at position 5 into the object array

That appears to have worked:

TestClass.objectarray
array([[None],
       [None],
       [None],
       [None],
       [None],
       [<__main__.TestClass object at 0x000000EF214DC308>],
       [None],
       [None],
       [None],
       [None]], dtype=object)

But this does not work:

a= TestClass.objectarray[5]
a.intmember
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-40-dac52811af13> in <module>
1 a= TestClass.objectarray[5]
----> 2 a.intmember
AttributeError: 'numpy.ndarray' object has no attribute 'intmember'

What am I doing wrong? Keep in mind that this needs to be an efficient mechanism inside a large loop.

(P.S. I know I could use a list of objects, but in my tests iterating over lists was very slow. So I would like to use numpy arrays, ideally accelerated by numba.)

In [1]: class TestClass(): 
...:     objectarray=np.empty([10, 1], dtype=object)  ## static array holding all class objects
...:     def __init__(self,name,position): 
...:         self.name=name 
...:         self.position=position 
...:         self.intmember= 5 
...:         self.floatmember=3.4 
...:         self.arraymember= np.zeros([5, 5])  ## another array which is a member of the class
...:         TestClass.objectarray[position]=self 
...:                                                                                        
In [2]: testobj1 = TestClass('test1',5)  

By definition, testobj1 has the intmember attribute:

In [3]: testobj1                                                                               
Out[3]: <__main__.TestClass at 0x7fceba8acef0>
In [4]: testobj1.intmember                                                                     
Out[4]: 5

The object has also placed itself in the class array:

In [5]: TestClass.objectarray                                                                  
Out[5]: 
array([[None],
       [None],
       [None],
       [None],
       [None],
       [<__main__.TestClass object at 0x7fceba8acef0>],
       [None],
       [None],
       [None],
       [None]], dtype=object)

Since this is a 2d array, we use a 2d index to reference an element:

In [8]: TestClass.objectarray[5,0]                                                             
Out[8]: <__main__.TestClass at 0x7fceba8acef0>
In [9]: TestClass.objectarray[5,0].intmember                                                   
Out[9]: 5

Indexing with [5] only indexes the first dimension; the object is still embedded in an array:

In [10]: TestClass.objectarray[5]                                                              
Out[10]: array([<__main__.TestClass object at 0x7fceba8acef0>], dtype=object)
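The difference between the two indexing forms can be checked directly. The following is a minimal sketch; `Item` is a hypothetical stand-in for `TestClass`, reduced to one attribute for illustration.

```python
import numpy as np

class Item:
    """Hypothetical stand-in for TestClass, kept minimal for illustration."""
    def __init__(self, intmember):
        self.intmember = intmember

arr = np.empty((10, 1), dtype=object)   # 2-D object array; every slot starts as None
arr[5, 0] = Item(5)

row = arr[5]        # indexes only the first axis: still an ndarray, shape (1,)
elem = arr[5, 0]    # indexes both axes: the stored Python object itself

print(type(row).__name__, row.shape)   # ndarray (1,)
print(type(elem).__name__)             # Item
print(elem.intmember)                  # 5
print(arr[5][0].intmember)             # 5 (second index unwraps the 1-element row)
```

So the `AttributeError` in the question comes from asking the 1-element row array, not the object inside it, for `intmember`.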

I don't think creating a (10,1) array helps; a simple 1d array works just as well:

objectarray=np.empty([10], dtype=object)
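As a rough sketch of how the class from the question might look with a 1d registry (using the builtin `object` as the dtype, since the `np.object` alias was removed in recent NumPy releases), a single integer index then returns the stored object directly:

```python
import numpy as np

class TestClass:
    # 1-D registry shared by all instances; unused slots hold None
    objectarray = np.empty(10, dtype=object)

    def __init__(self, name, position):
        self.name = name
        self.position = position
        self.intmember = 5
        self.floatmember = 3.4
        self.arraymember = np.zeros((5, 5))   # array-valued member
        TestClass.objectarray[position] = self

testobj1 = TestClass('test1', 5)

a = TestClass.objectarray[5]     # scalar index on a 1-D array returns the object
print(a.intmember)               # 5
print(a.arraymember.shape)       # (5, 5)
```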

Or just a list:

In [12]: class TestClass(): 
...:     objectarray=[None]*10 
...:     def __init__(self,name,position): 
...:         self.name=name 
...:         self.position=position 
...:         self.intmember= 5 
...:         self.floatmember=3.4 
...:         self.arraymember= np.zeros([5, 5])  ## another array which is a member of the class
...:         TestClass.objectarray[position]=self 
...:                                                                                       
In [13]: testobj1 = TestClass('test1',5)                                                       
In [14]: testobj1                                                                              
Out[14]: <__main__.TestClass at 0x7fceac25f5c0>
In [15]: testobj1.objectarray                                                                  
Out[15]: 
[None,
 None,
 None,
 None,
 None,
 <__main__.TestClass at 0x7fceac25f5c0>,
 None,
 None,
 None,
 None]
In [16]: testobj1.objectarray[5]                                                               
Out[16]: <__main__.TestClass at 0x7fceac25f5c0>
In [17]: testobj1.objectarray[5].intmember                                                     
Out[17]: 5

Accessing an element in the list is faster than doing the same with the object array:

In [18]: timeit Out[5][5,0].intmember                                                          
149 ns ± 0.00964 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [19]: timeit Out[15][5].intmember                                                           
90.5 ns ± 0.0478 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

frompyfunc

I suggest np.frompyfunc as a convenient (if not fast) way to access or otherwise process object dtype arrays. For example:

A function that gets the intmember value, if it exists:

In [28]: def getval(item): 
...:     try: 
...:         return item.intmember 
...:     except AttributeError: 
...:         return None         

Applied to the object array:

In [29]: np.frompyfunc(getval,1,1)(Out[5])                                                     
Out[29]: 
array([[None],
       [None],
       [None],
       [None],
       [None],
       [5],
       [None],
       [None],
       [None],
       [None]], dtype=object)

Applied to the list:

In [30]: np.frompyfunc(getval,1,1)(Out[15])                                                    
Out[30]: 
array([None, None, None, None, None, 5, None, None, None, None],
      dtype=object)
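The same pattern can be generalized to any attribute. This is only a sketch: `attr_getter` and `Item` are hypothetical names introduced here, not part of numpy or of the code above.

```python
import numpy as np

class Item:
    """Hypothetical stand-in for TestClass."""
    def __init__(self, intmember):
        self.intmember = intmember

def attr_getter(name, default=None):
    """Hypothetical helper: wrap getattr in np.frompyfunc so it maps over object arrays."""
    return np.frompyfunc(lambda obj: getattr(obj, name, default), 1, 1)

arr = np.empty(4, dtype=object)   # slots start as None
arr[1] = Item(5)
arr[3] = Item(7)

get_int = attr_getter('intmember')
values = get_int(arr)             # result is again an object-dtype array

print(values.dtype)               # object
print(values.tolist())            # [None, 5, None, 7]
```

Because the result keeps the object dtype, it still needs an explicit conversion (for example after filtering out the `None` entries) before it can be used in numeric numpy operations.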

Timings:

In [31]: timeit np.frompyfunc(getval,1,1)(Out[15])                                             
14.6 µs ± 187 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [32]: timeit np.frompyfunc(getval,1,1)(Out[5])                                              
9.53 µs ± 54 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [33]: [getval(i) for i in Out[15]]                                                          
Out[33]: [None, None, None, None, None, 5, None, None, None, None]
In [34]: timeit [getval(i) for i in Out[15]]                                                   
6.53 µs ± 93.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The list comprehension on the list is the fastest.
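To repeat this comparison outside IPython, something like the following script could be used. It is a sketch under a few assumptions: `Item` is a hypothetical stand-in for `TestClass`, the object array is 1d, and the `frompyfunc` wrapper is built once rather than inside the timed statement, so the absolute numbers will not match the ones above and will vary by machine.

```python
import timeit
import numpy as np

class Item:
    """Hypothetical stand-in for TestClass."""
    def __init__(self, intmember):
        self.intmember = intmember

def getval(item):
    try:
        return item.intmember
    except AttributeError:
        return None

# Same layout as above: ten slots, one object at position 5
obj_list = [None] * 10
obj_list[5] = Item(5)
obj_arr = np.empty(10, dtype=object)
obj_arr[5] = obj_list[5]

via_ufunc = np.frompyfunc(getval, 1, 1)   # built once, outside the timed code

candidates = {
    'frompyfunc on array': lambda: via_ufunc(obj_arr),
    'frompyfunc on list': lambda: via_ufunc(obj_list),
    'list comprehension': lambda: [getval(i) for i in obj_list],
}

n = 20_000
for label, fn in candidates.items():
    seconds = timeit.timeit(fn, number=n)
    print(f'{label}: {seconds / n * 1e6:.2f} µs per call')
```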
