How to wrap a C++ class/struct with vector attributes for fluent use in Python



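I am wrapping a C++ API for Python, so I want the Python wrapper classes to behave as closely as possible to the C++ classes. In one case, I have two objects that are actually nested structs: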

// myheader.hpp
#include <vector>
namespace mynames{
struct Data{
    struct Piece{
        double piece1;
        int piece2;
        std::vector<double> piece3;
    };
    std::vector<Piece> pieces;
};
}

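I want to interact with this object fluently from Python, as if it were a typical Python class, using numpy and extension types. So I started by declaring the two types: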

# mydeclarations.pxd
from libcpp.vector cimport vector
# Piece is declared first so that Data can refer to it.
cdef extern from "myheader.hpp" namespace "mynames::Data":
    cdef cppclass Piece:
        double piece1
        int piece2
        vector[double] piece3
cdef extern from "myheader.hpp" namespace "mynames":
    cdef cppclass Data:
        vector[Piece] pieces

And then wrapped them for Python:

# mytypes.pyx
cimport mydeclarations as cpp
from cython.operator cimport dereference as deref  # used by the Data.pieces setter
from libcpp.vector cimport vector
import numpy as np
cdef class Piece:
    cdef cpp.Piece *_cppPiece
    def __cinit__(self):
        self._cppPiece = new cpp.Piece()
    def __dealloc__(self):
        if self._cppPiece is not NULL:
            del self._cppPiece
    @property
    def piece1(self):
        return self._cppPiece.piece1
    @piece1.setter
    def piece1(self, double d):
        self._cppPiece.piece1 = d
    @property
    def piece2(self):
        return self._cppPiece.piece2
    @piece2.setter
    def piece2(self, int i):
        self._cppPiece.piece2 = i
    # Use cython's automatic type conversion: (cpp)vector <---> (py)list (COPIES)
    @property
    def piece3(self):
        return np.asarray(self._cppPiece.piece3, dtype=np.double)
    @piece3.setter
    def piece3(self, arr):
        self._cppPiece.piece3 = <vector[double]>np.asarray(arr, dtype=np.double)
#----------------------------------------------------------------------
cdef class Data:
    cdef cpp.Data *_cppData
    def __cinit__(self):
        self._cppData = new cpp.Data()
    def __dealloc__(self):
        if self._cppData is not NULL:
            del self._cppData
    @property
    def pieces(self):
        # Create a list of Python objects that hold copies of the C++ data
        cdef Piece pyPiece
        cdef int i
        pyPieces = []
        for i in range(self._cppData.pieces.size()):
            pyPiece = Piece()
            pyPiece._cppPiece[0] = self._cppData.pieces.at(i)
            pyPieces.append(pyPiece)
        return np.asarray(pyPieces, dtype=object)  # object array of Piece wrappers
    @pieces.setter
    def pieces(self, arr):
        # Clear the existing vector and create a new one containing copies of the data in arr
        cdef Piece pyPiece
        self._cppData.pieces.clear()
        for pyPiece in arr:
            self._cppData.pieces.push_back(deref(pyPiece._cppPiece))

This is a straightforward implementation and, as far as I can tell, it works, but there are some problems:

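  1. Because we work with copies, there is none of the in-place behavior you might expect if Piece.piece3 were an ordinary Python class attribute holding a numpy array. For example: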
a = Piece()
a.piece3 = [1,2,3]
a.piece3[0] = 55 # No effect, need to do b = a.piece3; b[0]=55; a.piece3=b
  2. There is a lot of iterating over and copying of data, which may be a problem when Data.pieces is very large.
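
The first problem can be reproduced without any C++ at all. The following is a minimal pure-Python sketch, assuming a hypothetical CopyingPiece class in which a plain list stands in for the C++ std::vector<double>; it is not the Cython wrapper itself, only a model of its copy-on-access behavior.

```python
class CopyingPiece:
    """Hypothetical pure-Python stand-in for the Cython Piece wrapper."""

    def __init__(self):
        self._storage = []  # plays the role of the C++ std::vector<double>

    @property
    def piece3(self):
        # Mimics the vector -> Python conversion: a fresh copy on every access.
        return list(self._storage)

    @piece3.setter
    def piece3(self, values):
        # Mimics the Python -> vector conversion: the input is copied in.
        self._storage = [float(v) for v in values]


a = CopyingPiece()
a.piece3 = [1, 2, 3]
a.piece3[0] = 55       # modifies a temporary copy that is then discarded
unchanged = a.piece3

b = a.piece3           # the workaround: copy out, modify, copy back in
b[0] = 55
a.piece3 = b
updated = a.piece3
```

Every read of piece3 hands back a new object, so indexing into the result never reaches the stored data.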

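Can anyone suggest better alternatives that address these problems? Although Data is more complicated than Piece, I think the two are related and ultimately come down to wrapping C++ classes with vector attributes for use in Python.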

If you want to avoid copying the data, you will probably need to create a wrapper class.

cdef class DoubleVector:
    cdef vector[double] *vec
    cdef object owner  # Python object that keeps the underlying C++ data alive
    def __dealloc__(self):
        # Only free the vector if no other object owns it
        if self.owner is None:
            del self.vec
    @staticmethod
    cdef DoubleVector create_from_existing_Piece(Piece obj):
        cdef DoubleVector out = DoubleVector()
        out.owner = obj
        out.vec = &obj._cppPiece.piece3
        return out
    # create __len__/__getitem__/__setitem__ functions
    # You could also have this class expose the buffer protocol

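Here I have assumed that, most of the time, DoubleVector does not own its own data. It therefore keeps a reference to the Python object that wraps the C++ class owning the data (which keeps that object alive for as long as the view exists). Some details (mainly creating a nice sequence interface) are left for you to fill in.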
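
As a rough illustration of the sequence interface that is left open above, here is a pure-Python sketch. The names PieceStandIn and DoubleVectorView are hypothetical, and an array.array of doubles stands in for the C++ vector; a Cython version would index through the stored pointer instead.

```python
from array import array


class DoubleVectorView:
    """Non-owning view: holds a reference to the owner, never a copy of its data."""

    def __init__(self, owner):
        self._owner = owner  # keeps the owning object alive

    def __len__(self):
        return len(self._owner._vec)

    def __getitem__(self, i):
        return self._owner._vec[i]

    def __setitem__(self, i, value):
        self._owner._vec[i] = value  # writes straight into the owner's storage


class PieceStandIn:
    """Hypothetical stand-in for Piece; array('d') plays std::vector<double>."""

    def __init__(self, values=()):
        self._vec = array("d", values)

    @property
    def piece3(self):
        return DoubleVectorView(self)  # a view, not a copy


p = PieceStandIn([1, 2, 3])
p.piece3[0] = 55        # now takes effect in place
first = p.piece3[0]
```

With __len__ and __getitem__ defined, the view can also be iterated and passed to list(), which is usually enough for it to feel like a normal Python sequence.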


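Exposing vector[Piece] is harder, mainly because any change to the vector that reallocates or shifts its elements (including resizing it) invalidates pointers into the vector. I would therefore seriously consider giving Python a different interface from the C++ one.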

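  • Can you make Data immutable (so that it cannot be changed from Python, which makes it safe to hand out pointers into it)?
  • Can you avoid returning Piece objects from Data and instead provide functions such as get_ith_piece1, get_ith_piece2 and get_ith_piece3 (that is, remove one layer from the Python wrapping)?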

Alternatively, you could do something like this:

cdef class BasePiece:
    cdef cpp.Piece* get_piece(self) except NULL:
        raise NotImplementedError
    # most of the implementation of your Piece class goes here
cdef class Piece(BasePiece):
    # wrapper that owns its own data.
    # largely as before but with
    cdef cpp.Piece* get_piece(self) except NULL:
        return self._cppPiece
    # ...
cdef class UnownedPiece(BasePiece):
    cdef Data d        # keeps the owning Data wrapper alive
    cdef size_t index  # position in d's vector, looked up on every access
    def __cinit__(self, Data d, size_t index):
        self.d = d
        self.index = index
    cdef cpp.Piece* get_piece(self) except NULL:
        return &self.d._cppData.pieces[self.index]

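This is at least safe if the contents of the vector change (it does not point at an existing Piece, only at an index position). You obviously need to be careful about changes in size.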
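
The index-based idea can be sketched in pure Python as follows. DataStandIn and UnownedPieceStandIn are hypothetical names, and a list of dicts stands in for std::vector<Piece>. One difference to keep in mind: Python raises IndexError for a stale index, whereas operator[] on a C++ vector does no bounds checking, so a Cython version would need an explicit check (or .at(), which throws std::out_of_range).

```python
class DataStandIn:
    """Hypothetical stand-in for Data; a list of dicts plays std::vector<Piece>."""

    def __init__(self, piece1_values):
        self._pieces = [{"piece1": float(v)} for v in piece1_values]

    @property
    def pieces(self):
        # A tuple of proxies, so callers are not tempted to append to it.
        return tuple(UnownedPieceStandIn(self, i) for i in range(len(self._pieces)))


class UnownedPieceStandIn:
    """Stores (data, index) rather than a pointer to one element."""

    def __init__(self, data, index):
        self._data = data    # keeps the owning object alive
        self._index = index

    def _get(self):
        # Looked up on every access; raises IndexError if the container shrank.
        return self._data._pieces[self._index]

    @property
    def piece1(self):
        return self._get()["piece1"]

    @piece1.setter
    def piece1(self, value):
        self._get()["piece1"] = float(value)


data = DataStandIn([1, 2])
first, second = data.pieces
first.piece1 = 10                     # writes through to the container
data._pieces[0] = {"piece1": 99.0}    # element replaced: the proxy sees the new one
seen_after_replace = first.piece1
data._pieces.pop()                    # container shrinks: index 1 no longer exists
try:
    second.piece1
    stale_access_failed = False
except IndexError:
    stale_access_failed = True
```

The proxy survives replacement of the element it refers to, but a shrinking container still leaves stale proxies behind, which is the size caveat mentioned above.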

The getter for Data.pieces might then look something like this:

@property
def pieces(self):
    cdef size_t i
    l = []
    # read the size from the wrapped C++ vector, not from this property
    for i in range(self._cppData.pieces.size()):
        l.append(UnownedPiece(self, i))
    return tuple(l)  # convert to tuple so it's immutable and people
        # won't be tempted to try to append to it.

There are obviously many other approaches you could take, but this one lets you build a reasonably nice interface.

The main point: restrict the Python interface as much as you can.
