At what point can I pass an array back to my Rust program to free its memory?



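I'm having trouble working out at what point I can pass the BNG_FFIArray returned by my Rust program back to it, so that the memory allocated for it is freed.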

My ctypes setup is as follows:

class BNG_FFITuple(Structure):
    _fields_ = [("a", c_uint32),
                ("b", c_uint32)]
class BNG_FFIArray(Structure):
    _fields_ = [("data", c_void_p),
                ("len", c_size_t)]
    # Allow implicit conversions from a sequence of numbers
    # (stored as 32-bit floats by default; see data_type below).
    @classmethod
    def from_param(cls, seq):
        return seq if isinstance(seq, cls) else cls(seq)
    def __init__(self, seq, data_type = c_float):
        array_type = data_type * len(seq)
        raw_seq = array_type(*seq)
        self.data = cast(raw_seq, c_void_p)
        self.len = len(seq)
# A conversion function that cleans up the result value to make it
# nicer to consume.
def bng_void_array_to_tuple_list(array, _func, _args):
    res = cast(array.data, POINTER(BNG_FFITuple * array.len))[0]
    return res
convert_bng = lib.convert_vec_c
convert_bng.argtypes = (BNG_FFIArray, BNG_FFIArray)
convert_bng.restype = BNG_FFIArray
convert_bng.errcheck = bng_void_array_to_tuple_list
# this is the FFI function I'd like to call. It takes a BNG_FFIArray as its argument
drop_array = lib.drop_array 
drop_array.argtypes = (BNG_FFIArray,)

def convertbng(lons, lats):
    """ just a wrapper """
    return [(i.a, i.b) for i in iter(convert_bng(lons, lats))]
# pass values into the FFI rust function
convertbng([-0.32824866], [51.44533267])

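This all works correctly, but I'm not sure at what point I should send the data originally allocated by the call to lib.convert_to_bng back across the FFI boundary, by calling drop_array, so that its associated memory is freed.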

Here are my Rust struct and functions.

#[repr(C)]
pub struct Array {
    data: *const c_void,
    len: libc::size_t,
}
#[no_mangle]
pub extern "C" fn drop_array(arr: Array) {
    unsafe { Vec::from_raw_parts(arr.data as *mut u8, arr.len, arr.len) };
}
impl Array {
    unsafe fn as_f32_slice(&self) -> &[f32] {
        assert!(!self.data.is_null());
        slice::from_raw_parts(self.data as *const f32, self.len as usize)
    }
    unsafe fn as_i32_slice(&self) -> &[i32] {
        assert!(!self.data.is_null());
        slice::from_raw_parts(self.data as *const i32, self.len as usize)
    }
    fn from_vec<T>(mut vec: Vec<T>) -> Array {
        // Important to make length and capacity match
        // A better solution is to track both length and capacity
        vec.shrink_to_fit();
        let array = Array {
            data: vec.as_ptr() as *const libc::c_void,
            len: vec.len() as libc::size_t,
        };
        // Leak the memory, and now the raw pointer is the owner
        mem::forget(vec);
        array
    }
}

#[no_mangle]
pub extern "C" fn convert_vec_c(lon: Array, lat: Array) -> Array {
    // we're receiving floats
    let lon = unsafe { lon.as_f32_slice() };
    let lat = unsafe { lat.as_f32_slice() };
    // copy values and combine
    let orig = lon.iter()
                  .cloned()
                  .zip(lat.iter()
                          .cloned());
    // carry out the conversion
    let result = orig.map(|elem| convert_bng(elem.0 as f64, elem.1 as f64));
    // convert back to vector of unsigned integer Tuples
    let nvec = result.map(|ints| {
                         IntTuple {
                             a: ints.0 as u32,
                             b: ints.1 as u32,
                         }
                     })
                     .collect();
    Array::from_vec(nvec)
}

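There are two ways to manage resources in Python, both of which involve creating an object:

  • one with a finalizer, the __del__ method
  • or one that acts as a context manager for the with statement

In both cases a manager object controls/provides access to the resource and runs any necessary cleanup code when the object is no longer needed. For this case I think the first approach works best, but I'll demonstrate both.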

For my examples I'll use this Rust code, where Data is a stand-in for whatever resource needs managing (e.g. the Array type):

// ffi_example.rs
#![crate_type = "dylib"]
pub struct Data {
    x: i32
}
#[no_mangle]
pub extern fn data_create(x: i32) -> *mut Data {
    let p = Box::into_raw(Box::new(Data { x: x }));
    // print the address too, so it can be matched against the Python side
    println!("Rust: creating: x = {} (at {:p})", x, p);
    p
}
// example function for interacting with the pointer
#[no_mangle]
pub unsafe extern fn data_get(p: *mut Data) -> i32 {
    (*p).x
}
#[no_mangle]
pub unsafe extern fn data_destroy(p: *mut Data) {
    // retake ownership; the Box frees the allocation when it goes out of scope
    let data = Box::from_raw(p);
    println!("Rust: destroying: x = {} (at {:p})", data.x, p);
}

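It can be compiled with rustc ffi_example.rs to create libffi_example.so (or similar, depending on the platform). This is the start of the Python code I use in both cases (the CDLL call may need adjusting):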

import sys
import ctypes as c
class RawData(c.Structure):
    pass
lib = c.CDLL('./libffi_example.so')
create = lib.data_create
create.argtypes = [c.c_int]
create.restype = c.POINTER(RawData)
get = lib.data_get
get.argtypes = [c.POINTER(RawData)]
get.restype = c.c_int
destroy = lib.data_destroy
destroy.argtypes = [c.POINTER(RawData)]
destroy.restype = None

(Note that with the pointer interface I don't have to tell Python anything about the internals of RawData.)
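On adjusting the CDLL call: the file name that rustc produces differs by platform. A minimal sketch of picking it, where library_filename is a hypothetical helper of mine (not part of ctypes) and the naming rules are the usual ones for Linux, macOS and Windows:

```python
import sys

def library_filename(crate_name, platform=sys.platform):
    """Guess the shared-library file name for a crate (hypothetical helper)."""
    if platform.startswith("win"):
        # Windows builds carry no "lib" prefix
        return crate_name + ".dll"
    if platform == "darwin":
        return "lib" + crate_name + ".dylib"
    # assume an ELF platform such as Linux otherwise
    return "lib" + crate_name + ".so"

# Intended use (not executed here, since it needs the compiled library):
# lib = ctypes.CDLL("./" + library_filename("ffi_example"))
```

The platform parameter is only there so the function can be exercised without switching machines.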

For example, you can check that everything works by adding the following at the end:

p = create(10)
print('Python: got %s (at 0x%x)' % (get(p), c.addressof(p.contents)))
sys.stdout.flush()
destroy(p)

It prints something like:

Rust: creating: x = 10 (at 0x138b7c0)
Python: got 10 (at 0x138b7c0)
Rust: destroying: x = 10 (at 0x138b7c0)

(The flush calls ensure that the prints from the two languages appear in the right order, since each language has its own buffer.)

__del__

To use __del__, just create a Python object (not a ctypes.Structure) that acts as the interface to the Rust object, like

class Data:
    def __init__(self, x):
        self._pointer = create(x)
    def get(self):
        return int(get(self._pointer))
    def __del__(self):
        destroy(self._pointer)

It can then be used like a normal object:

obj = Data(123)
print('Python: %s' % obj.get())
sys.stdout.flush()
obj2 = obj # two pointers to the same `Data`
obj = Data(456) # overwrite one
print('Python: %s, %s' % (obj.get(), obj2.get()))
sys.stdout.flush()
obj2 = None # just clear the second reference
print('Python: end')
sys.stdout.flush()

This prints:

Rust: creating: x = 123 (at 0x28aa510)
Python: 123
Rust: creating: x = 456 (at 0x28aa6e0)
Python: 456, 123
Rust: destroying: x = 123 (at 0x28aa510)
Python: end
Rust: destroying: x = 456 (at 0x28aa6e0)

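That is, Python can tell when an object no longer has any references (e.g. when both handles to 123, obj and obj2, have been overwritten, or, for 456, when the program ends).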
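A related option, which is not what the class above does, is the standard library's weakref.finalize. It registers the cleanup once and guarantees it runs at most once, even if it is also triggered by hand. Here is a minimal sketch, assuming stand-in create and destroy functions in place of the real FFI calls so that it runs without the compiled library:

```python
import weakref

destroyed = []  # records which stand-in resources were released

def create(x):
    # Hypothetical stand-in for lib.data_create: returns a fake handle.
    return {"x": x}

def destroy(handle):
    # Hypothetical stand-in for lib.data_destroy.
    destroyed.append(handle["x"])

class Data:
    def __init__(self, x):
        self._pointer = create(x)
        # The callback receives the handle, not `self`; holding `self`
        # here would keep the object alive forever.
        self._finalizer = weakref.finalize(self, destroy, self._pointer)

    def get(self):
        return self._pointer["x"]

    def close(self):
        # Explicit early release; later calls are no-ops.
        self._finalizer()

obj = Data(123)
value = obj.get()
obj.close()
obj.close()  # does not destroy a second time
```

If close is never called, the finalizer still runs when the object is garbage collected or at interpreter exit.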

Context manager

If your resources are heavily scoped (probably not the case here), it may make sense to use a context manager, which allows things like:

print('Python: before')
sys.stdout.flush()
with Data(789) as obj:
    print('Python: %s' % obj.get())
    sys.stdout.flush()
# obj's internals destroyed here
print('Python: after')
sys.stdout.flush()

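This is slightly more error-prone, because a handle to the object can be kept outside the with statement, so the object has to check for this; otherwise it risks accessing deallocated memory. For example,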

with Data(1234) as obj:
    pass
# obj's internals destroyed here
print(obj.get()) # oops...

Anyway, the implementation:

class Data:
    def __init__(self, x):
        self._x = x
        self._valid = False
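    def __enter__(self):
        self._pointer = create(self._x)
        # the resource exists from here until __exit__
        self._valid = True
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        assert self._valid
        destroy(self._pointer)
        self._valid = False
        # returning False lets any exception from the with body propagate
        return False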
    def get(self):
        if not self._valid:
            raise ValueError('getting from a destroyed Data')
        return int(get(self._pointer))

The first example above gives output like:

Python: before
Rust: creating: x = 789 (at 0x1650530)
Python: 789
Rust: destroying: x = 789 (at 0x1650530)
Python: after

The second gives:

Rust: creating: x = 1234 (at 0x113d450)
Rust: destroying: x = 1234 (at 0x113d450)
Traceback (most recent call last):
  File "ffi.py", line 82, in <module>
    print(obj.get()) # oops...
  File "ffi.py", line 63, in get
    raise ValueError('getting from a destroyed Data')
ValueError: getting from a destroyed Data

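This approach has the advantage of making it clearer in which region of code the resource is valid/allocated, effectively giving a manual form of Rust's RAII/scope-based resource management.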
