我的用例涉及两个可变向量a
和b
以及usize
参数x
。我想做以下修改:
- 将元素
b[0..x]
附加到a
的末尾(根据需要改变a
的容量) - 将
b
转换为b[x..]
,不改变b
的原始容量
目前我做的是:
while a.len() < x && !b.is_empty() {
a.push(b.pop_front().unwrap());
// here `b` is a VecDeque but I am happy to use a Vec if pop_front was not required
}
显然,这似乎是一个非常缓慢的操作,检查两个条件并在每次迭代中调用unwrap
。如果有这样的rev_split_off
操作就太好了:
let mut front = b.rev_split_off(x);
a.append(&mut front);
这里,rev_split_off
返回一个新分配给片b[0..x]
的向量,并将b
转换为容量不变的剩余片。
:如何有效地执行我的用例,使用或不使用rev_split_off
这样的东西?
嗯,我认为你必须自己实现rev_split_off
(尽管我可能会称之为split_off_back
,但它是一样的)。
我将如何实现它:
/// Moves the `i` first elements of `vec` at the end of `buffer`.
fn split_off_back<T>(vec: &mut Vec<T>, i: usize, buffer: &mut Vec<T>) {
// We have to make sure vec has enough elements.
// You could make the function unsafe and ask the caller to ensure
// this condition.
assert!(vec.len() >= i);
// Reserve enough memory in the target buffer
buffer.reserve(i);
// Now we know `buffer.capacity() >= buffer.len() + i`.
unsafe {
// SAFETY:
// * `vec` and `buffer` are two distinct vectors (they come from mutable references)
// so their allocations cannot overlap.
// * `vec` is valid for reads because we have an exclusive reference to it and we
// checked the value of `i`.
// * `buffer` is valid for writes because we ensured we had enough memory to store
// `i` additional elements.
std::ptr::copy_nonoverlapping(vec.as_ptr(), buffer.as_mut_ptr().add(buffer.len()), i);
// Now the memory is moved.
// we are not allowed to use it again from the `vec` vector.
// We just extanded `buffer`, we need to update its length.
// SAFEFY:
// * We ensured that the new length is less than the capacity (with `Vec::reserved`)
// * The vector is initialized for this new length (we moved the values).
buffer.set_len(buffer.len() + i);
// Now we need to update the first vector. The values from index `i` to its end
// need to be moved at the begining of the vector.
// SAFETY:
// * We have an exclusive reference to the vector. It is both valid for reads and writes.
std::ptr::copy(vec.as_ptr().add(i), vec.as_mut_ptr(), i);
// And update the length of `vec`.
// SAFETY: This subtraction is safe because we previously checked that `vec.len() >= i`.
vec.set_len(vec.len() - i);
}
}
注意,我将buffer
放在函数的参数中,以避免分配向量。如果您想要与split_off
相同的语义,您可以执行以下操作:
fn split_of_back<T>(vec: &mut Vec<T>, i: usize) -> Vec<T> {
assert!(vec.len() >= i);
let mut buffer = Vec::with_capacity(i);
unsafe { /* same thing */ }
buffer
}
使用drain
和extend
是非常简单的。
a.extend(b.drain(..x));
如果您的值是Copy
,那么您可以使用as_slice
获得最佳速度。在iuc中,由于专门化的原因,使用extend_from_slice
应该是可选的,但它有助于清晰。
a.extend_from_slice(b.drain(..x).as_slice());
我添加了一些基准测试,以显示unsafe
版本的代码明显更快。
#![feature(test)]
extern crate test;
#[cfg(test)]
mod tests {
extern crate test;
use test::{black_box, Bencher};
/// Moves the `i` first elements of `vec` at the end of `buffer`.
fn split_off_back<T>(vec: &mut Vec<T>, i: usize, buffer: &mut Vec<T>) {
assert!(vec.len() >= i);
buffer.reserve(i);
unsafe {
std::ptr::copy_nonoverlapping(vec.as_ptr(), buffer.as_mut_ptr().add(buffer.len()), i);
buffer.set_len(buffer.len() + i);
std::ptr::copy(vec.as_ptr().add(i), vec.as_mut_ptr(), i);
vec.set_len(vec.len() - i);
}
}
fn split_off_back_two<T>(vec: &mut Vec<T>, i: usize, buffer: &mut Vec<T>) {
buffer.extend(vec.drain(..i));
}
const VEC_SIZE: usize = 100000;
const SPLIT_POINT: usize = 20000;
#[bench]
fn run_v1(b: &mut Bencher) {
b.iter(|| {
let mut a = black_box(vec![0; VEC_SIZE]);
let mut b = black_box(vec![0; VEC_SIZE]);
split_off_back(&mut a, SPLIT_POINT, &mut b);
});
}
#[bench]
fn run_v2(b: &mut Bencher) {
b.iter(|| {
let mut a = black_box(vec![0; VEC_SIZE]);
let mut b = black_box(vec![0; VEC_SIZE]);
split_off_back_two(&mut a, SPLIT_POINT, &mut b);
});
}
}
这是cargo bench
在我的机器上的输出:
running 2 tests
test tests::run_v1 ... bench: 98,863 ns/iter (+/- 2,058)
test tests::run_v2 ... bench: 230,665 ns/iter (+/- 6,093)
test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out; finished in 0.48s