Rust Wasm: iterating over an input file



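I am trying to get an iterator over the contents of a file that was uploaded through an input field.

I can pass the JS File into Wasm via web-sys, but for the life of me I can't figure out how to access anything other than the length and name of the passed file in Rust.

I think I could pass the whole file into Wasm as a ByteArray and iterate over that, but I would prefer to iterate directly over the file contents without copying, since the files themselves will be large (~1 GB).

I found in the Mozilla JS docs that I should be able to access the underlying file blob, get a ReadableStream from it via the .stream() method, and get a Reader from that which can be iterated over. But in web-sys, the .getReader() method of a ReadableStream returns a plain JsValue, which I can't do anything useful with.

Am I missing something here, is this functionality simply missing from web-sys, or is there some other way to do this? Maybe create the iterator in JS and pass it to Rust?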

I managed to get a working example using read_as_binary_string.

Here is the code:

lib.rs

use js_sys::JsString;
use std::cell::RefCell;
use std::rc::Rc;
use wasm_bindgen::prelude::*;
use wasm_bindgen::JsCast;
use web_sys::{console, Event, FileReader, HtmlInputElement};

#[wasm_bindgen(start)]
pub fn main_wasm() {
    // Shared buffer that will hold the bytes of the uploaded file.
    let my_file: Rc<RefCell<Vec<u8>>> = Rc::new(RefCell::new(Vec::new()));
    set_file_reader(&my_file);
}

fn set_file_reader(file: &Rc<RefCell<Vec<u8>>>) {
    let filereader = FileReader::new().unwrap();
    let my_file = Rc::clone(file);

    // Runs once the FileReader has finished reading the file.
    let onload = Closure::wrap(Box::new(move |event: Event| {
        let element = event.target().unwrap().dyn_into::<FileReader>().unwrap();
        let data = element.result().unwrap();
        let file_string: JsString = data.dyn_into::<JsString>().unwrap();
        // A binary string holds one byte per UTF-16 code unit.
        let file_vec: Vec<u8> = file_string.iter().map(|x| x as u8).collect();
        *my_file.borrow_mut() = file_vec;
        console::log_1(&format!("file loaded: {:?}", file_string).into());
    }) as Box<dyn FnMut(_)>);
    filereader.set_onloadend(Some(onload.as_ref().unchecked_ref()));
    // Leak the closure so it stays alive as long as the page does.
    onload.forget();

    // Create the <input type="file"> element and attach it to the body.
    let fileinput: HtmlInputElement = web_sys::window()
        .unwrap()
        .document()
        .expect("should have a document.")
        .create_element("input")
        .unwrap()
        .dyn_into::<HtmlInputElement>()
        .unwrap();
    fileinput.set_id("file-upload");
    fileinput.set_type("file");
    web_sys::window()
        .unwrap()
        .document()
        .unwrap()
        .body()
        .expect("document should have a body")
        .append_child(&fileinput)
        .unwrap();

    // Start reading as soon as the user picks a file.
    let callback = Closure::wrap(Box::new(move |event: Event| {
        let element = event
            .target()
            .unwrap()
            .dyn_into::<HtmlInputElement>()
            .unwrap();
        let filelist = element.files().unwrap();
        let _file = filelist.get(0).expect("should have a file handle.");
        filereader.read_as_binary_string(&_file).unwrap();
    }) as Box<dyn FnMut(_)>);
    fileinput
        .add_event_listener_with_callback("change", callback.as_ref().unchecked_ref())
        .unwrap();
    callback.forget();
}
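
The `x as u8` cast in the listing above relies on the binary string produced by FileReader holding one byte (0 to 255) per UTF-16 code unit. The following is a minimal std-only sketch of that conversion which makes the assumption explicit instead of silently truncating. The helper name `binary_string_to_bytes` is hypothetical and not part of js-sys or web-sys; in the browser, the code units would come from `JsString::iter()`.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not a js-sys or web-sys API): converts the UTF-16
/// code units of a "binary string" into bytes. Returns `None` if any code
/// unit does not fit into a single byte, instead of truncating it.
fn binary_string_to_bytes(units: &[u16]) -> Option<Vec<u8>> {
    units.iter().map(|&unit| u8::try_from(unit).ok()).collect()
}
```

Called with the code units of a real binary string, this returns `Some(bytes)`; a `None` result would indicate that the input was an ordinary text string rather than binary data.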

index.html

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
  </head>
  <body>
    <noscript
      >This page contains webassembly and javascript content, please enable
      javascript in your browser.</noscript
    >
    <script src="./stack.js"></script>
    <script>
      wasm_bindgen("./stack_bg.wasm");
    </script>
  </body>
</html>

Cargo.toml

[package]
name = "stack"
version = "0.1.0"
authors = [""]
edition = "2018"

[lib]
crate-type = ["cdylib", "rlib"]
[dependencies]
js-sys = "0.3.55"
wee_alloc = { version = "0.4.2", optional = true }

[dependencies.web-sys]
version = "0.3.4"
features = [
'Document',
'Window',
'console',
'Event',
'FileReader',
'File',
'FileList',
'HtmlInputElement']
[dev-dependencies]
wasm-bindgen-test = "0.2"
[dependencies.wasm-bindgen]
version = "0.2.70"
[profile.release]
# Tell `rustc` to optimize for small code size.
opt-level = "s"
debug = false

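You can see the working example here: http://rustwasmfileinput.glitch.me/

Your best bet is to use the wasm-streams crate, which bridges Web Streams APIs, such as the ReadableStream you get from the .stream() method, to Rust async Stream APIs.

The official example uses the Fetch API as its source, but this part is just as relevant to your File use case: https://github.com/MattiasBuelens/wasm-streams/blob/f6dacf58a8826dc67923ab4a3bae87635690ca64/examples/fetch_as_stream.rs#L25-L33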

let body = ReadableStream::from_raw(raw_body.dyn_into().unwrap_throw());
// Convert the JS ReadableStream to a Rust stream
let mut stream = body.into_stream();
// Consume the stream, logging each individual chunk
while let Some(Ok(chunk)) = stream.next().await {
    console::log_1(&chunk);
}
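
Because the stream yields one chunk at a time, any computation over the whole file has to carry its state across chunk boundaries. Here is a minimal std-only sketch of that pattern, assuming each chunk has already been copied into a byte slice (in the browser, for example, with `js_sys::Uint8Array::to_vec`). The `LineCounter` type is hypothetical and only illustrates the idea; it is not part of wasm-streams.

```rust
/// Hypothetical accumulator: counts bytes and lines of a file that
/// arrives as a sequence of chunks, without keeping the chunks around.
#[derive(Default)]
struct LineCounter {
    bytes: u64,
    newlines: u64,
    last_byte: Option<u8>,
}

impl LineCounter {
    /// Processes one chunk; only the counters survive the call.
    fn feed(&mut self, chunk: &[u8]) {
        self.bytes += chunk.len() as u64;
        self.newlines += chunk.iter().filter(|&&b| b == b'\n').count() as u64;
        if let Some(&b) = chunk.last() {
            self.last_byte = Some(b);
        }
    }

    /// Number of lines seen so far; a trailing line without a final
    /// newline still counts as a line.
    fn lines(&self) -> u64 {
        match self.last_byte {
            None => 0,
            Some(b'\n') => self.newlines,
            Some(_) => self.newlines + 1,
        }
    }
}
```

Inside the `while let` loop, each chunk would be passed to `feed`, so memory use stays bounded by the chunk size rather than the file size.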

I think you can do something similar using FileReader.

Here is an example in which I log the text content of a file:

use wasm_bindgen::prelude::*;
use wasm_bindgen::JsCast;
use web_sys::{Event, FileReader, HtmlInputElement};

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_namespace = console)]
    fn log(s: &str);
}

#[wasm_bindgen(start)]
pub fn main() -> Result<(), JsValue> {
    let window = web_sys::window().expect("no global `window` exists");
    let document = window.document().expect("should have a document on window");
    let body = document.body().expect("document should have a body");
    let filereader = FileReader::new()?;

    // Log the file's text once the FileReader has finished reading.
    let closure = Closure::wrap(Box::new(move |event: Event| {
        let element = event.target().unwrap().dyn_into::<FileReader>().unwrap();
        // After read_as_text, the result is a JS string, not a byte array.
        let data = element.result().unwrap();
        let rust_str: String = data.as_string().unwrap_or_default();
        log(rust_str.as_str());
    }) as Box<dyn FnMut(_)>);

    filereader.set_onloadend(Some(closure.as_ref().unchecked_ref()));
    closure.forget();

    let fileinput: HtmlInputElement = document
        .create_element("input")?
        .dyn_into::<HtmlInputElement>()?;
    fileinput.set_type("file");

    // Start reading as soon as the user picks a file.
    let closure = Closure::wrap(Box::new(move |event: Event| {
        let element = event.target().unwrap().dyn_into::<HtmlInputElement>().unwrap();
        let filelist = element.files().unwrap();
        let file = filelist.get(0).unwrap();
        filereader.read_as_text(&file).unwrap();
        //log(filelist.length().to_string().as_str());
    }) as Box<dyn FnMut(_)>);
    fileinput.add_event_listener_with_callback("change", closure.as_ref().unchecked_ref())?;
    closure.forget();
    body.append_child(&fileinput)?;
    Ok(())
}

And the HTML:

<html>
  <head>
    <meta content="text/html;charset=utf-8" http-equiv="Content-Type"/>
  </head>
  <body>
    <script type="module">
      import init from './pkg/without_a_bundler.js';
      async function run() {
        await init();
      }
      run();
    </script>
  </body>
</html>

Cargo.toml

[package]
name = "without-a-bundler"
version = "0.1.0"
authors = [""]
edition = "2018"
[lib]
crate-type = ["cdylib"]
[dependencies]
js-sys = "0.3.51"
wasm-bindgen = "0.2.74"
[dependencies.web-sys]
version = "0.3.4"
features = [
'Blob',
'BlobEvent',
'Document',
'Element',
'Event',
'File',
'FileList',
'FileReader',
'HtmlElement',
'HtmlInputElement',
'Node',
'ReadableStream',
'Window',
]

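However, I don't know how to use ReadableStream's get_reader(), because according to the linked documentation it should return either a ReadableStreamDefaultReader or a ReadableStreamBYOBReader. The latter is experimental, so I think it is understandable that it is not present in web-sys, but I don't know why ReadableStreamDefaultReader is missing as well.

You should use ReadableStreamDefaultReader::new().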

let stream: ReadableStream = response.body().unwrap();
let reader = ReadableStreamDefaultReader::new(&stream)?;

You can then use ReadableStreamDefaultReader.read() the same way you would in JS.

You will also need a struct for deserialization:

#[derive(serde::Serialize, serde::Deserialize)]
struct ReadableStreamDefaultReadResult<T> {
    pub value: T,
    pub done: bool,
}

Here is a usage example:

loop {
    let reader_promise = JsFuture::from(reader.read());
    let result = reader_promise.await?;
    let result: ReadableStreamDefaultReadResult<Option<Vec<u8>>> =
        serde_wasm_bindgen::from_value(result).unwrap();
    if result.done {
        break;
    }
    // here you can read chunk of bytes from `result.value`
}
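
The loop above follows the reader protocol: call read() repeatedly and stop as soon as `done` is true. Since the question asks for an iterator, here is a synchronous, std-only model of that protocol wrapped in a Rust `Iterator`. This is only a sketch under stated assumptions: the real read() returns a Promise that must be awaited, so in the browser the equivalent would be an async stream rather than an `Iterator`. `ReadResult` mirrors the struct above, while `ChunkIter` and `fake_reader` are hypothetical names introduced for illustration.

```rust
/// Mirrors the `{ value, done }` object returned by read().
struct ReadResult {
    value: Option<Vec<u8>>,
    done: bool,
}

/// Hypothetical adapter: turns a `read` function into an iterator of chunks.
struct ChunkIter<F: FnMut() -> ReadResult> {
    read: F,
    finished: bool,
}

impl<F: FnMut() -> ReadResult> Iterator for ChunkIter<F> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        if self.finished {
            return None;
        }
        let result = (self.read)();
        if result.done {
            // Remember the end so read() is never called again.
            self.finished = true;
            return None;
        }
        // A missing value on a non-final read is treated as an empty chunk.
        Some(result.value.unwrap_or_default())
    }
}

/// Hypothetical stand-in for a stream reader, backed by in-memory chunks.
fn fake_reader(chunks: Vec<Vec<u8>>) -> impl FnMut() -> ReadResult {
    let mut remaining = chunks.into_iter();
    move || match remaining.next() {
        Some(chunk) => ReadResult { value: Some(chunk), done: false },
        None => ReadResult { value: None, done: true },
    }
}
```

With this shape, `ChunkIter { read: fake_reader(chunks), finished: false }.flatten()` iterates over individual bytes while only one chunk is held in memory at a time.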
