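I'm running into a strange issue when measuring the runtime performance of a Rust binary crate. The crate is the xdevs simulator. To run it, you only need to clone the Git repo: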
git clone https://github.com/iscar-ucm/xdevs.rs.git
cd xdevs.rs
cargo run --release HO 200 200
On a MacBook Pro running macOS Monterey with a 2.5 GHz quad-core Intel Core i7 processor, I get the following output:
Model creation time: 110.71127ms
Simulator creation time: 121ns
Simulation time: 991.374624ms
However, when I try it on a workstation PC running Ubuntu 20.04 with an Intel Core i9-10900X CPU @ 3.70GHz, I get this output:
Model creation time: 61.938331ms
Simulator creation time: 137ns
Simulation time: 2.737863127s
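This makes no sense to me. Why do I get worse results on the more powerful machine? By the way, I compile with link-time optimization enabled and with the target-cpu=native flag.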
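For readers who want to reproduce this kind of measurement outside the crate: the timings above look like what Rust prints for a `std::time::Duration` with `{:?}`. The following is a minimal sketch of that pattern, not the actual xdevs code. `time_it` and `build_model` are hypothetical names introduced only for illustration.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
/// (Hypothetical helper, not part of xdevs.)
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}

/// Hypothetical stand-in workload: sums the squares of 0..n.
fn build_model(n: u64) -> u64 {
    (0..n).map(|i| i.wrapping_mul(i)).sum()
}

fn main() {
    let (checksum, elapsed) = time_it(|| build_model(1_000_000));
    // The Debug output of `Duration` picks ns, µs, ms or s automatically.
    println!("Model creation time: {:?}", elapsed);
    // Print the result so the optimizer cannot discard the work.
    println!("checksum: {checksum}");
}
```

A single wall-clock sample like this is sensitive to whatever else the machine is doing, so repeating the run a few times gives a more trustworthy comparison.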
EDIT:
I think it is a good idea to add more information about the toolchain and related properties on each machine.
On the Mac:
➜ xdevs.rs git:(main) rustc --version
rustc 1.66.0 (69f9c33d7 2022-12-12)
➜ xdevs.rs git:(main) rustc -C target-cpu=native --print cfg
debug_assertions
panic="unwind"
target_arch="x86_64"
target_endian="little"
target_env=""
target_family="unix"
target_feature="aes"
target_feature="avx"
target_feature="avx2"
target_feature="bmi1"
target_feature="bmi2"
target_feature="fma"
target_feature="fxsr"
target_feature="lzcnt"
target_feature="pclmulqdq"
target_feature="popcnt"
target_feature="rdrand"
target_feature="sse"
target_feature="sse2"
target_feature="sse3"
target_feature="sse4.1"
target_feature="sse4.2"
target_feature="ssse3"
target_feature="xsave"
target_feature="xsaveopt"
target_has_atomic="128"
target_has_atomic="16"
target_has_atomic="32"
target_has_atomic="64"
target_has_atomic="8"
target_has_atomic="ptr"
target_os="macos"
target_pointer_width="64"
target_vendor="apple"
unix
On Ubuntu:
rcardenas@celsius:~/xdevs.rs$ rustc --version
rustc 1.66.0 (69f9c33d7 2022-12-12)
rcardenas@celsius:~/xdevs.rs$ rustc -C target-cpu=native --print cfg
debug_assertions
panic="unwind"
target_arch="x86_64"
target_endian="little"
target_env="gnu"
target_family="unix"
target_feature="adx"
target_feature="aes"
target_feature="avx"
target_feature="avx2"
target_feature="bmi1"
target_feature="bmi2"
target_feature="fma"
target_feature="fxsr"
target_feature="lzcnt"
target_feature="pclmulqdq"
target_feature="popcnt"
target_feature="rdrand"
target_feature="rdseed"
target_feature="sse"
target_feature="sse2"
target_feature="sse3"
target_feature="sse4.1"
target_feature="sse4.2"
target_feature="ssse3"
target_feature="xsave"
target_feature="xsavec"
target_feature="xsaveopt"
target_feature="xsaves"
target_has_atomic="16"
target_has_atomic="32"
target_has_atomic="64"
target_has_atomic="8"
target_has_atomic="ptr"
target_os="linux"
target_pointer_width="64"
target_vendor="unknown"
unix
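The two `target_feature` lists are long enough that the differences are easy to miss by eye. As a rough sketch, the lists can be compared as sets. The feature names are copied from the two outputs above, and `only_in_second` is a hypothetical helper name. Only `target_feature` entries are compared; other differences, such as `target_has_atomic="128"` appearing only on the Mac, are not covered.

```rust
use std::collections::BTreeSet;

/// target_feature values reported on the Mac.
const MAC: &[&str] = &[
    "aes", "avx", "avx2", "bmi1", "bmi2", "fma", "fxsr", "lzcnt", "pclmulqdq",
    "popcnt", "rdrand", "sse", "sse2", "sse3", "sse4.1", "sse4.2", "ssse3",
    "xsave", "xsaveopt",
];

/// target_feature values reported on the Ubuntu workstation.
const UBUNTU: &[&str] = &[
    "adx", "aes", "avx", "avx2", "bmi1", "bmi2", "fma", "fxsr", "lzcnt",
    "pclmulqdq", "popcnt", "rdrand", "rdseed", "sse", "sse2", "sse3", "sse4.1",
    "sse4.2", "ssse3", "xsave", "xsavec", "xsaveopt", "xsaves",
];

/// Returns the entries present in `b` but not in `a`, in sorted order.
fn only_in_second<'a>(a: &[&'a str], b: &[&'a str]) -> Vec<&'a str> {
    let a: BTreeSet<&str> = a.iter().copied().collect();
    let b: BTreeSet<&str> = b.iter().copied().collect();
    b.difference(&a).copied().collect()
}

fn main() {
    println!("only on Ubuntu: {:?}", only_in_second(MAC, UBUNTU));
    println!("only on Mac:    {:?}", only_in_second(UBUNTU, MAC));
}
```

With these inputs, the Ubuntu machine reports four extra features (`adx`, `rdseed`, `xsavec`, `xsaves`) and the Mac reports none that Ubuntu lacks, so the reported feature sets alone do not suggest that the Linux build is at a disadvantage.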
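However, I still don't understand why the simulation takes about three times longer on Ubuntu than on macOS, even though the Ubuntu machine has better hardware specs.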
EDIT: I want to add the results of running the crate in release mode. On the Mac:
➜ xdevs.rs git:(main) RUSTFLAGS="-C target-cpu=native" cargo run --verbose --release HO 200 200
Compiling xdevs v0.1.1 (/Users/rcardenas/xdevs.rs)
Running `rustc --crate-name xdevs --edition=2021 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no -C metadata=8d925d6ed50d8948 -C extra-filename=-8d925d6ed50d8948 --out-dir /Users/rcardenas/xdevs.rs/target/release/deps -L dependency=/Users/rcardenas/xdevs.rs/target/release/deps -C target-cpu=native`
Running `rustc --crate-name xdevs --edition=2021 src/main.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C opt-level=3 -C embed-bitcode=no -C metadata=7179874cf29eab82 -C extra-filename=-7179874cf29eab82 --out-dir /Users/rcardenas/xdevs.rs/target/release/deps -L dependency=/Users/rcardenas/xdevs.rs/target/release/deps --extern xdevs=/Users/rcardenas/xdevs.rs/target/release/deps/libxdevs-8d925d6ed50d8948.rlib -C target-cpu=native`
Finished release [optimized] target(s) in 1.87s
Running `target/release/xdevs HO 200 200`
Model creation time: 99.400995ms
Simulator creation time: 66ns
Simulation time: 840.980883ms
On Ubuntu:
rcardenas@celsius:~/xdevs.rs$ RUSTFLAGS="-C target-cpu=native" cargo run --verbose --release HO 200 200
Compiling xdevs v0.1.1 (/home/rcardenas/xdevs.rs)
Running `rustc --crate-name xdevs --edition=2021 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no -C metadata=f8b9c1c6a690ac7c -C extra-filename=-f8b9c1c6a690ac7c --out-dir /home/rcardenas/xdevs.rs/target/release/deps -L dependency=/home/rcardenas/xdevs.rs/target/release/deps -C target-cpu=native`
Running `rustc --crate-name xdevs --edition=2021 src/main.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C opt-level=3 -C embed-bitcode=no -C metadata=1a0bd9ecd09448cd -C extra-filename=-1a0bd9ecd09448cd --out-dir /home/rcardenas/xdevs.rs/target/release/deps -L dependency=/home/rcardenas/xdevs.rs/target/release/deps --extern xdevs=/home/rcardenas/xdevs.rs/target/release/deps/libxdevs-f8b9c1c6a690ac7c.rlib -C target-cpu=native`
Finished release [optimized] target(s) in 0.96s
Running `target/release/xdevs HO 200 200`
Model creation time: 69.743935ms
Simulator creation time: 121ns
Simulation time: 2.782144931s
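To put a number on the gap, the slowdown can be computed directly from the simulation times reported in this question. This is a small sketch; `slowdown` is a hypothetical helper, and the nanosecond values are copied from the four runs above.

```rust
use std::time::Duration;

/// How many times longer `slow` took compared with `fast`.
fn slowdown(fast: Duration, slow: Duration) -> f64 {
    slow.as_secs_f64() / fast.as_secs_f64()
}

fn main() {
    // Simulation times from the first pair of runs.
    let mac_first = Duration::from_nanos(991_374_624);
    let ubuntu_first = Duration::from_nanos(2_737_863_127);
    // Simulation times from the runs with RUSTFLAGS="-C target-cpu=native".
    let mac_native = Duration::from_nanos(840_980_883);
    let ubuntu_native = Duration::from_nanos(2_782_144_931);

    println!("first runs:  {:.2}x", slowdown(mac_first, ubuntu_first));
    println!("native runs: {:.2}x", slowdown(mac_native, ubuntu_native));
}
```

The ratio comes out at roughly 2.8x for the first pair of runs and roughly 3.3x for the second, so the gap is consistent across both builds.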
My guess is that the debug-mode optimizations differ between the two platforms. Have you tried building in release mode by passing --release to cargo?
Sorry, I didn't notice the --release flag. I tried it on my laptop (Arch Linux, Ryzen 4800U):
[I] forks@archlinux /t/xdevs.rs (main)> target/release/xdevs HO 200 200
Model creation time: 67.439928ms
Simulator creation time: 1.396µs
Simulation time: 2.061134191s
On my PC (Arch Linux, Ryzen 1700X) I get similar results:
*[main][/tmp/xdevs.rs]$ target/release/xdevs HO 200 200
Model creation time: 89.505405ms
Simulator creation time: 80ns
Simulation time: 2.039591205s
Could it be the toolchain? CPU scheduling? I also tried HO 200 2000:
[I] forks@archlinux /t/xdevs.rs (main)> time target/release/xdevs HO 200 2000
Model creation time: 606.153956ms
Simulator creation time: 1.467µs
Simulation time: 21.079952693s
and
*[main][/tmp/xdevs.rs]$ target/release/xdevs HO 200 2000
Model creation time: 685.050164ms
Simulator creation time: 40ns
Simulation time: 21.746809784s
Same results.