显式定义变量vs访问数组



我正在实现具有自适应步长(RK45)的Runge-Kutta-Fehlberg方法。我用

在笔记本中定义并调用我的Butcher tableau
module FehlbergTableau
using StaticArrays
export A, B, CH, CT
A = @SVector [ 0 , 2/9 , 1/3 , 3/4 , 1 , 5/6 ]
B = @SMatrix [ 0          0         0        0        0
2/9        0         0        0        0
1/12       1/4       0        0        0
69/128    -243/128   135/64   0        0
-17/12      27/4     -27/5     16/15    0
65/432    -5/16      13/16    4/27     5/144 ]
CH = @SVector [ 47/450 , 0 , 12/25 , 32/225 , 1/30 , 6/25 ]
CT = @SVector [ -1/150 , 0 , 3/100 , -16/75 , -1/20 , 6/25 ]
end
using .FehlbergTableau
如果我直接将RK45的算法编码为
function infinitesimal_flow(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N

k1 = Δt * J∇H( t0 + Δt*A[1], x0 ) 
k2 = Δt * J∇H( t0 + Δt*A[2], x0 + B[2,1]*k1 ) 
k3 = Δt * J∇H( t0 + Δt*A[3], x0 + B[3,1]*k1 + B[3,2]*k2 )
k4 = Δt * J∇H( t0 + Δt*A[4], x0 + B[4,1]*k1 + B[4,2]*k2 + B[4,3]*k3 )
k5 = Δt * J∇H( t0 + Δt*A[5], x0 + B[5,1]*k1 + B[5,2]*k2 + B[5,3]*k3 + B[5,4]*k4 )
k6 = Δt * J∇H( t0 + Δt*A[6], x0 + B[6,1]*k1 + B[6,2]*k2 + B[6,3]*k3 + B[6,4]*k4 + B[6,5]*k5 )

TE = CT[1]*k1 + CT[2]*k2 + CT[3]*k3 + CT[4]*k4 + CT[5]*k5 + CT[6]*k6  
xt = x0 + CH[1]*k1 + CH[2]*k2 + CH[3]*k3 + CH[4]*k4 + CH[5]*k5 + CH[6]*k6 

norm(TE), xt
end                  
并将其与更紧凑的实现 进行比较
function infinitesimal_flow_2(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64,Δt::Float64,J∇H::Function, x0::SVector{N,Float64}) where N

k = MMatrix{N,6}(0.0I)
TE = zero(x0); xt = x0

for i=1:6
# EDIT: this is wrong! there should be a new variable here, as pointed 
# out by Lutz Lehmann: xs = x0
for j=1:i-1
# xs += B[i,j] * k[:,j]
x0 += B[i,j] * k[:,j] #wrong
end
k[:,i] = Δt *  J∇H(t0 + Δt*A[i], x0) 
TE += CT[i]*k[:,i]
xt += CH[i]*k[:,i]B[i,j] * k[:,j]
end
norm(TE), xt
end

那么第一个函数,它显式地定义变量,要快得多:

J∇H(t::Float64, X::SVector{N,Float64}) where N = @SVector [ -X[2]^2, X[1] ]
x0 = SVector{2}([0.0, 1.0])
infinitesimal_flow(A, B, CH, CT, 0.0, 1e-2, J∇H, x0)
infinitesimal_flow_2(A, B, CH, CT, 0.0, 1e-2, J∇H, x0)
@btime infinitesimal_flow($A, $B, $CH, $CT, 0.0, 1e-2, $J∇H, $x0)
>> 19.387 ns (0 allocations: 0 bytes)
@btime infinitesimal_flow_2($A, $B, $CH, $CT, 0.0, 1e-2, $J∇H, $x0)
>> 50.985 ns (0 allocations: 0 bytes)

我找不到类型不稳定或任何东西来证明延迟,对于更复杂的场景,我必须以循环形式使用算法。我做错了什么?

:infinitesimal_flow_2中的瓶颈是k[:,i] = Δt * J∇H(t0 + Δt*A[i], x0)

RK方法的每个阶段直接从RK步的基点计算其求值点。这在第一个方法中是显式的。在第二种方法中,您必须在每个阶段重置点计算,例如

for i=1:6
xs = x0
for j=1:i-1
xs += B[i,j] * k[:,j]
end
k[:,i] = Δt *  J∇H(t0 + Δt*A[i], xs) 
...

步长计算中最微小的错误都可能灾难性地脱离步长控制器,迫使步长降至零,从而使工作量急剧增加。RKF45

中的4101错误就是一个例子。

相关内容

  • 没有找到相关文章

最新更新