如何使用SciPy最小化重用Jacobian和Inverse Hessian



我目前正在使用SciPy.optimize中的最小化(BFGS(函数来校准我的模型。一旦校准,参数将不时受到轻微干扰,我想对我的初始ste(对jac和hessian的一种初始猜测,然后切换到对这些的常规计算(重复使用先前校准中的Jacobian和IHessian。

知道怎么做吗?

要做到这一点,您必须稍微修改scipy。BFGS的Hessian近似存储在变量Hk中,该变量首先在文件_optimize.py中函数_minimize_BFGS((的第1308行中定义:https://github.com/scipy/scipy/blob/65d8361519e3f1166d81dceef9f1d6375b4ed808/scipy/optimize/_optimize.py#L1308

您将不得不修改scipy的代码返回或存储此矩阵,并在下次将其作为参数再次传递给函数。


或者,您可以使用scipy的非线性根查找,因为最小化函数与查找点x相同,其中函数的梯度g(x(为零。需要注意的是,从算法的角度来看,这是非常不健壮的。寻根器可以收敛到鞍点或最大值等。

用于优化的BFGS算法与Broyden的用于寻找非线性函数根的方法非常相似。Broyden是BFGS中的B,BFGS基本上是应用于方程g(x(=0的Broydens寻根方法的修改版本。

从代码的角度来看,scipy提供了比BFGS更好的对Broyden的非线性根查找器内部的访问。Broyden-Jacobian矩阵(它是BFGS Hessian的类似物(被存储为类型为"的对象;Jacobian";如果您重写";setup((";方法不执行任何操作,因此不会重置。例如:

import numpy as np
from scipy.optimize import nonlin, rosen, rosen_der

x0 = np.array([0.0, 0.0])
counter = [1]
def callback(x, Fx):
print('k=', counter[0], '   xk=', x, '   f(xk)=', Fx)
counter[0] += 1
print('first solve:')
jac = nonlin.BroydenFirst()
sol1 = nonlin.nonlin_solve(rosen_der, x0, jacobian=jac, callback=callback, line_search='wolfe')
print('')
print('Jacobian after first solve:')
print(jac.todense())

print('')
print('second solve:')
jac.setup = lambda x,y,z: None # Prevent Jacobian from resetting. Try commenting this line out!
counter = [1]
sol2 = nonlin.nonlin_solve(rosen_der, x0, jacobian=jac, callback=callback, line_search='wolfe')
print('')
print('Jacobian after second solve:')
print(jac.todense())

这会产生以下输出:

first solve:
k= 1    xk= [-0.5  0. ]    f(xk)= [-53. -50.]
k= 2    xk= [-0.32283933  0.16713271]    f(xk)= [ 5.47792447 12.58149553]
k= 3    xk= [-0.28061925  0.08236319]    f(xk)= [-2.1553475  0.7232059]
k= 4    xk= [-0.21641127  0.02839649]    f(xk)= [-4.02884252 -3.68746961]
k= 5    xk= [-0.18109429  0.02229409]    f(xk)= [-3.12286088 -2.10021066]
k= 6    xk= [-0.19320298  0.02730297]    f(xk)= [-3.16110526 -2.00488442]
k= 7    xk= [-0.1500595   0.01281555]    f(xk)= [-2.88248831 -1.94046136]
k= 8    xk= [-0.14063472  0.00987684]    f(xk)= [-2.83825517 -1.98025685]
k= 9    xk= [-0.10559036  0.00122285]    f(xk)= [-2.63043649 -1.98529385]
k= 10    xk= [-0.07978769 -0.00399875]    f(xk)= [-2.49036939 -2.07296403]
k= 11    xk= [-0.04033837 -0.00840453]    f(xk)= [-2.24254189 -2.00634254]
k= 12    xk= [-0.01281811 -0.01000148]    f(xk)= [-2.07775868 -2.03315754]
k= 13    xk= [ 0.04612425 -0.00598848]    f(xk)= [-1.75801514 -1.62318494]
k= 14    xk= [ 0.03215952 -0.00653348]    f(xk)= [-1.83833134 -1.51354257]
k= 15    xk= [0.08036797 0.0021266 ]    f(xk)= [-1.6999891  -0.86648289]
k= 16    xk= [-0.04367326 -0.01098396]    f(xk)= [-2.31254872 -2.57826183]
k= 17    xk= [-0.07820406 -0.00365169]    f(xk)= [-2.46195331 -1.95351227]
k= 18    xk= [-0.02940764 -0.00337684]    f(xk)= [-2.10871    -0.84832937]
k= 19    xk= [-0.0314977  -0.00316138]    f(xk)= [-2.11532548 -0.83069659]
k= 20    xk= [-0.00053675 -0.00472907]    f(xk)= [-2.00208887 -0.94587179]
k= 21    xk= [ 0.01086898 -0.00493684]    f(xk)= [-1.95628507 -1.01099561]
k= 22    xk= [ 0.03866879 -0.00408101]    f(xk)= [-1.83641115 -1.11525685]
k= 23    xk= [ 0.06232922 -0.00227391]    f(xk)= [-1.72179137 -1.23176741]
k= 24    xk= [0.09084425 0.0019575 ]    f(xk)= [-1.5895594 -1.2590346]
k= 25    xk= [0.11524109 0.00670698]    f(xk)= [-1.46650132 -1.3147068 ]
k= 26    xk= [0.17879539 0.0246649 ]    f(xk)= [-1.12012011 -1.46057767]
k= 27    xk= [0.17505925 0.02929594]    f(xk)= [-1.5553635  -0.26996006]
k= 28    xk= [0.18893989 0.03342207]    f(xk)= [-1.4500936  -0.45524167]
k= 29    xk= [0.22546628 0.04688035]    f(xk)= [-1.19240743 -0.79093866]
k= 30    xk= [0.23877529 0.05259369]    f(xk)= [-1.10029946 -0.88399008]
k= 31    xk= [0.25428437 0.06010509]    f(xk)= [-1.02807924 -0.91109027]
k= 32    xk= [0.27116407 0.06871439]    f(xk)= [-0.9353492  -0.96311185]
k= 33    xk= [0.28827306 0.07809215]    f(xk)= [-0.84584567 -1.00184213]
k= 34    xk= [0.30453673 0.08760267]    f(xk)= [-0.76480491 -1.0279903 ]
k= 35    xk= [0.32072413 0.09764762]    f(xk)= [-0.68934851 -1.04326926]
k= 36    xk= [0.33639066 0.10790526]    f(xk)= [-0.6203388  -1.05068295]
k= 37    xk= [0.35201809 0.11866338]    f(xk)= [-0.55625337 -1.05067108]
k= 38    xk= [0.36724683 0.12964277]    f(xk)= [-0.49759833 -1.04549305]
k= 39    xk= [0.38246707 0.14110658]    f(xk)= [-0.44343805 -1.03489669]
k= 40    xk= [0.39728775 0.15273218]    f(xk)= [-0.39410312 -1.02107526]
k= 41    xk= [0.41215838 0.16485939]    f(xk)= [-0.34886931 -1.00302937]
k= 42    xk= [0.42650931 0.17699212]    f(xk)= [-0.30793999 -0.9836144 ]
k= 43    xk= [0.44109504 0.18976381]    f(xk)= [-0.27072811 -0.9602033 ]
k= 44    xk= [0.45477201 0.20212697]    f(xk)= [-0.23719208 -0.93812269]
k= 45    xk= [0.4681153  0.21458789]    f(xk)= [-0.21291526 -0.9088083 ]
k= 46    xk= [0.47908973 0.22505828]    f(xk)= [-0.1854592  -0.89373795]
k= 47    xk= [0.50398499 0.2497432 ]    f(xk)= [-0.13370942 -0.85153389]
k= 48    xk= [0.51439976 0.26044589]    f(xk)= [-0.11498655 -0.83224564]
k= 49    xk= [0.53852539 0.28603837]    f(xk)= [-0.06750659 -0.79424541]
k= 50    xk= [0.55785196 0.30748713]    f(xk)= [-0.05606785 -0.74233693]
k= 51    xk= [0.56092826 0.31093463]    f(xk)= [-0.0466508  -0.74117561]
k= 52    xk= [0.61261365 0.37369717]    f(xk)= [-0.38311202 -0.31966368]
k= 53    xk= [0.52465657 0.27398909]    f(xk)= [-0.68302112 -0.25508661]
k= 54    xk= [0.74386124 0.52529992]    f(xk)= [ 7.82778407 -5.60592561]
k= 55    xk= [0.54293777 0.30014958]    f(xk)= [-2.07995492  1.07363175]
k= 56    xk= [0.58511821 0.34990237]    f(xk)= [-2.59425851  1.50781064]
k= 57    xk= [0.35868093 0.08830817]    f(xk)= [ 4.50558888 -8.06876881]
k= 58    xk= [0.47983856 0.24586376]    f(xk)= [-4.0381075   3.12374291]
k= 59    xk= [0.3775732 0.1412489]    f(xk)= [-1.04660917 -0.26252449]
k= 60    xk= [0.36511619 0.13121294]    f(xk)= [-0.96352471 -0.41937734]
k= 61    xk= [ 0.1114212  -0.05351408]    f(xk)= [  1.16118684 -13.18575168]
k= 62    xk= [0.25437135 0.02356053]    f(xk)= [ 2.69511008 -8.22885014]
k= 63    xk= [0.16777908 0.02828527]    f(xk)= [-1.67353225  0.0270904 ]
k= 64    xk= [0.21546576 0.04636992]    f(xk)= [-1.56427885 -0.01111462]
k= 65    xk= [0.24867847 0.0590896 ]    f(xk)= [-1.22895955 -0.55027587]
k= 66    xk= [0.26241736 0.06557769]    f(xk)= [-1.13033017 -0.65703565]
k= 67    xk= [0.27594487 0.07265633]    f(xk)= [-1.06297502 -0.69784814]
k= 68    xk= [0.29626637 0.08383401]    f(xk)= [-0.94058127 -0.78794969]
k= 69    xk= [0.31324088 0.09384403]    f(xk)= [-0.83777351 -0.85516414]
k= 70    xk= [0.3305509  0.10477546]    f(xk)= [-0.74543613 -0.89768638]
k= 71    xk= [0.34571472 0.11489762]    f(xk)= [-0.66954478 -0.9242097 ]
k= 72    xk= [0.36107381 0.12568167]    f(xk)= [-0.60009902 -0.9385247 ]
k= 73    xk= [0.37526559 0.13609304]    f(xk)= [-0.53928208 -0.94624546]
k= 74    xk= [0.38966873 0.14710916]    f(xk)= [-0.48301011 -0.94651221]
k= 75    xk= [0.40322511 0.15787327]    f(xk)= [-0.43270865 -0.94344462]
k= 76    xk= [0.41704718 0.16925377]    f(xk)= [-0.3860966  -0.93491704]
k= 77    xk= [0.4300686  0.18033373]    f(xk)= [-0.34418922 -0.92505424]
k= 78    xk= [0.44351101 0.1921499 ]    f(xk)= [-0.30541382 -0.9104218 ]
k= 79    xk= [0.45596    0.20341705]    f(xk)= [-0.27054816 -0.89649514]
k= 80    xk= [0.46923896 0.2158001 ]    f(xk)= [-0.23845909 -0.87701904]
k= 81    xk= [0.48085841 0.22691624]    f(xk)= [-0.2095579  -0.86171446]
k= 82    xk= [0.49450418 0.24035192]    f(xk)= [-0.18369471 -0.83649135]
k= 83    xk= [0.50433095 0.25022809]    f(xk)= [-0.15987461 -0.82432328]
k= 84    xk= [0.52923466 0.27606143]    f(xk)= [-0.08884812 -0.80558078]
k= 85    xk= [0.54817807 0.29670844]    f(xk)= [-0.07244057 -0.75815082]
k= 86    xk= [0.55196258 0.30088075]    f(xk)= [-0.06107966 -0.7563875 ]
k= 87    xk= [0.58409242 0.3383646 ]    f(xk)= [-0.17778262 -0.55987076]
k= 88    xk= [0.57415074 0.32741847]    f(xk)= [-0.33941812 -0.44612013]
k= 89    xk= [0.5759845  0.32975938]    f(xk)= [-0.38752971 -0.39975147]
k= 90    xk= [0.54936923 0.30264256]    f(xk)= [-1.08497164  0.16720094]
k= 91    xk= [0.59588566 0.35541025]    f(xk)= [-0.88701248  0.06610646]
k= 92    xk= [0.62557699 0.38945577]    f(xk)= [-0.27571094 -0.37815896]
k= 93    xk= [0.62659512 0.39086428]    f(xk)= [-0.30639535 -0.35143459]
k= 94    xk= [0.60750698 0.37014954]    f(xk)= [-1.04859775  0.21696188]
k= 95    xk= [0.63679781 0.40648354]    f(xk)= [-0.97401477  0.19441837]
k= 96    xk= [0.67603801 0.4552325 ]    f(xk)= [-0.16255821 -0.35897816]
k= 97    xk= [0.67756175 0.457579  ]    f(xk)= [-0.23538087 -0.30218326]
k= 98    xk= [0.66719264 0.44635928]    f(xk)= [-0.98940486  0.24265117]
k= 99    xk= [0.68116519 0.46516345]    f(xk)= [-0.95848113  0.23548731]
k= 100    xk= [0.72613353 0.52567098]    f(xk)= [-0.08332021 -0.31978466]
k= 101    xk= [0.72763167 0.52819241]    f(xk)= [-0.17934169 -0.25108513]
k= 102    xk= [0.72088263 0.52080286]    f(xk)= [-0.88438909  0.22621877]
k= 103    xk= [0.72930705 0.53299447]    f(xk)= [-0.86394012  0.22113746]
k= 104    xk= [0.77143759 0.59395855]    f(xk)= [-0.09998023 -0.2314799 ]
k= 105    xk= [0.772421   0.59561435]    f(xk)= [-0.14005778 -0.20396923]
k= 106    xk= [0.7641847  0.58495593]    f(xk)= [-0.77048152  0.19553579]
k= 107    xk= [0.77477743 0.60123328]    f(xk)= [-0.74585811  0.19064376]
k= 108    xk= [0.81159937 0.65777517]    f(xk)= [-0.07866333 -0.1836731 ]
k= 109    xk= [0.81222326 0.65888019]    f(xk)= [-0.10705352 -0.16528704]
k= 110    xk= [0.80623427 0.65086575]    f(xk)= [-0.66231087  0.17040917]
k= 111    xk= [0.81361762 0.66280639]    f(xk)= [-0.64378248  0.16655104]
k= 112    xk= [0.84654107 0.71601533]    f(xk)= [-0.09817672 -0.12329062]
k= 113    xk= [0.85700521 0.73465556]    f(xk)= [-0.35373609  0.03952514]
k= 114    xk= [0.83758317 0.70213926]    f(xk)= [-0.52374073  0.1187387 ]
k= 115    xk= [0.90388038 0.81436146]    f(xk)= [ 0.76163845 -0.52765703]
k= 116    xk= [0.86767736 0.75453523]    f(xk)= [-0.84467781  0.33424436]
k= 117    xk= [0.90054052 0.81082101]    f(xk)= [-0.1440905  -0.03044198]
k= 118    xk= [0.90932277 0.82694243]    f(xk)= [-0.20846799  0.01490863]
k= 119    xk= [0.87018732 0.75667299]    f(xk)= [-0.06714585 -0.1105966 ]
k= 120    xk= [0.82135459 0.67121157]    f(xk)= [ 0.76362535 -0.68235826]
k= 121    xk= [0.86037593 0.74128628]    f(xk)= [-0.63700442  0.20790696]
k= 122    xk= [0.83873931 0.70465215]    f(xk)= [-0.71455427  0.23370366]
k= 123    xk= [1.03860171 1.04308697]    f(xk)= [14.86961426 -7.12131062]
k= 124    xk= [0.84811163 0.7215493 ]    f(xk)= [-1.06909728  0.45119093]
k= 125    xk= [0.86097865 0.74383034]    f(xk)= [-1.15489858  0.50922045]
k= 126    xk= [0.66388223 0.40558191]    f(xk)= [ 8.66399552 -7.03154158]
k= 127    xk= [0.83144109 0.69757602]    f(xk)= [-2.42627201  1.25634529]
k= 128    xk= [0.78711785 0.62687425]    f(xk)= [-2.73036387  1.46394824]
k= 129    xk= [1.21594817 1.31516318]    f(xk)= [ 79.89010983 -32.67335542]
k= 130    xk= [0.80183352 0.64738549]    f(xk)= [-1.82311642  0.88970056]
k= 131    xk= [0.81130929 0.66135898]    f(xk)= [-1.39515741  0.62724291]
k= 132    xk= [0.84379242 0.71145948]    f(xk)= [-0.13482093 -0.10523573]
k= 133    xk= [0.8488667  0.72108011]    f(xk)= [-0.47388572  0.1010872 ]
k= 134    xk= [0.83939571 0.70529745]    f(xk)= [-0.56036629  0.14245826]
k= 135    xk= [0.90713661 0.81941977]    f(xk)= [ 1.07593817 -0.69541067]
k= 136    xk= [0.86501385 0.74991553]    f(xk)= [-0.84661343  0.33331324]
k= 137    xk= [0.89854797 0.80733425]    f(xk)= [-0.18342061 -0.01084162]
k= 138    xk= [0.91010896 0.82821754]    f(xk)= [-0.15037671 -0.01615486]
k= 139    xk= [0.92061081 0.84725285]    f(xk)= [-0.05883515 -0.05428094]
k= 140    xk= [0.93145142 0.8677041 ]    f(xk)= [-0.17522727  0.02046812]
k= 141    xk= [0.90995165 0.82823094]    f(xk)= [-0.25978644  0.04378789]
k= 142    xk= [0.9791429  0.95553256]    f(xk)= [ 1.20698881 -0.63765106]
k= 143    xk= [0.92368943 0.85431748]    f(xk)= [-0.56470048  0.22306163]
k= 144    xk= [0.9419506 0.8881915]    f(xk)= [-0.46295033  0.18411344]
k= 145    xk= [0.9577381  0.91746347]    f(xk)= [-0.16160146  0.04023942]
k= 146    xk= [0.96701862 0.93506488]    f(xk)= [-0.04270618 -0.01202489]
k= 147    xk= [0.97141166 0.94372517]    f(xk)= [-0.0900311   0.01691066]
k= 148    xk= [0.96094308 0.92354637]    f(xk)= [-0.12991058  0.026951  ]
k= 149    xk= [0.99570993 0.99061839]    f(xk)= [ 0.3179641  -0.16397559]
k= 150    xk= [0.97199417 0.94517136]    f(xk)= [-0.21102069  0.07973764]
k= 151    xk= [0.98787082 0.97590466]    f(xk)= [-0.03054315  0.00318098]
k= 152    xk= [0.99093975 0.98195051]    f(xk)= [-0.01372648 -0.00221709]
k= 153    xk= [0.99398549 0.98801939]    f(xk)= [-0.016895    0.00244771]
k= 154    xk= [0.95542339 0.91169344]    f(xk)= [ 0.3466759  -0.22808167]
k= 155    xk= [0.99166933 0.98353597]    f(xk)= [-0.06740292  0.02558392]
k= 156    xk= [0.9849434  0.97039446]    f(xk)= [-0.14080314  0.05619102]
k= 157    xk= [0.99794854 0.99583192]    f(xk)= [ 0.02359149 -0.01387567]
k= 158    xk= [0.99621505 0.99248275]    f(xk)= [-0.02284649  0.00766731]
k= 159    xk= [0.99710996 0.99424387]    f(xk)= [-0.0119997   0.00311882]
k= 160    xk= [0.99816951 0.99634977]    f(xk)= [-0.00661416  0.0014793 ]
k= 161    xk= [0.99951767 0.99903483]    f(xk)= [-0.00066891 -0.00014795]
k= 162    xk= [0.99970257 0.99940648]    f(xk)= [-0.001095    0.00025015]
k= 163    xk= [0.99900072 0.99800724]    f(xk)= [-0.00391391  0.00095864]
k= 164    xk= [0.99997795 0.99995574]    f(xk)= [ 1.74913071e-05 -3.08012179e-05]
k= 165    xk= [0.99997761 0.99995513]    f(xk)= [-8.70938135e-06 -1.80379225e-05]
k= 166    xk= [0.99997777 0.99995565]    f(xk)= [-9.03605856e-05  2.29477018e-05]
k= 167    xk= [0.99997745 0.99995502]    f(xk)= [-9.16328188e-05  2.32707922e-05]
k= 168    xk= [1. 1.]    f(xk)= [ 2.02059708e-07 -1.00387831e-07]
Jacobian after first solve:
[[ 116.48889476  -56.35457991]
[-225.91025031  112.73024426]]
second solve:
k= 1    xk= [0.56260435 1.12745333]    f(xk)= [-183.36781519  162.18593492]
k= 2    xk= [1. 1.]    f(xk)= [ 9.83445546e-08 -5.00097297e-08]
Jacobian after second solve:
[[ 380.39702492 -133.25513499]
[-329.18184311  142.82269653]]

您可以看到,第一次需要168次迭代,而在第一次重新使用Jacobian时,第二次只需要2次迭代。

我发现使用";Wolfe条件";linesearch,并使用";Broyden的第一种方法";(而不是Broyden的第二种方法(往往会使求解器在这里更加稳健。

我认为应该可以使用装饰器。请参见下文。根据公差或是否第一次调用梯度函数,将重新计算梯度或从缓存返回梯度。

import numpy as np
from scipy.optimize import minimize
aa = 3.14
#decorator
def smartgrad(func):
def wrapped(*args,**kwargs):
distance = abs(args[0]-wrapped.oldstate)
if ( wrapped.firstcall == True) or (distance > wrapped.tolerance):
print('recomputing...')
newgrad = func(*args,**kwargs)
wrapped.firstcall = False
wrapped.oldgrad   = newgrad
wrapped.oldstate  = args[0]
return newgrad
else:
print('returning from cache...')
return wrapped.oldgrad

wrapped.firstcall = True
wrapped.tolerance = 1E-1
wrapped.oldgrad   = 0.0
wrapped.oldstate  = 0.0

return wrapped

def ff(xx):
return (xx-aa)**2
@smartgrad
def grad(xx):
return 2*(xx-aa)
if __name__ =='__main__':
gg=grad(1.0);       # gradient is recomputed 
print(gg)
gg=grad(1.05)       # gradient is returned from cache
print(gg)
gg=grad(1.11)       # gradient is recomputed
print(gg)

最新更新