PDF of a KDE at custom points



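I am using the density function in base R to generate a KDE for a given (1-D) data vector. The function's n argument gives the probability density estimate at n equally spaced points. Is there a way to get this estimate at a custom list of points?

I think I want the density estimate at each 0.01 percentile point, so that the points are closer together where the density is high and farther apart where there is little density, essentially aligning my PDF estimate with the likely confidence in the PDF at that point. This set of x, y pairs will be stored and used later for scoring after model development.

Those familiar with Python will recognize that this functionality is available in scipy.stats.gaussian_kde.evaluate(..).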

I think I finally understand what you mean:

Apparently, you can do this with the sm package:

library(sm)
a <- rnorm(200)
sm.density(a, eval.points = a)$estimate #the eval.points argument is the key argument you are looking for

Output:

> sm.density(a, eval.points = a)$estimate
  [1] 0.12772710 0.02405005 0.21971466 0.34392609 0.39495931 0.41543305 0.21263921 0.41537832 0.32914302 0.25565207
 [11] 0.35121705 0.27957087 0.19930803 0.41556843 0.26412647 0.32067182 0.36109746 0.33580489 0.01896655 0.41557119
 [21] 0.10733984 0.30202465 0.39557093 0.10097724 0.13841591 0.34892004 0.38626383 0.07735814 0.04421804 0.39630396
 [31] 0.38700142 0.10177375 0.19136592 0.23634829 0.24060493 0.37283049 0.38447048 0.09277430 0.38300854 0.38747915
 [41] 0.03857675 0.32614202 0.41553740 0.41109807 0.31061776 0.39805191 0.20964930 0.37428245 0.38470874 0.23212350
 [51] 0.37653126 0.06947437 0.39515910 0.40319273 0.41271155 0.24758345 0.40112930 0.41331974 0.29566411 0.39992320
 [61] 0.36686191 0.38990556 0.36492636 0.41281621 0.39267835 0.18448714 0.11787245 0.37712505 0.38775265 0.25030009
 [71] 0.41481836 0.10236957 0.39425025 0.03873721 0.08168519 0.29775494 0.34794457 0.16554033 0.36764219 0.41370926
 [81] 0.39960951 0.41306470 0.11107980 0.27943190 0.41510756 0.35634826 0.36718828 0.38085515 0.15645417 0.25692344
 [91] 0.11179099 0.22799955 0.39206820 0.41408224 0.29348350 0.15890729 0.22721980 0.38384978 0.31640118 0.03881538
[101] 0.41171143 0.41045637 0.38914218 0.40399988 0.38556505 0.27724666 0.15457874 0.36044473 0.21351522 0.37943612
[111] 0.41361048 0.40028703 0.34229100 0.40435532 0.07341782 0.34523757 0.36937555 0.26855928 0.26296213 0.40373905
[121] 0.36823187 0.19218498 0.06875183 0.38383405 0.39380643 0.09261450 0.35676087 0.41512915 0.11002953 0.22801342
[131] 0.12433048 0.13365228 0.35556910 0.37120609 0.33465014 0.41476827 0.30158998 0.41148426 0.40998579 0.29686716
[141] 0.01547056 0.41461764 0.09698607 0.32942869 0.41462633 0.29495019 0.26229083 0.41170128 0.37282610 0.40987606
[151] 0.39528089 0.33079101 0.33618617 0.41054245 0.34696030 0.32505169 0.40190879 0.23373421 0.41092030 0.21069149
[161] 0.41554976 0.37161607 0.09587529 0.23982159 0.40924851 0.28586226 0.04599452 0.41419171 0.34564851 0.37681629
[171] 0.36324057 0.17955626 0.11764356 0.29102065 0.17518755 0.01631140 0.37341812 0.23681565 0.30461539 0.31454744
[181] 0.41112586 0.22881959 0.14398296 0.41454269 0.38818158 0.36846550 0.10876282 0.25267048 0.39286846 0.29270928
[191] 0.14545077 0.34880880 0.40217248 0.32896962 0.41555177 0.33089562 0.41273214 0.08808706 0.39433817 0.06765712
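
For comparison, a Gaussian KDE can also be evaluated at arbitrary points with base R alone, since the estimate at a point is just the average of normal kernels centred on the observations. The following is a minimal sketch, not the sm implementation: kde_at is a hypothetical helper name, and the bandwidth is assumed to be bw.nrd0, the default rule used by density.

```r
# Hypothetical helper (not part of any package): evaluate a Gaussian KDE
# of the sample `x` at arbitrary `points`.
# Assumption: bandwidth defaults to bw.nrd0(x), the default rule of density().
kde_at <- function(x, points, bw = bw.nrd0(x)) {
  # The estimate at p is the mean of normal kernels centred on each observation.
  vapply(points, function(p) mean(dnorm(p, mean = x, sd = bw)), numeric(1))
}

set.seed(1)          # illustrative data only
a <- rnorm(200)
est <- kde_at(a, a)  # density estimate at each data point
```

Because sm.density selects its own smoothing parameter, the values from this sketch need not match the sm output shown above.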

If you want the estimate at the 100 quantiles (percentiles) of a:

a_quant <- quantile(a, 1:100/100)
sm.density(a_quant, eval.points = a_quant)$estimate

Output:

> sm.density(a_quant, eval.points = a_quant)$estimate
  [1] 0.01582788 0.06846862 0.08129320 0.10110568 0.11177916 0.11901171 0.12623830 0.14776146 0.15867243 0.16385618
 [11] 0.18902197 0.21554781 0.23069361 0.23686379 0.24233046 0.25667392 0.26330739 0.28597006 0.29260817 0.29466226
 [21] 0.29658942 0.30270500 0.31182163 0.31755901 0.32583914 0.32794022 0.33748016 0.34198509 0.34471986 0.34754376
 [31] 0.35899038 0.36480382 0.36772103 0.37260060 0.37961999 0.38321539 0.38542390 0.38803056 0.39019521 0.39104517
 [41] 0.39212982 0.39676851 0.39976587 0.40359531 0.40668593 0.40756583 0.40814699 0.40892928 0.40930386 0.40967425
 [51] 0.40999673 0.41007634 0.41024728 0.41040236 0.41045483 0.41036411 0.41021185 0.40643714 0.40443569 0.40383346
 [61] 0.40081718 0.39546541 0.39325763 0.39005222 0.38777023 0.38381332 0.37944776 0.37801648 0.37636178 0.37169064
 [71] 0.36890828 0.36603411 0.36374106 0.36101178 0.35648378 0.35354726 0.34896147 0.33874447 0.33061414 0.32493625
 [81] 0.29767821 0.28032682 0.27688973 0.26318047 0.25014308 0.23976716 0.23292234 0.21841618 0.21472566 0.19722509
 [91] 0.16976853 0.13176471 0.11215461 0.10400802 0.09752039 0.08865138 0.07374142 0.04617218 0.04121152 0.02798761
> length(sm.density(a_quant, eval.points = a_quant)$estimate)
[1] 100

This way you get what you need.

Sorry, at first I didn't realize what you were asking.

Hope this helps!
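
One caveat: the call sm.density(a_quant, eval.points = a_quant) estimates the density from the 100 quantiles themselves rather than from the full sample a. If the intent is to keep the KDE fitted to all of the data and only evaluate it at the quantiles, one option is to pass a as the first argument. A base R sketch of the same idea is below; it assumes that linear interpolation of a fine density grid is accurate enough, and the name lookup and the grid size of 2048 are illustrative choices.

```r
set.seed(1)                               # illustrative data only
a <- rnorm(200)
a_quant <- quantile(a, (1:100) / 100)     # the 100 percentile points of a

# KDE fitted to the full sample on a fine, equally spaced grid
d <- density(a, n = 2048)

# Linear interpolation of the gridded estimate at the custom points
est_q <- approx(d$x, d$y, xout = a_quant)$y

# x, y pairs that can be stored and reused later for scoring
lookup <- data.frame(x = unname(a_quant), density = est_q)
```

The quantiles all lie within the range of a, which density extends on both sides by default, so approx does not return NA here. Points outside the grid would need separate handling.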
