Why does scipy's entropy function return a value greater than 1?



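Scipy's entropy function uses base e by default when computing the Kullback-Leibler divergence. According to Wikipedia, the KL divergence cannot be greater than 1 when base e is used. However, I found a case where scipy's entropy function returns a value greater than 1. This confuses me, because I thought the KL divergence should always be less than 1 when base e is used. Can the KL divergence with base e return a value greater than 1?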

import numpy as np
from scipy.stats import entropy
a = np.array([9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 2.10191815e-311, 1.40899754e-289, 1.51339868e-268, 2.60462468e-248, 7.18266540e-229, 3.17376526e-210, 2.24704957e-192, 2.54917227e-175, 4.63376925e-159, 1.34964112e-143, 6.29869946e-129, 4.71012196e-115, 5.64367638e-102, 1.08352949e-089, 3.33325126e-078, 1.64302455e-067, 1.29768336e-057, 1.64226070e-048, 3.33015272e-040, 1.08201899e-032, 5.63318870e-026, 4.69918344e-020, 6.28115023e-015, 1.34525510e-010, 4.61656331e-007, 2.53852607e-004, 2.23662484e-002, 3.15757258e-001, 7.14269696e-001, 2.58892691e-001, 1.50357826e-002, 1.39920378e-004, 2.08633761e-007, 5.10231155e-011, 1.00634451e-008, 1.38362196e-005, 3.08403844e-003, 1.15059177e-001, 7.94326422e-001, 1.23289276e+000, 4.70067360e-001, 3.73297736e-002, 5.22437696e-004, 1.20375305e-006, 4.47599703e-010, 2.67171382e-014, 2.55648664e-019, 3.92010948e-025, 9.63196683e-032, 1.60547392e-031, 6.16530117e-025, 3.79363060e-019, 3.74033228e-014, 5.90944148e-010, 1.49663225e-006, 6.08796110e-004, 4.02164040e-002, 4.57222409e-001, 1.12643499e+000, 8.52135681e-001, 1.72454239e-001, 9.30417906e-002, 5.59827871e-001, 5.79409118e-001, 1.03846717e-001, 1.84731916e-001, 6.84387807e-001, 4.11946755e-001, 3.97315469e-002, 6.14184915e-004, 1.21694830e-004, 1.35704294e-002, 2.45540939e-001, 7.11873681e-001, 3.30697296e-001, 2.46154078e-002, 2.93583814e-004, 5.61055554e-007, 1.71802036e-010, 8.43867916e-015, 5.62831851e-013, 5.50196657e-009, 8.61798955e-006, 2.16293098e-003, 8.69817304e-002, 5.60482566e-001, 5.78688507e-001, 9.57362227e-002, 2.53779422e-003, 1.07792614e-005, 3.55072468e-007, 2.05375878e-004, 1.94355743e-002, 2.94709412e-001, 7.16043356e-001, 2.78761835e-001, 1.73890555e-002, 1.73807037e-004, 2.78360369e-007, 7.14325464e-011, 2.93720017e-015, 1.93517230e-020, 2.04293439e-026, 3.45571471e-033, 
9.36634232e-041, 4.06771817e-049, 2.83061215e-058, 3.15615674e-068, 5.63878377e-079, 1.61421360e-090, 7.40432116e-103, 5.44199547e-116, 6.40884529e-130, 1.20934464e-144, 3.65652932e-160, 1.77148191e-176, 1.37515948e-193, 1.71048037e-211, 3.40903771e-230, 1.08866485e-249, 5.57064237e-270, 4.56735784e-291, 6.00030717e-313, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101, 9.99998036e-101])
b = np.array([9.99999817e-101, 9.99999817e-101, 9.99999817e-101, 2.02566915e-322, 2.73231995e-300, 5.85483902e-279, 2.01024120e-258, 1.10594045e-238, 9.74914464e-220, 1.37706286e-201, 3.11670319e-184, 1.13030207e-167, 6.56833902e-152, 6.11625453e-137, 9.12631132e-123, 2.18223452e-109, 8.36237906e-097, 5.13595090e-085, 5.05635411e-074, 7.98139279e-064, 2.02069458e-054, 8.21015317e-046, 5.35828633e-038, 5.62548802e-031, 9.52322163e-025, 2.60964877e-019, 1.16505569e-014, 8.56616541e-011, 1.05704453e-007, 2.26687775e-005, 9.07548482e-004, 7.96282263e-003, 2.20749139e-002, 3.77654771e-002, 5.24491758e-002, 6.25507868e-002, 7.44957339e-002, 8.74057468e-002, 1.02073565e-001, 1.03482505e-001, 1.09261327e-001, 1.23073367e-001, 1.33260738e-001, 1.39046461e-001, 1.44976334e-001, 1.54826712e-001, 1.74496808e-001, 1.75622540e-001, 1.63440624e-001, 1.62227897e-001, 1.79793732e-001, 1.82026981e-001, 1.66609581e-001, 1.72399134e-001, 1.76313401e-001, 1.64008410e-001, 1.70834750e-001, 1.83002901e-001, 1.82188663e-001, 1.89022758e-001, 1.82226467e-001, 1.78970338e-001, 1.95974878e-001, 1.87797374e-001, 1.72195942e-001, 1.76511688e-001, 1.70339690e-001, 1.52968741e-001, 1.47557605e-001, 1.57225724e-001, 1.77444887e-001, 1.83423757e-001, 1.60588180e-001, 1.47852361e-001, 1.57144943e-001, 1.68812156e-001, 1.63506780e-001, 1.59701993e-001, 1.60628547e-001, 1.63740728e-001, 1.47115457e-001, 1.32379715e-001, 1.38692629e-001, 1.47987920e-001, 1.55725553e-001, 1.66184610e-001, 1.64971068e-001, 1.48899021e-001, 1.55917828e-001, 1.55918198e-001, 1.43948638e-001, 1.35234049e-001, 1.37186391e-001, 1.41463859e-001, 1.44387015e-001, 1.40677339e-001, 1.27750817e-001, 1.22727262e-001, 1.19810008e-001, 1.25121868e-001, 1.24886485e-001, 1.14156582e-001, 1.10666686e-001, 1.10821705e-001, 1.17673864e-001, 1.20996113e-001, 1.18674409e-001, 1.13875992e-001, 1.16487609e-001, 1.09680922e-001, 1.07337190e-001, 1.06478860e-001, 9.37957831e-002, 8.14551863e-002, 7.22193123e-002, 7.53889470e-002, 
8.79760270e-002, 8.48151926e-002, 8.13069349e-002, 7.84095385e-002, 7.01163440e-002, 5.98483166e-002, 4.87074799e-002, 4.74906662e-002, 5.01973950e-002, 5.46738067e-002, 5.16237693e-002, 4.77157911e-002, 4.92647403e-002, 4.88779426e-002, 4.60921374e-002, 3.88358487e-002, 3.01211045e-002, 3.13199935e-002, 3.88591162e-002, 4.06832674e-002, 3.22598028e-002, 2.59866556e-002, 2.02839105e-002, 1.77907500e-002, 1.97505586e-002, 1.78936020e-002, 1.57982342e-002, 1.35378364e-002, 1.68566539e-002, 2.35887107e-002, 2.09031256e-002, 1.47182334e-002, 1.15023796e-002, 1.19794379e-002, 1.59357951e-002, 1.65216245e-002, 1.28095419e-002, 7.93256798e-003, 9.02787274e-003, 1.32950519e-002, 1.21701689e-002, 1.00311338e-002, 8.11077276e-003, 7.43666328e-003, 8.16972981e-003, 8.99712316e-003, 8.43013952e-003, 7.72266277e-003, 6.30075858e-003, 3.65282089e-003, 3.88808826e-003, 5.11983598e-003, 5.91272475e-003, 5.58310228e-003, 6.41794718e-003, 3.46351408e-003, 3.08311471e-003, 4.15156628e-003, 4.08583372e-003, 3.32346757e-003, 1.12388832e-003, 1.06533512e-003, 1.74551868e-003, 6.26931368e-004, 4.54748858e-005, 8.72897755e-005, 6.72984809e-004, 8.36526148e-004, 1.66610811e-004, 5.31710843e-006, 2.71892069e-008, 2.22774894e-011, 2.92472230e-015, 6.15250123e-020, 2.07380284e-025, 1.12003580e-031, 9.69270572e-039, 1.34402341e-046, 2.98618922e-055, 1.06310582e-064, 6.06435130e-075, 5.54294910e-086, 8.11794515e-098, 1.90502210e-110]) 
print(entropy(a, b))
# Output: 1.2780983143936264

The KL divergence with base e has no upper bound. The introductory paragraph on Wikipedia may be somewhat misleading:

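"In the simple case, a Kullback-Leibler divergence of 0 indicates that we can expect similar, if not the same, behavior of two different distributions, while a Kullback-Leibler divergence of 1 indicates that the two distributions behave in such a different manner that the expectation given the first distribution approaches zero."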

As pointed out in the comments, there is no upper bound, and values above 1 are valid KL divergences.
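To see why there is no upper bound, consider a rough sketch with a two-point distribution. If p puts all its mass on the first outcome and q assigns that outcome probability eps, the divergence works out to -ln(eps), which grows without limit as eps shrinks. The helper name kl_two_point and the chosen eps values are only illustrative assumptions, not part of scipy:

```python
import numpy as np
from scipy.stats import entropy

def kl_two_point(eps):
    """KL(p || q) in nats for p = [1, 0] and q = [eps, 1 - eps] (hypothetical helper)."""
    p = np.array([1.0, 0.0])
    q = np.array([eps, 1.0 - eps])
    # With two arguments, entropy() returns sum(p * log(p / q)), using base e by default.
    return entropy(p, q)

for eps in (0.5, 0.1, 1e-3, 1e-6):
    # The second printed number is the closed form -ln(eps), shown for comparison.
    print(eps, kl_two_point(eps), -np.log(eps))
```

Once eps drops below 1/e (about 0.37), the divergence exceeds 1, so a result such as the one in the question is not a sign of a bug.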
