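I need several continuous distributions with gaps to fit some data, and for that I am subclassing `scipy.stats.rv_continuous`. Below is an example of a uniform distribution with a gap: the density is flat between `s0` and `l` and between `h` and `s1`, and zero elsewhere.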
from scipy.stats import *

class gapF_gen(rv_continuous):
    ''' Class for a flat distribution with a gap in it
    s0, s1: bounds of support
    l, h: gap
    s0 < l < h < s1
    '''
    def _argcheck(self, s0, s1, l, h): return (s0 < l < h < s1)
    def _get_support(self, s0, s1, l, h): return s0, s1
    def _pdf(self, x, s0, s1, l, h):
        if (s0 <= x <= l) or (h <= x <= s1): return 1 / (s1 - h + l - s0)
        else: return 0

gapF = gapF_gen(name='gapF')
bf = gapF(s0=-2.6, s1=4.77, l=-1.3, h=3.5)
print(bf.pdf(-2.8)) # OK
print(bf.pdf([-23.8, 3.8, 2.6, 6.9, 77.9])) # Not OK
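I defined `_pdf` so that it checks where the density should be zero. This works when a scalar value is passed to the automatically generated `pdf`, but when a list is passed to `pdf`, it fails because of the range check: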
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
On the other hand, if I rename the function so that it overrides `pdf` instead, then even for a scalar I get the error:
TypeError: _parse_args() got an unexpected keyword argument 's0'
Any suggestions on how to solve this?
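The error is simply that you cannot evaluate `s0 <= x <= l` (try it!) when `x` is a numpy array. Instead, use a bitwise and: `(s0 <= x) & (x <= l)`. Or, if you prefer something more verbose, use `np.logical_and`.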
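To make the difference concrete, here is a minimal sketch using only numpy. The sample values for `x`, `s0` and `l` are borrowed from the thread purely for illustration; the variable names `chained_ok`, `mask_and` and `mask_logical` are hypothetical.

```python
import numpy as np

x = np.array([-23.8, 3.8, 2.6, 6.9, 77.9, -2.1])
s0, l = -2.2, -1.3

# A chained comparison expands to (s0 <= x) and (x <= l); the `and`
# needs a single truth value for the whole array, so numpy raises.
try:
    mask = s0 <= x <= l
    chained_ok = True
except ValueError:
    chained_ok = False

# Element-wise alternatives: both give one boolean per element of x.
mask_and = (s0 <= x) & (x <= l)
mask_logical = np.logical_and(s0 <= x, x <= l)

print(chained_ok)
print(mask_and)
```

The parentheses around each comparison matter, because `&` binds more tightly than `<=`.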
By the way, you should not override `pdf`. Subclasses should only implement the underscored methods: `_pdf`, `_cdf`, etc.
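As an illustration of what a vectorized body for `_cdf` might look like for the gapped flat distribution in the question, here is a rough sketch written as a standalone numpy function. The name `gap_cdf` and the clipping approach are my own assumptions, not something from the thread; inside a subclass, the same expression could presumably be returned from `_cdf(self, x, s0, s1, l, h)`.

```python
import numpy as np

def gap_cdf(x, s0, s1, l, h):
    """CDF of a flat density on [s0, l] and [h, s1], zero in the gap (l, h).

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    # Constant height of the density on the two flat pieces.
    c = 1.0 / ((l - s0) + (s1 - h))
    # Length of the flat pieces lying to the left of x:
    # clipping x to each piece gives the covered length of that piece.
    covered = (np.clip(x, s0, l) - s0) + (np.clip(x, h, s1) - h)
    return c * covered

# Parameter values taken from the second code listing in the thread.
params = dict(s0=-2.2, s1=4.77, l=-1.3, h=3.5)
print(gap_cdf([-5.0, 0.0, 3.0, 10.0], **params))
```

Because `np.clip` works element-wise, no `if`/`else` on `x` is needed, which avoids the ambiguous truth value problem altogether.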
Based on @ev-bre's hint, the following works:
import numpy as np  # np.where is used below
from scipy.stats import rv_continuous

class gapF_gen(rv_continuous):
    ''' Class for a flat distribution with a gap in it
    s0, s1: bounds of support
    l, h: gap
    s0 < l < h < s1
    '''
    def _argcheck(self, s0, s1, l, h): return (s0 < l < h < s1)
    def _get_support(self, s0, s1, l, h): return s0, s1
    def _pdf(self, x, s0, s1, l, h):
        # Element-wise range check: flat on [s0, l] and [h, s1], zero elsewhere
        return np.where(((s0 <= x) & (x <= l)) | ((h <= x) & (x <= s1)), 1 / (s1 - h + l - s0), 0)

gapF = gapF_gen(name='gapF')
bf = gapF(s0=-2.2, s1=4.77, l=-1.3, h=3.5)
print(bf.pdf(-2.8))
print(bf.pdf([-23.8, 3.8, 2.6, 6.9, 77.9, -2.1]))
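A quick way to sanity-check a density like this is to confirm that it integrates to 1 over the support. The sketch below does that with plain numpy, using a standalone copy of the same `np.where` expression; the function name `gap_pdf`, the grid size `n` and the midpoint-sum approach are assumptions made for illustration and do not go through scipy.

```python
import numpy as np

def gap_pdf(x, s0, s1, l, h):
    """Standalone copy of the np.where expression (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    inside = ((s0 <= x) & (x <= l)) | ((h <= x) & (x <= s1))
    return np.where(inside, 1.0 / (s1 - h + l - s0), 0.0)

s0, s1, l, h = -2.2, 4.77, -1.3, 3.5

# Midpoint Riemann sum over the support [s0, s1].
n = 200_000
edges = np.linspace(s0, s1, n + 1)
mid = 0.5 * (edges[:-1] + edges[1:])
dx = (s1 - s0) / n
total = np.sum(gap_pdf(mid, s0, s1, l, h)) * dx

# Same evaluation points as in the listing above.
values = gap_pdf([-23.8, 3.8, 2.6, 6.9, 77.9, -2.1], s0, s1, l, h)

print(total)   # should be close to 1
print(values)
```

Only 3.8 and -2.1 fall on the flat pieces, so only those two entries are nonzero; 2.6 lies inside the gap and the remaining points lie outside the support.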