PyTorch/TensorFlow: computing the gradient of a Gaussian-mixture log density



I have a mixture of three Gaussians and would like to compute the gradient of its log density using PyTorch or TensorFlow. How can I do that?

from numpy import eye, log
from scipy.stats import multivariate_normal as MVN
μs   = [[0, 0], [2, 0], [0, 2]]              # Means 
Σs   = [eye(2), eye(2), eye(2)]              # Covariance Matrices
cs   = [1 / 3] * 3                           # Mixture coefficients
MVNs = [MVN(μ, Σ) for (μ, Σ) in zip(μs, Σs)] # List of Gaussians
log_density = lambda x: log((sum([c * MVN.pdf(x) for (c, MVN) in zip(cs, MVNs)])))

Essentially, I want to compute the gradient of log_density. I tried autograd.grad, but it fails with an array-assignment error.

Attempted PyTorch solution

from torch import tensor, eye, log, exp
from torch.distributions import MultivariateNormal as MVN
μs   = [tensor([0., 0.]), tensor([2., 0.]), tensor([0., 2.])] # Means (float, as MVN requires)
Σs   = [eye(2), eye(2), eye(2)]                         # Covariance Matrices
cs   = [1 / 3] * 3                                      # Mixture coefficients
MVNs = [MVN(μ, Σ) for (μ, Σ) in zip(μs, Σs)]            # List of Gaussians
log_density = lambda x: log((sum([c * exp(MVN.log_prob(x)) for (c, MVN) in zip(cs, MVNs)])))

Attempted Autograd solution (does not work)

from numpy import eye, log, zeros
from scipy.stats import multivariate_normal as MVN
from autograd import grad
μs   = [[0, 0], [2, 0], [0, 2]]              # Means 
Σs   = [eye(2), eye(2), eye(2)]              # Covariance Matrices
cs   = [1 / 3] * 3                           # Mixture coefficients
MVNs = [MVN(μ, Σ) for (μ, Σ) in zip(μs, Σs)] # List of Gaussians
log_density = lambda x: log((sum([c * MVN.pdf(x) for (c, MVN) in zip(cs, MVNs)])))
gradient = grad(log_density)
# If you try using this gradient function you get an error
gradient(zeros(2))

The error I get is

ValueError: setting an array element with a sequence.

Naive Autograd solution

Of course, there is a clumsy Autograd solution that does not scale well. For example:

from autograd.numpy import log, eye, zeros, array
from autograd.scipy.stats import multivariate_normal as MVN
from autograd import grad
μs   = [[0, 0], [2, 0], [0, 2]]              # Means 
Σs   = [eye(2), eye(2), eye(2)]              # Covariance Matrices
cs   = [1 / 3] * 3                           # Mixture coefficients
def log_density(x):
    return log((1/3) * MVN.pdf(x, zeros(2), eye(2))
               + (1/3) * MVN.pdf(x, array([2, 0]), eye(2))
               + (1/3) * MVN.pdf(x, array([0, 2]), eye(2)))
grad(log_density)(zeros(2))  # Works!

You can do the following:

from torch import tensor, eye, log, exp
from torch.distributions import MultivariateNormal as MVN
μs   = [tensor([0., 0.]), tensor([2., 0.]), tensor([0., 2.])] # Means (float, as MVN requires)
Σs   = [eye(2), eye(2), eye(2)]                         # Covariance Matrices
cs   = [1 / 3] * 3                                      # Mixture coefficients
MVNs = [MVN(μ, Σ) for (μ, Σ) in zip(μs, Σs)]            # List of Gaussians
x = tensor((0.0,0.0), requires_grad=True)
log_density = log((sum([c * exp(MVN.log_prob(x)) for (c, MVN) in zip(cs, MVNs)])))
log_density.backward()
print(x.grad)

This prints the gradient at (0.0, 0.0). However, since PyTorch builds the computation graph dynamically, I could not find a simple way to compute the gradient at another point without rebuilding the graph. You could try TensorFlow instead, which gives you more control over the computation graph and lets you build a single graph for the gradient computation.
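That said, rebuilding the dynamic graph on each call is cheap and is the idiomatic PyTorch pattern: wrap the computation in a function and call torch.autograd.grad inside it. A minimal sketch (the gradient helper function is my addition, not part of the original answer):

```python
import torch
from torch.distributions import MultivariateNormal as MVN

mus  = [torch.tensor([0.0, 0.0]), torch.tensor([2.0, 0.0]), torch.tensor([0.0, 2.0])]
covs = [torch.eye(2)] * 3
cs   = [1 / 3] * 3
mixture = [MVN(mu, cov) for mu, cov in zip(mus, covs)]

def log_density(x):
    # Mixture log density: log of the weighted sum of component densities
    return torch.log(sum(c * torch.exp(m.log_prob(x)) for c, m in zip(cs, mixture)))

def gradient(point):
    # A fresh graph is built on every call; this is the normal PyTorch idiom
    x = torch.tensor(point, requires_grad=True)
    return torch.autograd.grad(log_density(x), x)[0]

print(gradient([0.0, 0.0]))  # gradient at (0, 0)
print(gradient([1.0, 0.0]))  # gradient at (1, 0)
```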

Edit: with TensorFlow, you can do something like this:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
import tensorflow_probability as tfp

@tf.function
def mygrad(x):
    print("building graph")
    us   = tf.stack([tf.constant([0.0, 0.0]), tf.constant([2., 0.]), tf.constant([0., 2.])])
    covs = tf.stack([tf.eye(2), tf.eye(2), tf.eye(2)])
    cs   = tf.constant([1 / 3] * 3)
    with tf.GradientTape() as gt:
        gt.watch(x)
        log_density = tf.math.log(tf.math.reduce_sum(
            tfp.distributions.MultivariateNormalTriL(us, covs).prob(x) * cs))
    return gt.gradient(log_density, x)

print(mygrad(tf.constant([0.0, 0.0])).numpy())  # gradient at (0.0, 0.0)
print(mygrad(tf.constant([1.0, 0.0])).numpy())  # gradient at (1.0, 0.0)

Essentially, you use tf.GradientTape for automatic differentiation and capture the computation graph with tf.function. The very extensive TensorFlow API documentation has more background on both.
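One refinement worth mentioning (not in the original answers): computing log(sum(c * pdf)) directly underflows far from the means, where every component density rounds to zero and the gradient becomes NaN. The standard fix is to work in log space with logsumexp. A sketch in PyTorch:

```python
import torch
from torch.distributions import MultivariateNormal as MVN

mus = [torch.tensor([0.0, 0.0]), torch.tensor([2.0, 0.0]), torch.tensor([0.0, 2.0])]
mixture = [MVN(mu, torch.eye(2)) for mu in mus]
log_c = torch.log(torch.tensor(1 / 3))

def log_density(x):
    # log sum_k c_k p_k(x) computed stably as logsumexp(log c_k + log p_k(x))
    return torch.logsumexp(torch.stack([log_c + m.log_prob(x) for m in mixture]), dim=0)

# Far from all means, where exp(log_prob) would underflow to 0:
x = torch.tensor([30.0, 30.0], requires_grad=True)
log_density(x).backward()
print(x.grad)  # finite gradient even here
```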
