Thank you for reading this.
I am trying to implement multi-label logistic regression with Theano:
import numpy
import theano
import theano.tensor as T
rng = numpy.random
examples = 5
features = 10
labels = 2
D = (rng.randn(examples, labels, features), rng.randint(size=(labels, examples), low=0, high=2))
training_steps = 10000
# Declare Theano symbolic variables
x = T.matrix("x")
y = T.vector("y")
w = theano.shared(rng.randn(1 , labels ,features), name="w")
b = theano.shared(0., name="b")
print "Initial model:"
print w.get_value(), b.get_value()
# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b)) # Probability that target = 1
prediction = p_1 > 0.5 # The prediction thresholded
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()# The cost to minimize
gw, gb = T.grad(cost, [w, b]) # Compute the gradient of the cost
# (we shall return to this in a
# following section of this tutorial)
# Compile
train = theano.function(
    inputs=[x, y],
    outputs=[prediction, xent],
    updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)),
    name='train')
predict = theano.function(inputs=[x], outputs=prediction , name='predict')
# Train
for i in range(training_steps):
    pred, err = train(D[0], D[1])
print "Final model:"
print w.get_value(), b.get_value()
print "target values for D:", D[1]
print "prediction on D:", predict(D[0])
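However, the -T.dot(x, w) product fails with this error:
TypeError: ('Bad input argument to theano function with name "train" at index 0(0-based)', 'Wrong number of dimensions: expected 2, got 3 with shape (5, 10, 2).')
x has shape (5, 2, 10) and w has shape (1, 2, 10). I expect the dot product to have shape (5, 2).
My questions are: is there a way to compute this inner product? And do you think there is a better way to implement multi-label logistic regression?
Thanks!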
----EDIT-----
Here is an implementation of what I want to do, written with numpy.
x = rng.randn(examples,labels,features)
w = rng.randn (labels,features)
dot = numpy.zeros((examples,labels))
for example in range(examples):
    for label in range(labels):
        dot[example, label] = x[example, label, :].dot(w[label, :])
print dot
Output:
[[-1.70321498 2.51088139]
[-5.73608956 0.1066286 ]
[ 2.31334531 3.31892284]
[ 1.56301872 -0.56150922]
[-1.98815855 -2.98866706]]
But I don't know how to do this symbolically with Theano.
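For reference, the double loop in the numpy version is a per-label dot product, and it can be written without loops. The sketch below is NumPy only; the names dot_loop, dot_broadcast and dot_einsum are made up for illustration. It multiplies x by w with broadcasting and sums over the feature axis, then checks the result against the loop.

```python
import numpy as np

rng = np.random.RandomState(0)
examples, labels, features = 5, 2, 10
x = rng.randn(examples, labels, features)
w = rng.randn(labels, features)

# Reference: the explicit double loop over examples and labels.
dot_loop = np.zeros((examples, labels))
for example in range(examples):
    for label in range(labels):
        dot_loop[example, label] = x[example, label, :].dot(w[label, :])

# Broadcasting: w (labels, features) is stretched along the example axis,
# then the feature axis is summed away, leaving shape (examples, labels).
dot_broadcast = (x * w).sum(axis=2)

# The same contraction spelled with einsum.
dot_einsum = np.einsum('elf,lf->el', x, w)

assert np.allclose(dot_loop, dot_broadcast)
assert np.allclose(dot_loop, dot_einsum)
```

The same multiply-and-sum would presumably carry over to Theano if x were declared with T.tensor3 instead of T.matrix, but that is an assumption and is not tested here.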
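After a few hours of fighting with it, this seems to produce the correct results:
I had a bug: the input was generated with rng.randn(examples, features, labels) instead of rng.randn(examples, features). In other words, apart from there being more labels, the input should stay the same size.
The way to compute the dot product correctly is to use theano.scan, e.g.: results, updates = theano.scan(lambda label: T.dot(x, w[label,:]) - b[label], sequences=T.arange(labels))
Thanks, everyone, for your help!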
import numpy as np
import theano
import theano.tensor as T
rng = np.random
examples = 5
features = 10
labels = 2
D = (rng.randn(examples,features), rng.randint(size=(labels, examples), low=0, high=2))
training_steps = 10000
# Declare Theano symbolic variables
x = T.matrix("x")
y = T.matrix("y")
w = theano.shared(rng.randn(labels ,features), name="w")
b = theano.shared(np.zeros(labels), name="b")
print "Initial model:"
print w.get_value(), b.get_value()
results, updates = theano.scan(lambda label: T.dot(x, w[label,:]) - b[label], sequences=T.arange(labels))
# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(- results)) # Probability that target = 1
prediction = p_1 > .5 # The prediction thresholded
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()# The cost to minimize
gw, gb = T.grad(cost, [w, b]) # Compute the gradient of the cost
# (we shall return to this in a
# following section of this tutorial)
# Compile
train = theano.function(
    inputs=[x, y],
    outputs=[prediction, xent],
    updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)),
    name='train')
predict = theano.function(inputs=[x], outputs=prediction , name='predict')
# Train
for i in range(training_steps):
    pred, err = train(D[0], D[1])
print "Final model:"
print w.get_value(), b.get_value()
print "target values for D:", D[1]
print "prediction on D:", predict(D[0])
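As a cross-check on the scan-based code above, here is a small NumPy sketch (not Theano) of the same computation. It shows that looping over labels gives the same logits as a single matrix product, and it writes out by hand the cost and the gradients that T.grad would derive. The function name cost_and_grads and the variables looped and vectorized are hypothetical, and the bias is subtracted to match the sign used in the scan.

```python
import numpy as np

rng = np.random.RandomState(0)
examples, features, labels = 5, 10, 2

x = rng.randn(examples, features)
y = rng.randint(0, 2, size=(labels, examples)).astype(float)
w = rng.randn(labels, features)
b = rng.randn(labels)

# What the scan computes: one row of logits per label, shape (labels, examples).
looped = np.stack([x.dot(w[label, :]) - b[label] for label in range(labels)])

# The same values from one matrix product, with no loop over labels.
vectorized = w.dot(x.T) - b[:, None]
assert np.allclose(looped, vectorized)


def cost_and_grads(w, b):
    """Regularized mean cross-entropy and its gradients with respect to w and b."""
    z = w.dot(x.T) - b[:, None]              # logits, (labels, examples)
    p = 1.0 / (1.0 + np.exp(-z))             # sigmoid probabilities
    xent = -y * np.log(p) - (1 - y) * np.log(1 - p)
    cost = xent.mean() + 0.01 * (w ** 2).sum()
    dz = (p - y) / y.size                    # gradient of the mean cross-entropy w.r.t. z
    gw = dz.dot(x) + 0.02 * w                # (labels, features), includes the L2 term
    gb = -dz.sum(axis=1)                     # minus sign because b is subtracted in z
    return cost, gw, gb


# A few plain gradient-descent steps with the same learning rate as above.
initial_cost = cost_and_grads(w, b)[0]
for _ in range(100):
    cost, gw, gb = cost_and_grads(w, b)
    w = w - 0.1 * gw
    b = b - 0.1 * gb
final_cost = cost_and_grads(w, b)[0]
```

If the equivalence holds in Theano as well, the scan could presumably be replaced by something like T.dot(w, x.T) - b.dimshuffle(0, 'x'); that expression is an untested assumption here.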