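I'm using Python 2.7 on Windows. I want to fit a logistic regression model for a classification problem, using T1 and T2 as features and T3 as the target.
I show the values of T1 and T2 below, along with my code. The question is: since T1 has dimension 4 and T2 has dimension 1, how should I preprocess them so that scikit-learn's logistic regression training can use them correctly?
BTW, what I mean is that for training sample 1, the T1 feature is [ 0 -1 -2 -3] and the T2 feature is [0]; for training sample 2, the T1 feature is [ 1 0 -1 -2] and the T2 feature is [1]; and so on.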
import numpy as np
from sklearn import linear_model, datasets
arc = lambda r,c: r-c
T1 = np.array([[arc(r,c) for c in xrange(4)] for r in xrange(5)])
print T1
print type(T1)
T2 = np.array([[arc(r,c) for c in xrange(1)] for r in xrange(5)])
print T2
print type(T2)
T3 = np.array([0,0,1,1,1])
logreg = linear_model.LogisticRegression(C=1e5)
# Fit the LogisticRegression instance created above,
# using T1 and T2 as features and T3 as the target.
logreg.fit(T1+T2, T3)
T1:
[[ 0 -1 -2 -3]
 [ 1  0 -1 -2]
 [ 2  1  0 -1]
 [ 3  2  1  0]
 [ 4  3  2  1]]
T2:
[[0]
 [1]
 [2]
 [3]
 [4]]
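A note on `T1+T2` as written in the question: for NumPy arrays, `+` is element-wise addition with broadcasting, not concatenation. The sketch below rebuilds the same two arrays (with `range` instead of `xrange`, so it should run on both Python 2.7 and Python 3) and compares the shapes. The names `summed` and `joined` are only illustrative.

```python
import numpy as np

# Same values as in the question: T1 is (5, 4), T2 is (5, 1).
T1 = np.array([[r - c for c in range(4)] for r in range(5)])
T2 = np.array([[r - c for c in range(1)] for r in range(5)])

# "+" broadcasts T2's single column across all four columns of T1,
# so the result still has only 4 features per sample.
summed = T1 + T2
print(summed.shape)

# Concatenating along axis=1 keeps both feature sets side by side,
# giving 5 features per sample.
joined = np.concatenate((T1, T2), axis=1)
print(joined.shape)
```

So `logreg.fit(T1+T2, T3)` runs without an error, but it trains on four summed values per sample rather than on the five original features.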
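You need to join the feature matrices column-wise with numpy.concatenate (axis=1), so that each row holds all five feature values for one sample.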
import numpy as np
from sklearn import linear_model, datasets
arc = lambda r,c: r-c
T1 = np.array([[arc(r,c) for c in xrange(4)] for r in xrange(5)])
T2 = np.array([[arc(r,c) for c in xrange(1)] for r in xrange(5)])
T3 = np.array([0,0,1,1,1])
X = np.concatenate((T1,T2), axis=1)
Y = T3
logreg = linear_model.LogisticRegression(C=1e5)
# Fit the LogisticRegression instance created above,
# using T1 and T2 (joined into X) as features and T3 as the target.
logreg.fit(X, Y)
X_test = np.array([[1, 0, -1, -1, 1],
                   [0, 1, 2, 3, 4]])
print logreg.predict(X_test)
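One related case that may come up: if the second feature is stored as a 1-D array of shape (5,) rather than the (5, 1) column used above, `np.concatenate(..., axis=1)` raises an error because the inputs have different numbers of dimensions. A minimal sketch of two ways around that, assuming a hypothetical 1-D array named `t2_flat` (not part of the original post):

```python
import numpy as np

T1 = np.array([[r - c for c in range(4)] for r in range(5)])  # shape (5, 4)
t2_flat = np.arange(5)  # hypothetical 1-D feature, shape (5,)

# Option 1: reshape the 1-D array into a single column, then concatenate.
X_a = np.concatenate((T1, t2_flat.reshape(-1, 1)), axis=1)

# Option 2: np.column_stack turns 1-D inputs into columns automatically.
X_b = np.column_stack((T1, t2_flat))

print(X_a.shape)
print(np.array_equal(X_a, X_b))
```

Either result can be passed to `logreg.fit` in place of `X`. Test rows such as `X_test` need the same column order as the training matrix.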