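New to scikit-learn. I'm trying to fit a logistic regression to some made-up data, but I get the error "X and y have incompatible shapes. X has 1 samples, but y has 6."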
import pandas as pd
from sklearn.linear_model import LogisticRegression
# Create a sample dataframe
data = [['Age', 'ZepplinFan'], [13 , 0], [40, 1], [25, 0], [55, 0], [51, 1], [58, 1]]
columns=data.pop(0)
df = pd.DataFrame(data=data, columns=columns)
# Fit Logistic Regression
lr = LogisticRegression()
lr.fit(X=df.Age.values, y = df.ZepplinFan)
This post suggests that I need to somehow reshape df.Age.values to (n_samples, 1). How do I do that?
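Shape matters. scikit-learn expects X to be two-dimensional, (n_samples, n_features), while df.Age.values is a one-dimensional array. One way is to pass the column as a one-column DataFrame, using double brackets, like this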
In [24]: lr.fit(df[['Age']], df['ZepplinFan'])
Out[24]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001)
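To see why the double brackets matter, here is a small sketch that compares the shapes of the two selections. It rebuilds the dataframe from the question; the names age_1d and age_2d are only for illustration.

```python
import pandas as pd

# Same made-up data as in the question
df = pd.DataFrame({"Age": [13, 40, 25, 55, 51, 58],
                   "ZepplinFan": [0, 1, 0, 0, 1, 1]})

age_1d = df["Age"].values    # single brackets select a Series: 1-D array
age_2d = df[["Age"]].values  # double brackets select a DataFrame: 2-D array

print(age_1d.shape)  # (6,)
print(age_2d.shape)  # (6, 1)
```

The second shape, six rows by one column, is the (n_samples, n_features) layout that fit expects for X.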
If you want to pass the values explicitly, you can
In [25]: lr.fit(df[['Age']].values, df['ZepplinFan'].values)
Out[25]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001)
Or you can add a new axis to your existing syntax with np.newaxis (this assumes numpy has been imported as np), like
In [26]: lr.fit(df.Age.values[:,np.newaxis], df.ZepplinFan.values)
Out[26]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001)
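Since the question asks specifically about reshaping to (n_samples, 1), another option that should work is NumPy's reshape with -1 for the row count, which lets NumPy infer the number of samples. This is a minimal sketch on the same made-up data, not the only way to do it; the age of 30 passed to predict is an arbitrary value chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Same made-up data as in the question
df = pd.DataFrame({"Age": [13, 40, 25, 55, 51, 58],
                   "ZepplinFan": [0, 1, 0, 0, 1, 1]})

# -1 tells NumPy to infer the number of rows; 1 is the single feature column
X = df.Age.values.reshape(-1, 1)
y = df.ZepplinFan.values

lr = LogisticRegression()
lr.fit(X, y)

# New inputs must be 2-D as well: one row, one feature
pred = lr.predict(np.array([[30]]))
print(X.shape, pred.shape)
```

reshape(-1, 1) produces the same array as indexing with [:, np.newaxis]; which one to use is mostly a matter of taste.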