Error when doing cross-validation (CV) on training and test datasets



I am trying to do cross-validation (CV) on my training and test datasets. I am using LinearRegression. However, when I run the code I get the error below. How can I fix this? Is the CV part of my code correct? Thanks for your help.

CV code reference: scikit learn cross_validation over-fitting or under-fitting

X_normalized = scaled_df[["Part's Z-Height (mm)","Part's Solid Volume (cm^3)","Layer Height (mm)","Printing/Scanning Speed (mm/s)","Part's Orientation (Support's volume) (cm^3)"]]
y_for_normalized = scaled_df[["Climate change (kg CO2 eq.)","Climate change, incl biogenic carbon (kg CO2 eq.)","Fine Particulate Matter Formation (kg PM2.5 eq.)","Fossil depletion (kg oil eq.)","Freshwater Consumption (m^3)","Freshwater ecotoxicity (kg 1,4-DB eq.)","Freshwater Eutrophication (kg P eq.)","Human toxicity, cancer (kg 1,4-DB eq.)","Human toxicity, non-cancer (kg 1,4-DB eq.)","Ionizing Radiation (Bq. C-60 eq. to air)","Land use (Annual crop eq. yr)","Marine ecotoxicity (kg 1,4-DB eq.)","Marine Eutrophication (kg N eq.)","Metal depletion (kg Cu eq.)","Photochemical Ozone Formation, Ecosystem (kg NOx eq.)","Photochemical Ozone Formation, Human Health (kg NOx eq.)","Stratospheric Ozone Depletion (kg CFC-11 eq.)","Terrestrial Acidification (kg SO2 eq.)","Terrestrial ecotoxicity (kg 1,4-DB eq.)"]]
Part's Z-Height (mm)    Part's Solid Volume (cm^3)  Layer Height (mm)   Printing/Scanning Speed (mm/s)  Part's Orientation (Support's volume) (cm^3)    Climate change (kg CO2 eq.) Climate change, incl biogenic carbon (kg CO2 eq.)   Fine Particulate Matter Formation (kg PM2.5 eq.)    Fossil depletion (kg oil eq.)   Freshwater Consumption (m^3)    Freshwater ecotoxicity (kg 1,4-DB eq.)  Freshwater Eutrophication (kg P eq.)    Human toxicity, cancer (kg 1,4-DB eq.)  Human toxicity, non-cancer (kg 1,4-DB eq.)  Ionizing Radiation (Bq. C-60 eq. to air)    Land use (Annual crop eq. yr)   Marine ecotoxicity (kg 1,4-DB eq.)  Marine Eutrophication (kg N eq.)    Metal depletion (kg Cu eq.) Photochemical Ozone Formation, Ecosystem (kg NOx eq.)   Photochemical Ozone Formation, Human Health (kg NOx eq.)    Stratospheric Ozone Depletion (kg CFC-11 eq.)   Terrestrial Acidification (kg SO2 eq.)  Terrestrial ecotoxicity (kg 1,4-DB eq.)
0   0.258287    0.005030    0.0 0.666667    0.040088    0.069825    0.056976    0.083205    0.010373    0.113808    0.104798    0.086400    0.110358    0.012836    0.091120    0.108676    0.090401    0.087426    0.125608    0.079028    0.080495    0.078380    0.082404    0.045040
1   0.258287    0.005030    0.2 0.666667    0.036597    0.041682    0.022880    0.074884    0.004841    0.045640    0.102285    0.082884    0.044202    0.005414    0.086700    0.105749    0.087161    0.084130    0.060373    0.072878    0.073529    0.074829    0.075438    0.018122
2   0.258287    0.009557    0.4 0.666667    0.031013    0.033310    0.012113    0.073035    0.003458    0.023401    0.102914    0.082494    0.022690    0.003231    0.086279    0.105749    0.086937    0.084130    0.039708    0.071341    0.071981    0.074698    0.073447    0.009856
3   0.258287    0.009054    0.6 0.666667    0.031013    0.029213    0.006954    0.072111    0.002766    0.012936    0.102914    0.082103    0.012524    0.001921    0.086069    0.105423    0.086602    0.084130    0.029579    0.070572    0.071207    0.074435    0.072452    0.005723
4   0.258287    0.010060    1.0 0.666667    0.031711    0.025650    0.001795    0.071803    0.003458    0.002180    0.103542    0.082884    0.002063    0.001048    0.086490    0.106074    0.087049    0.084542    0.019449    0.070572    0.071207    0.074961    0.072452    0.001908
5   0.258287    0.005030    0.0 0.000000    0.040088    0.074279    0.062360    0.084129    0.011065    0.125000    0.104798    0.086790    0.121114    0.014146    0.091330    0.108676    0.091519    0.087426    0.136143    0.080566    0.081269    0.078511    0.083400    0.049385
6   0.258287    0.038226    0.0 0.666667    0.040088    0.097791    0.074249    0.109091    0.038036    0.135174    0.129299    0.111788    0.132164    0.024625    0.116582    0.133725    0.116102    0.112970    0.154781    0.105166    0.106037    0.104419    0.108280    0.064222
7   0.137212    0.004527    0.0 0.666667    0.030314    0.058247    0.046433    0.076117    0.003458    0.095349    0.099144    0.080150    0.092382    0.008907    0.084806    0.102821    0.084702    0.081246    0.106159    0.072878    0.073529    0.072199    0.075438    0.035608
8   0.137212    0.004527    0.2 0.666667    0.029616    0.035269    0.017721    0.069954    0.000000    0.037355    0.098516    0.078197    0.036246    0.002794    0.082281    0.101520    0.082803    0.080010    0.051053    0.068266    0.068885    0.070489    0.070462    0.013247
9   0.137212    0.010060    0.4 0.666667    0.028918    0.031706    0.010543    0.072111    0.002766    0.020494    0.102285    0.081712    0.019891    0.002358    0.085438    0.104773    0.086043    0.083306    0.036467    0.070572    0.071207    0.073908    0.072452    0.008372
10  0.137212    0.010060    0.6 0.666667    0.028220    0.027431    0.005384    0.070878    0.001383    0.010320    0.101657    0.080931    0.010019    0.001484    0.084806    0.104448    0.085373    0.082894    0.026742    0.069803    0.070433    0.073251    0.071457    0.004345
11  0.137212    0.009557    1.0 0.666667    0.027522    0.022800    0.000000    0.069029    0.000000    0.000000    0.101029    0.080150    0.000000    0.000000    0.083754    0.103472    0.084367    0.081658    0.016613    0.068266    0.068885    0.072330    0.070462    0.000000
12  0.137212    0.004527    0.0 0.000000    0.030314    0.062879    0.052266    0.077042    0.004149    0.107122    0.099144    0.080541    0.103875    0.010217    0.085227    0.102821    0.085037    0.081658    0.117099    0.073647    0.074303    0.072462    0.076433    0.040165
13  0.137212    0.037723    0.0 0.666667    0.030314    0.085857    0.063257    0.102003    0.031120    0.116134    0.123645    0.105929    0.112568    0.020695    0.110269    0.127544    0.110515    0.106790    0.134522    0.098247    0.099071    0.097843    0.101314    0.053624
14  0.077118    0.004527    0.0 0.666667    0.054050    0.080335    0.064827    0.091217    0.018672    0.126453    0.111709    0.093821    0.122145    0.016766    0.098485    0.115833    0.098223    0.094842    0.139789    0.087485    0.088235    0.085876    0.090366    0.052777
15  0.077118    0.004527    0.0 0.000000    0.054050    0.085144    0.070884    0.092450    0.019364    0.138081    0.111709    0.094211    0.133638    0.018075    0.099116    0.116158    0.098223    0.094842    0.151135    0.088253    0.089009    0.086139    0.091361    0.057864
16  0.077118    0.004527    0.0 0.333333    0.054050    0.082472    0.067519    0.091834    0.019364    0.132267    0.111709    0.094211    0.127744    0.017639    0.098695    0.116158    0.098223    0.094842    0.144652    0.087485    0.088235    0.086007    0.091361    0.054684
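from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

lin_regressor = LinearRegression()

# pass the order of your polynomial here
poly = PolynomialFeatures(1)

# transform the features for use in the linear regression
# (x_train and y_train come from an earlier train/test split that is not shown here)
X_transform = poly.fit_transform(x_train)

# fit the linear regressor on the transformed features
linear_regg = lin_regressor.fit(X_transform, y_train)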

import numpy as np
from sklearn.metrics import SCORERS
from sklearn.model_selection import KFold
scorer = SCORERS['r2']
cv = KFold(n_splits=5, random_state=0,shuffle=True)
train_scores, test_scores = [], []
for train, test in cv.split(X_normalized):
    X_transform2 = poly.fit_transform(X_normalized)
    OL=lin_regressor.fit(X_transform2.iloc[train], y_for_normalized.iloc[train])
    tr_21 = OL.score(X_train, y_train)
    ts_21 = OL.score(X_test, y_test)
    print ("Train score:", tr_21) # from documentation .score returns r^2
    print ("Test score:", ts_21)   # from documentation .score returns r^2

    train_scores.append(tr_21)
    test_scores.append(ts_21)

print ("The Mean for Train scores is:",(np.mean(train_scores)))

print ("The Mean for Test scores is:",(np.mean(test_scores)))




--------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/var/folders/mm/r4gnnwl948zclfyx12w803040000gn/T/ipykernel_73165/2276765730.py in <module>
10 for train, test in cv.split(X_normalized):
11     X_transform2 = poly.fit_transform(X_normalized)
---> 12     OL=lin_regressor.fit(X_transform2.iloc[train], y_for_normalized.iloc[train])
13     tr_21 = OL.score(X_train, y_train)
14     ts_21 = OL.score(X_test, y_test)
AttributeError: 'numpy.ndarray' object has no attribute 'iloc'

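The traceback points at `X_transform2.iloc[train]`. By default, `PolynomialFeatures.fit_transform` returns a NumPy array even when it is given a DataFrame, and arrays are indexed positionally with plain brackets rather than `.iloc`. A minimal sketch that reproduces this, using a small made-up DataFrame `df` as a stand-in for `X_normalized` (the data and the `rows` indices are assumptions for illustration only):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical stand-in for X_normalized: 6 rows, 2 features.
df = pd.DataFrame({"a": np.arange(6.0), "b": np.arange(6.0) ** 2})

poly = PolynomialFeatures(1)
transformed = poly.fit_transform(df)   # NumPy array by default, not a DataFrame

rows = np.array([0, 2, 4])             # e.g. the kind of index array KFold.split yields
print(type(transformed).__name__)      # the class of the transformed data
print(hasattr(transformed, "iloc"))    # whether .iloc is available on it
subset = transformed[rows]             # positional indexing works on an array
print(subset.shape)                    # degree 1 adds a bias column to the 2 features
```

So either index the DataFrame with `.iloc` before transforming it, or index the transformed array with `X_transform2[train]`.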

Try this:

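import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# X and y are your feature and target data (e.g. X_normalized and y_for_normalized)
model = LinearRegression()
kfold_validation = KFold(10)
# cross_val_score refits the model on each training fold and scores it on the held-out fold
results = cross_val_score(model, X, y, cv=kfold_validation)
print(results)
print(np.mean(results))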
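If you also want the training scores and the polynomial step from the question, one possible sketch is to wrap `PolynomialFeatures` and `LinearRegression` in a pipeline and call `cross_validate` with `return_train_score=True`. The data below is synthetic placeholder data (an assumption, chosen only to match the 17-row, 5-feature shape in the question); substitute your own `X_normalized` and `y_for_normalized`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic placeholder data (assumption): 17 rows, 5 features, 3 targets.
rng = np.random.default_rng(0)
X = rng.random((17, 5))
W = rng.random((5, 3))
y = X @ W + 0.01 * rng.standard_normal((17, 3))

# The pipeline refits the polynomial transform on each training fold only.
model = make_pipeline(PolynomialFeatures(1), LinearRegression())
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(model, X, y, cv=cv, scoring="r2", return_train_score=True)

print("Mean train R^2:", np.mean(scores["train_score"]))
print("Mean test R^2:", np.mean(scores["test_score"]))
```

Putting the transform inside the pipeline keeps the held-out fold out of the fitting step, which is the usual reason to prefer this over transforming the whole dataset up front.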

Here is the edited code:

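import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(1)
lin_regressor = LinearRegression()
cv = KFold(n_splits=5, random_state=0, shuffle=True)
train_scores, test_scores = [], []
for train, test in cv.split(X_normalized):
    # .iloc works here because X_normalized and y_for_normalized are still DataFrames
    X_tr, X_te = X_normalized.iloc[train], X_normalized.iloc[test]
    y_tr, y_te = y_for_normalized.iloc[train], y_for_normalized.iloc[test]
    # fit the transform on the training fold only; the results are NumPy arrays
    X_tr_poly = poly.fit_transform(X_tr)
    X_te_poly = poly.transform(X_te)
    OL = lin_regressor.fit(X_tr_poly, y_tr)
    tr_21 = OL.score(X_tr_poly, y_tr)  # from documentation .score returns r^2
    ts_21 = OL.score(X_te_poly, y_te)  # from documentation .score returns r^2
    print("Train score:", tr_21)
    print("Test score:", ts_21)

    train_scores.append(tr_21)
    test_scores.append(ts_21)

print("The Mean for Train scores is:", np.mean(train_scores))

print("The Mean for Test scores is:", np.mean(test_scores))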

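Note: I am a little concerned about what your y_for_normalized looks like. I think the rest of your code is correct, and you are now visualizing the data correctly. Please check this code.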
