Mismatched-bracket SyntaxError when running SimpleImputer


dput(head("CustomTransformerData.csv"))

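Here is what I'm trying to do:

Apply the SimpleImputer class to the data, with the strategy set to mean. The name of this step should be "imputer".

This is the code I'm using: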

import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
fileName = "CustomTransformerData.csv"
custom_transform = pd.read_csv("CustomTransformerData.csv")
data_num = custom_transform.drop(['x3'], axis = 1); #created the df for numerical data
data_cat = custom_transform.drop(['x1', 'x2', 'x4', 'x5'], axis = 1); #created the df for categorical data
#importing sklearn
from sklearn.base import BaseEstimator,TransformerMixin
##creating the transformer
class Assignment4Transformer(BaseEstimator, TransformerMixin):
    def __init__(self, drop_x4 = True, y = None):
        self.drop_x4 = drop_x4 #flag to drop the x4 column

    def fit_transform(self, data, y=None):
        return self
from sklearn.pipeline import Pipeline #importing the pipeline
from sklearn.impute import SimpleImputer #importing the SimpleImputer
from sklearn.preprocessing import StandardScaler #importing the preprocessor
def transform(self, data): #starting the function to determine x4
    #not adding the x3 categorical data

    if self.drop_x4: #a flag to catch and drop x4, giving a new index
        data = np.delete(data, 2, axis=1)
    return np.c_[data, new_col]
num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')]) # this is where I encounter the below error
File "/var/folders/5v/f6glw1515sqbvblc482qs47c0000gn/T/ipykernel_42484/2823414947.py", line 1
num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')])
 ^
SyntaxError: closing parenthesis ']' does not match opening parenthesis '('

Alternatively, I also tried the following. The pipeline line itself ran without an error, but the code after it failed:

num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')), 
('attribs_adder', Assignment4Transformer()),])
std_scaler= StandardScaler(num_pipeline)

# Splitting the independent and dependent variables
std_scaler = data_num.data
response = data_num.target

# standardization 
scale = object.fit_transform(data_num)

TypeError                                 Traceback (most recent call last)
/var/folders/5v/f6glw1515sqbvblc482qs47c0000gn/T/ipykernel_42484/1423714864.py in <module>
----> 1 std_scaler= StandardScaler(num_pipeline)
2 
3 # Splitting the independent and dependent variables
4 std_scaler = data_num.data
5 response = data_num.target
TypeError: __init__() takes 1 positional argument but 2 were given

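So I'm not sure whether the second route is actually correct. I only need help with this part:

Apply the custom Assignment4Transformer class to the data. Make sure your custom transformer uses the default parameter of dropping the x4 column. The name of this step should be "custom_trans".

Apply the StandardScaler class to the data. The name of this step should be "std_scaler".

Data (pasted here because the dput output above doesn't seem to have come through):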

    x1    x2           x3    x4    x5
1   1.5   2.354152979  COLD  593   0.75
2   2.5   3.31404772   WARM  340   2.083333333
3   3.5   4.021604459  COLD  551   4.083333333
4   4.5                COLD  2368  6.75
5   5.5   5.847601001  WARM  2636  10.08333333
6   6.5   7.229910044  WARM  2779  14.08333333
7   7.5   7.997255234  HOT   1057  18.75
8   8.5   9.203946542  COLD  819   24.08333333
9   9.5   10.33534766  WARM  3349
10  10.5  11.11214192  HOT   3235  36.75
11  11.5  11.75961084  WARM  216   44.08333333
12  12.5  12.62909577  WARM  2529  52.08333333
13  13.5  14.08258887  COLD  1735  60.75
14  14.5  14.65767801  HOT   1254  70.08333333
15  15.5               HOT   1245  80.08333333
16  16.6  17.18411403  WARM  310   90.75
17  17.5  17.80077555  HOT   201   102.0833333
18  18.5  18.57886101  HOT   1767  114.0833333




In your first code block,

this

num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')])

should be this

num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean'))])

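I added a closing parenthesis after SimpleImputer(strategy='mean'), so that the tuple ('imputer', ...) is closed before the list's closing bracket.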
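As a quick way to confirm that the problem is purely one of bracket matching, the two versions of the line can be compiled as strings without running them. This is only an illustrative sketch: the helper name compiles is made up for this example, and nothing here needs scikit-learn or the CSV file.

```python
# Compile both versions of the line without executing them, so neither
# scikit-learn nor the CSV file is needed for this check.
broken = "num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')])"
fixed = "num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean'))])"


def compiles(source):
    """Return True if `source` is syntactically valid Python (hypothetical helper)."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(broken))  # False: the tuple's "(" is still open when "]" appears
print(compiles(fixed))   # True
```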

In your second code block,

StandardScaler is a class, and it needs to be instantiated before it is used.

In your code:

num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')), 
('attribs_adder', Assignment4Transformer()),])
std_scaler= StandardScaler(num_pipeline)

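You passed num_pipeline to the class constructor, which does not take a pipeline or data as an argument. Instead, create std_scaler first and then either call .fit or .fit_transform on the data, or add it to the pipeline as a step, like this: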

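num_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')),
                         ('attribs_adder', Assignment4Transformer())])
std_scaler = StandardScaler()
# applying to data directly (some_data is a placeholder for your numeric data)
std_scaler.fit_transform(some_data)
# or adding it to the pipeline; use only ONE of the options below,
# because step names in a pipeline must be unique
# option 1: as the first step
num_pipeline.steps.insert(0, ("scale", std_scaler))
# option 2: as the last step
num_pipeline.steps.append(("scale", std_scaler))
# option 3: at some other position, e.g. as the second step
num_pipeline.steps.insert(1, ("scale", std_scaler))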
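Putting the pieces together, here is one possible sketch of the three-step pipeline described in the question, with the step names "imputer", "custom_trans" and "std_scaler". Several things in it are assumptions rather than anything confirmed by your assignment: the transformer is rewritten to define fit and transform (a pipeline step needs both, and TransformerMixin then supplies fit_transform), it only drops x4 and does not add a new column, the mixin is listed before BaseEstimator as scikit-learn's developer guide suggests, and the data frame is built inline from the first five rows you posted instead of reading the CSV.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class Assignment4Transformer(TransformerMixin, BaseEstimator):
    """Optionally drops x4 from an array whose columns are x1, x2, x4, x5."""

    def __init__(self, drop_x4=True):
        self.drop_x4 = drop_x4  # flag to drop the x4 column

    def fit(self, X, y=None):
        return self  # nothing to learn from the data

    def transform(self, X):
        X = np.asarray(X)
        if self.drop_x4:
            # x4 sits at position 2 once the categorical x3 column is removed
            X = np.delete(X, 2, axis=1)
        return X


# First five rows of the posted data, with x3 already dropped (assumption:
# the blank x2 value in row 4 is a missing value).
data_num = pd.DataFrame({
    "x1": [1.5, 2.5, 3.5, 4.5, 5.5],
    "x2": [2.354152979, 3.31404772, 4.021604459, np.nan, 5.847601001],
    "x4": [593, 340, 551, 2368, 2636],
    "x5": [0.75, 2.083333333, 4.083333333, 6.75, 10.08333333],
})

num_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),   # fill missing values with column means
    ("custom_trans", Assignment4Transformer()),    # default drop_x4=True
    ("std_scaler", StandardScaler()),              # zero mean, unit variance per column
])

scaled = num_pipeline.fit_transform(data_num)
print(scaled.shape)  # (5, 3): five rows, and x4 has been dropped
```

With the real file, data_num would instead be custom_transform.drop(['x3'], axis=1), as in your own code.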
