How can I access a dynamically created variable from another function if I don't want to use a class?



I have two functions, train and logreg. The main one is train, which calls logreg inside it.

When I run the train function, it gives me this error:

NameError: name 'clf_hyper' is not defined

I think the clf_hyper variable created inside logreg never reaches the train function.

The logreg function:

from sklearn import model_selection


def logreg(clf, xtrain, ytrain):
    # define a grid of parameters
    # this can be a dictionary or a list of
    # dictionaries
    param_grid = {
        # "solver": ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'],
        # "penalty": ['none', 'l1', 'l2', 'elasticnet'],
        "C": [100, 10]
    }
    # initialize grid search
    model = model_selection.GridSearchCV(
        estimator=clf,
        param_grid=param_grid,
        scoring="accuracy",
        verbose=10,
        n_jobs=1
    )
    # fit the model and extract best score
    model.fit(xtrain, ytrain)
    best_parameters = model.best_estimator_.get_params()
    for param_name in sorted(param_grid.keys()):
        print(f"\t{param_name}: {best_parameters[param_name]}")
    # initialize model with best_estimator_
    clf_hyper = model.best_estimator_

The train function:

import argparse
import os
import config
#import model_dispatcher
#import vectorizer_dispatcher
import dispatcher
import use_function
import hyperparameter
import pandas as pd
import joblib
from nltk.tokenize import word_tokenize
from sklearn import linear_model
from sklearn import metrics
from sklearn import model_selection
from sklearn.feature_extraction.text import CountVectorizer


def run(fold, model, vectorizer):
    # read the training data
    df = pd.read_csv(config.TRAINING_FILE)
    # apply clean_text to the Review column
    df.loc[:, 'Review'] = df.Review.apply(use_function.clean_text)
    # training data is where kfold is not equal to provided fold
    # also, note that we reset the index
    df_train = df[df.kfold != fold].reset_index(drop=True)
    # validation data is where kfold is equal to provided fold
    df_test = df[df.kfold == fold].reset_index(drop=True)
    # initialize CountVectorizer with NLTK's word_tokenize
    # function as tokenizer
    vectorizer = dispatcher.vectorizers[vectorizer]
    # fit count_vec on training data reviews
    vectorizer.fit(df_train.Review)
    # transform training and validation data reviews
    xtrain = vectorizer.transform(df_train.Review)
    xtest = vectorizer.transform(df_test.Review)
    ytrain = df_train.Rating
    # initialize model
    clf = dispatcher.models[model]
    # initialize hyperparameter search if you want to use it
    # if not, just comment this line out
    hyperparameter.logreg(clf, xtrain, ytrain)
    # return clf value from hyperparameter function
    # return clf_hyper

    # fit the model on training data reviews and Rating
    clf_hyper.fit(xtrain, df_train.Rating)
    # make prediction on test data
    # threshold for predictions is 0.5
    preds = clf_hyper.predict(xtest)
    # calculate accuracy
    accuracy = metrics.accuracy_score(df_test.Rating, preds)
    print(f"Fold={fold}")
    print(f"Accuracy = {accuracy}")
    print("")
    # save the model
    joblib.dump(clf, os.path.join(config.MODEL_OUTPUT, f"dt_{fold}.bin"))

So, if I don't want to use a class, how can I get the clf_hyper variable out of the logreg function so that I can use it in the train function? Thanks.

If you really want to do this (I don't recommend it), you can declare clf_hyper as a global variable:

def logreg(clf, xtrain, ytrain):
    global clf_hyper
    ...

If possible, I would instead return clf_hyper directly from logreg(). You can then get the value of clf_hyper by calling the function and storing its result.

def logreg(clf, xtrain, ytrain):
    ...
    return clf_hyper


def run(fold, model, vectorizer):
    ...
    # call the function and keep the estimator it returns
    clf_hyper = hyperparameter.logreg(clf, xtrain, ytrain)
    ...
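
To make the return-based pattern concrete without depending on scikit-learn, the sketch below uses a hypothetical ThresholdModel class in place of a real estimator and a hand-rolled search in place of GridSearchCV. The class, the toy data, and the simplified signatures are all assumptions for illustration; only the shape of the data flow (build inside logreg, return, bind in run) mirrors the question.

```python
class ThresholdModel:
    """Hypothetical stand-in for a scikit-learn estimator."""

    def __init__(self, C):
        self.C = C

    def predict(self, xs):
        # Predict 1 for every value above the threshold C, else 0.
        return [int(x > self.C) for x in xs]


def logreg(xtrain, ytrain, grid=(100, 10)):
    """Stand-in for the grid search: try each C and return the best model."""

    def accuracy(model):
        preds = model.predict(xtrain)
        return sum(p == y for p, y in zip(preds, ytrain)) / len(ytrain)

    # This local name never leaves the function; the object it refers to does.
    clf_hyper = max((ThresholdModel(C) for C in grid), key=accuracy)
    return clf_hyper


def run():
    xtrain, ytrain = [5, 20, 50, 200], [0, 1, 1, 1]
    # Bind the returned object to a name that is local to run().
    clf_hyper = logreg(xtrain, ytrain)
    return clf_hyper.C, clf_hyper.predict([8, 12])
```

The name used inside run() does not have to match the one inside logreg; what matters is that the caller stores the return value before using it.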
