r - tidymodels: problem when running PCR. Error: Can't subset columns that don't exist



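I am trying to run a principal component regression (PCR) with tidymodels, but I keep running into this problem. I know there is a similar post, but the solution given there does not work in my case.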

My data

library(tidymodels)
library(magrittr)  # provides the %<>% assignment pipe used below
library(AppliedPredictiveModeling)
data(solubility)
train = solTrainY %>% bind_cols(solTrainXtrans) %>% rename(solubility = ...1)

My PCR analysis

train %<>% mutate_all(., as.numeric) %>% glimpse()
tidy_rec = recipe(solubility ~ ., data = train) %>%
  step_corr(all_predictors(), threshold = 0.9) %>%
  step_pca(all_predictors(), num_comp = ncol(train)-1) %>%
  prep()
tidy_rec %>% tidy(2) %>% select(terms) %>% distinct()
tidy_predata = tidy_rec %>% juice()
# Re-sampling
tidy_folds = vfold_cv(train, v = 10)
# Set model
tidy_rlm = linear_reg() %>%
  set_mode("regression") %>%
  set_engine("lm")
# Set workflow
tidy_wf = workflow() %>%
  add_recipe(tidy_rec) %>%
  add_model(tidy_rlm)
# Fit model
tidy_fit = tidy_wf %>%
  fit_resamples(tidy_folds)
tidy_fit %>% collect_metrics()

Error

x Fold01: recipe: Error: Can't subset columns that don't exist.
x Columns `PC1`, `PC2`, `PC3`, `PC4`, and `PC5` don't exist.
x Fold02: recipe: Error: Can't subset columns that don't exist.
x Columns `PC1`, `PC2`, `PC3`, `PC4`, and `PC5` don't exist.
x Fold03: recipe: Error: Can't subset columns that don't exist.
x Columns `PC1`, `PC2`, `PC3`, `PC4`, and `PC5` don't exist.
x Fold04: recipe: Error: Can't subset columns that don't exist.
x Columns `PC1`, `PC2`, `PC3`, `PC4`, and `PC5` don't exist.
x Fold05: recipe: Error: Can't subset columns that don't exist.
x Columns `PC1`, `PC2`, `PC3`, `PC4`, and `PC5` don't exist.
x Fold06: recipe: Error: Can't subset columns that don't exist.
.
.
.

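This happens because a workflow expects an unprepped recipe specification: the workflow preps the recipe itself when it is fitted, including once per resample in fit_resamples().

So, in your code, removing prep() from the recipe specification will eliminate the error.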

tidy_rec <- recipe(solubility ~ ., data = train) %>%
  step_corr(all_predictors(), threshold = 0.9) %>%
  step_pca(all_predictors(), num_comp = ncol(train)-1)
# no prep() here: pass the unprepped recipe to add_recipe()
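
For reference, here is a minimal end-to-end sketch of how the corrected pipeline might look, assuming current versions of tidymodels and AppliedPredictiveModeling are installed. The object names (pcr_rec, pcr_wf, and so on), the seed, and num_comp = 10 are illustrative choices, not values from the question. The recipe passed to the workflow stays unprepped; a separately prepped copy is used only to inspect the PCA step.

```r
library(tidymodels)
library(AppliedPredictiveModeling)

data(solubility)

# Combine the transformed predictors and the outcome into one data frame
train <- solTrainXtrans %>%
  mutate(solubility = solTrainY) %>%
  mutate(across(everything(), as.numeric))

set.seed(123)  # assumed seed, only so the folds are reproducible

# Unprepped recipe: this is the object the workflow should receive
pcr_rec <- recipe(solubility ~ ., data = train) %>%
  step_corr(all_predictors(), threshold = 0.9) %>%
  step_pca(all_predictors(), num_comp = 10)  # 10 is an arbitrary illustrative value

pcr_model <- linear_reg() %>%
  set_mode("regression") %>%
  set_engine("lm")

pcr_wf <- workflow() %>%
  add_recipe(pcr_rec) %>%
  add_model(pcr_model)

pcr_folds <- vfold_cv(train, v = 10)

# The workflow preps the recipe on the analysis set of each fold
pcr_fit <- fit_resamples(pcr_wf, resamples = pcr_folds)
pcr_metrics <- collect_metrics(pcr_fit)
pcr_metrics

# For inspection only: prep a separate copy, outside the workflow
pcr_prepped <- prep(pcr_rec)
pca_terms <- tidy(pcr_prepped, number = 2)        # loadings from the PCA step
pcr_baked <- bake(pcr_prepped, new_data = NULL)   # processed training data
```

Keeping the prepped copy separate preserves the tidy() and juice()-style inspection from the question without handing a trained recipe to add_recipe().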
