R - I get the error "Error in knn(train = prc_train, test = prc_test, cl = prc_train_labels, : no missing va



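I am a beginner in data science with R and cannot resolve this error. The data set I am using is the Prostate Cancer data set. The error occurs at the line that assigns prc_test_pred and reads: Error in knn(train = prc_train, test = prc_test, cl = prc_train_labels, : no missing values are allowed.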

stringsAsFactors = FALSE 
str(prc) 
prc <- prc[-1]  # remove the first variable (id) from the data set
table(prc$diagnosis_result)  # count the patients per diagnosis
prc$diagnosis <- factor(prc$diagnosis_result, levels = c("B", "M"), labels = c("Benign", "Malignant"))  # relabel the diagnosis codes
round(prop.table(table(prc$diagnosis)) * 100, digits = 1)  # class shares in percent, rounded to 1 decimal place (hence digits = 1)
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))  # very important step: rescales each variable to a common scale
}
prc_n <- as.data.frame(lapply(prc[2:9], normalize))
summary(prc_n$radius)
prc_train <- prc_n[1:65,]
prc_test <- prc_n[66:100,]
prc_train_labels <- prc[1:65, 1]
prc_test_labels <- prc[66:100, 1] 
library(class)
prc_test_pred <- knn(train = prc_train, test = prc_test, cl = prc_train_labels,k=10)
library(gmodels)
CrossTable(x=prc_test_labels, y=prc_test_pred, prop.chisq=FALSE)

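I am not sure exactly what problem you are facing. The issue may lie in your first line (the one that reads the CSV file). For reproducibility, here is a simple KNN classifier that uses everything except the first line of your code.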

#
prc <- read.csv("https://raw.githubusercontent.com/duttashi/learnr/master/data/misc/Prostate_Cancer.csv", header = TRUE, stringsAsFactors = FALSE)
prc <- prc[-1]  
prc$diagnosis <- factor(prc$diagnosis_result, levels = c("B", "M"), labels = c("Benign", "Malignant"))

normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}
prc_n <- as.data.frame(lapply(prc[2:9], normalize))

prc_train <- prc_n[1:65,]
prc_test <- prc_n[66:100,]
prc_train_labels <- prc[1:65, 1]
prc_test_labels <- prc[66:100, 1] 
library(class)
prc_test_pred <- knn(train = prc_train, test = prc_test, cl = prc_train_labels,k=10)
library(gmodels)
CrossTable(x=prc_test_labels, y=prc_test_pred, prop.chisq=FALSE) 
# -------------------------------------------------------------------------

# Cell Contents
#   |-------------------------|
#   |                       N |
#   |           N / Row Total |
#   |           N / Col Total |
#   |         N / Table Total |
#   |-------------------------|
#   
#   
#   Total Observations in Table:  35 
# 
# 
#                   | prc_test_pred 
#   prc_test_labels |         B |         M | Row Total | 
#   ----------------|-----------|-----------|-----------|
#                 B |         6 |        13 |        19 | 
#                   |     0.316 |     0.684 |     0.543 | 
#                   |     0.857 |     0.464 |           | 
#                   |     0.171 |     0.371 |           | 
#   ----------------|-----------|-----------|-----------|
#                 M |         1 |        15 |        16 | 
#                   |     0.062 |     0.938 |     0.457 | 
#                   |     0.143 |     0.536 |           | 
#                   |     0.029 |     0.429 |           | 
#   ----------------|-----------|-----------|-----------|
#      Column Total |         7 |        28 |        35 | 
#                   |     0.200 |     0.800 |           | 
#   ----------------|-----------|-----------|-----------|
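
The cross table above can be condensed into a few summary figures. The following is only a sketch: it re-enters the four cell counts from the printed output by hand, and the names `conf_mat`, `accuracy`, and `recall` are my own and do not appear in the original code.

```r
# Cell counts copied from the CrossTable output above
# (rows = actual label, columns = predicted label).
conf_mat <- matrix(c(6, 13,
                     1, 15),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(actual = c("B", "M"),
                                   predicted = c("B", "M")))

# Overall accuracy: correctly classified cases over all test cases.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Recall per class: correct predictions divided by the row total.
recall <- diag(conf_mat) / rowSums(conf_mat)

print(accuracy)
print(recall)
```

With these counts the accuracy is 21/35 = 0.6. The recall for malignant cases is 15/16, while 13 of the 19 benign cases are predicted as malignant, which matches the row proportions shown in the table.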

I hope you can reproduce the same result.
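
If the error still appears with your own copy of the data, it may help to inspect the inputs before calling knn(). As far as I can tell, knn() from the class package stops with "no missing values are allowed" whenever train, test, or cl contains an NA (NaN counts as well). The sketch below uses a small made-up data frame, `toy` (hypothetical, not the prostate data), to show two ways such values could arise in this workflow: a missing entry in the raw data, and a constant column, for which min-max scaling gives 0/0 = NaN. All names other than knn() and normalize() are assumptions for illustration, and dropping rows and columns is only one possible repair.

```r
library(class)

# Hypothetical toy data (not the prostate set): one NA in `radius`
# and a constant column `flat`.
toy <- data.frame(
  label  = c("B", "M", "B", "M", "B", "M"),
  radius = c(10, 12, NA, 15, 11, 14),
  area   = c(100, 150, 120, 180, 110, 160),
  flat   = rep(5, 6)
)

normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

toy_n <- as.data.frame(lapply(toy[2:4], normalize))

# Count NA/NaN per column; any non-zero count makes knn() stop.
na_counts <- colSums(is.na(toy_n))
print(na_counts)

# Capture the error message instead of aborting the script.
err <- tryCatch(
  knn(train = toy_n[1:4, ], test = toy_n[5:6, ],
      cl = toy$label[1:4], k = 1),
  error = function(e) conditionMessage(e)
)
print(err)

# One possible repair: drop incomplete rows and constant columns
# before normalizing.
complete <- toy[complete.cases(toy), ]
features <- complete[2:4]
is_constant <- sapply(features, function(x) max(x) == min(x))
clean_n <- as.data.frame(lapply(features[!is_constant], normalize))

# Fail early with a clear message if anything is still missing.
stopifnot(!anyNA(clean_n), !anyNA(complete$label))

pred <- knn(train = clean_n[1:4, ], test = clean_n[5, , drop = FALSE],
            cl = complete$label[1:4], k = 1)
print(pred)
```

On the real data, the same check would be `colSums(is.na(prc_n))` together with `anyNA(prc_train_labels)`. If either reports missing values, comparing `str(prc)` against the output of the read.csv() call shown above should indicate whether the file was read as expected.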
