R - Warning "argument is not an atomic vector" when trying to remove whitespace



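I am in the final stage of tidying my data before analysis, and I have run into a problem I cannot quite make sense of while removing whitespace from a data frame. See the full code below for a description of each step.

I started from the following page (How to remove all whitespace from a string?) and tried to resolve the atomic vector error/warning using other pages, but without luck.

At Step 6 I get the following warning: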

In stri_replace_all_fixed(allData, " ", "") :
argument is not an atomic vector; coercing

At Step 7 I get the following warnings:

> #Change sold and taxed columns from character to numeric
> allData$SoldAmount <- as.numeric(allData$SoldAmount)
Warning message:
NAs introduced by coercion 
> allData$Tax <- as.numeric(allData$Tax)
Warning message:
NAs introduced by coercion

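Steps 6 and 7 both appear to run, but the result is NA in both columns (see image).

[Image: result after removing whitespace]

The full code is listed below. I would like to know how to make Steps 6 and 7 give me columns that contain numbers without any whitespace.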

#Step 1: Load needed library 
library(tidyverse) 
library(rvest) 
library(jsonlite)
library(stringi)
#Step 2: Access the URL 
url <- "https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/10/" 
#Step 3: Direct JSON as format of data in URL 
data <- jsonlite::fromJSON(url, flatten = TRUE) 
#Step 4: Access all items in API 
totalItems <- data$TotalNumberOfItems 
#Step 5: Summarize all data from API 
allData <- paste0('https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/', totalItems, '/') %>%
  jsonlite::fromJSON(., flatten = TRUE) %>%
  .[1] %>%
  as.data.frame() %>%
  rename_with(~str_replace(., "ListItems.", ""), everything())
#Step 6: remove columns not needed
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]
#Step 6: remove whitespace in all columns
stri_replace_all_fixed(allData, " ", "")
#Step 7: Change sold and taxed columns from character to numeric
allData$SoldAmount <- as.numeric(allData$SoldAmount)
allData$Tax <- as.numeric(allData$Tax)

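You call stri_replace_all_fixed(allData, " ", "") but discard its output instead of assigning it anywhere, so allData is unchanged when Step 7 runs. The values still contain spaces (for example " 2 400 000"), which as.numeric() cannot parse, hence the NAs. The warning itself arises because allData is a data frame (a list of columns), not an atomic vector, so stringi coerces it before doing the replacement. Apply the replacement column by column and assign the result back instead: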

#Step 6: remove whitespace in all columns
allData[] <- lapply(allData, gsub, pattern = " ", replacement = "")
#Step 7: Change sold and taxed columns from character to numeric
allData$SoldAmount <- as.numeric(allData$SoldAmount)
allData$Tax <- as.numeric(allData$Tax)
head(allData)
#     County Municipality      Tax SoldAmount           Type Date
# 1 Akershus        FROGN  2400000    2550000          Bolig 2004
# 2 Akershus        FROGN  2225000    2100000          Bolig 2004
# 3 Akershus          SKI  7600000   18000000    Næringstomt 2006
# 4  Østfold    SARPSBORG  3000000    3815000           Tomt 2004
# 5  Østfold        RYGGE 10000000   16000000 Næringseiendom 2006
# 6 Vestfold       LARVIK    61950      61950           Tomt 2013
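
To see the two problems in isolation, here is a minimal sketch on a toy data frame. The data frame `toy` and its values are made up for illustration (they only mimic the shape of allData), and it uses base R only, so it runs without the API call.

```r
# Toy stand-in for allData; the values are illustrative, not real API output
toy <- data.frame(
  County     = c("Akershus", "Sogn- og fjordane"),
  SoldAmount = c(" 2 550 000", " 18 000 000"),
  stringsAsFactors = FALSE
)

# A data frame is a list of columns, not an atomic vector,
# while a single column is atomic
is.atomic(toy)             # FALSE
is.atomic(toy$SoldAmount)  # TRUE

# Interior spaces make the strings unparseable, so conversion yields NA
before <- suppressWarnings(as.numeric(toy$SoldAmount))

# Replace column by column and assign back so the result is kept
toy[] <- lapply(toy, gsub, pattern = " ", replacement = "")
after <- as.numeric(toy$SoldAmount)
```

Note that `toy$County` now reads "Sogn-ogfjordane", because the replacement was applied to every column.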

Alternatively, do it in a single step, and only for the columns you need:

# allData <- paste0(...) %>% ...
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]
allData[c("Tax", "SoldAmount")] <- lapply(allData[c("Tax", "SoldAmount")], function(z) as.numeric(gsub(" ", "", z)))
head(allData)
#     County Municipality      Tax SoldAmount           Type Date
# 1 Akershus        FROGN  2400000    2550000          Bolig 2004
# 2 Akershus        FROGN  2225000    2100000          Bolig 2004
# 3 Akershus          SKI  7600000   18000000    Næringstomt 2006
# 4  Østfold    SARPSBORG  3000000    3815000           Tomt 2004
# 5  Østfold        RYGGE 10000000   16000000 Næringseiendom 2006
# 6 Vestfold       LARVIK    61950      61950           Tomt 2013
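
Since the question already loads the tidyverse, the same two-column conversion can probably be written with dplyr as well. This is a sketch only: it assumes dplyr 1.0 or later (for `across()`), and the tibble `toy` is a made-up stand-in for allData.

```r
library(dplyr)

# Toy stand-in for allData; the values are illustrative, not real API output
toy <- tibble(
  County     = c("Akershus", "Sogn- og fjordane"),
  Tax        = c(" 2 400 000", " 61 950"),
  SoldAmount = c(" 2 550 000", " 61 950")
)

# Strip spaces and convert, but only in the two amount columns
converted <- toy %>%
  mutate(across(c(Tax, SoldAmount),
                ~ as.numeric(gsub(" ", "", .x, fixed = TRUE))))
```

The County column is left untouched, so names that legitimately contain spaces survive.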

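Restricting the replacement to those two columns matters, because many values in the other columns contain spaces too, and I don't know whether you intend to squash them: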

str(sapply(allData, function(z) unique(grep(" ", z, value = TRUE)), simplify = FALSE))
# List of 6
#  $ County      : chr [1:2] "Møre og Romsdal" "Sogn- og fjordane"
#  $ Municipality: chr [1:4] "EVJE OG HORNNES" "VESTRE TOTEN" "ØSTRE TOTEN" "NORDRE LAND"
#  $ Tax         : chr [1:414] " 2 400 000" " 2 225 000" " 7 600 000" " 3 000 000" ...
#  $ SoldAmount  : chr [1:538] " 2 550 000" " 2 100 000" " 18 000 000" " 3 815 000" ...
#  $ Type        : chr "Annen kategori"
#  $ Date        : chr(0) 
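
If you want a safety net for the conversion, one option is a small helper that strips whitespace, converts, and reports any value that still fails to parse. This is a sketch under assumptions: `to_number` is a hypothetical name, not part of any package, and the input vector is invented for illustration.

```r
# Hypothetical helper: remove all whitespace, convert to numeric,
# and report values that could not be converted
to_number <- function(x) {
  cleaned <- gsub("[[:space:]]", "", x)
  out <- suppressWarnings(as.numeric(cleaned))
  # A failure is a non-missing, non-empty input that still became NA
  failed <- !is.na(x) & cleaned != "" & is.na(out)
  if (any(failed)) {
    message("Could not convert: ", paste(unique(x[failed]), collapse = ", "))
  }
  out
}

# Illustrative input: two amounts, an empty string, a missing value, and junk
amounts <- c(" 2 400 000", "61 950", "", NA, "n/a")
result <- to_number(amounts)
```

Applied to the real data it would be used like `allData$Tax <- to_number(allData$Tax)`, which keeps text columns such as County out of the replacement entirely.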
