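From all the answers I've seen, this seems to be a common problem in other languages, but I'm running into it with an R data frame. I have some data that was imported with dollar signs and commas, and I can't seem to get rid of them. I've tried every str_sub-type approach I can think of, but can't strip them out. They just seem to be ignored.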
test.df <- structure(list(
  week_ending = structure(c(18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747), class = "Date"),
  store_num = c(7005, 7005, 7005, 7005, 7005, 7005, 7005, 7748, 7748, NA),
  units = c("116", "1", "6", "2", "1", "1", "2", "1", "46", "2,539"),
  cost = c("$699.36", "$14.29", "$34.02", "$0.90", "$11.47", "$1.28", "$2.16", "$1.89", "$165.81", "$16,250.83 "),
  dollars = c("$1,564.07 ", "$24.99", "$54.00", "$9.98", "$24.99", "$2.99", "$7.98", "$4.99", "$360.11", "$37,465.88 "),
  item_description = c("$1,564.07 ", "$24.99", "$54.00", "$9.98", "$24.99", "$2.99", "$7.98", "$4.99", "$360.11", "$37,465.88 ")
), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 702L, 703L, 704L), class = "data.frame")
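You can use gsub to replace or remove certain characters. Here the pattern [$,] is a character class that matches either $ or , (inside the brackets, $ is a literal character rather than the end-of-string anchor), and each match is replaced with "", the empty string.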
gsub("[$,]", "", test.df$cost)
# [1] "699.36" "14.29" "34.02" "0.90" "11.47" "1.28"
# [7] "2.16" "1.89" "165.81" "16250.83 "
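Note that the result above is still a character vector, and the last element keeps its trailing space. A minimal sketch of one way to finish the job in base R follows; the helper name clean_currency and the small data frame df are hypothetical, introduced only for illustration. It also shows why an unescaped "$" pattern appears to be ignored: as a regular expression, $ matches the end of the string, not the dollar sign.

```r
x <- c("$699.36", "$0.90", "$16,250.83 ")

# An unescaped "$" is the end-of-string anchor, so nothing visible is removed.
unescaped <- gsub("$", "", x)

# fixed = TRUE treats the pattern as a literal string instead of a regex.
literal <- gsub("$", "", x, fixed = TRUE)

# Hypothetical helper: drop "$" and ",", trim whitespace, convert to numeric.
clean_currency <- function(v) {
  as.numeric(trimws(gsub("[$,]", "", v)))
}

cleaned <- clean_currency(x)

# Apply the helper to several columns of a small illustrative data frame.
df <- data.frame(
  units = c("116", "2,539"),
  cost = c("$699.36", "$16,250.83 "),
  stringsAsFactors = FALSE
)
cols <- c("units", "cost")
df[cols] <- lapply(df[cols], clean_currency)
str(df)
```

The same lapply pattern should carry over to test.df by listing its character columns in cols.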
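Use parse_number (from readr, which is loaded with the tidyverse). It drops the non-numeric characters around the number, handles the comma as a grouping mark, and returns a numeric value: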
library(tidyverse)
test.df %>%
  mutate(
    across(.cols = c("units", "cost", "dollars", "item_description"),
           .fns = parse_number)
  )
   week_ending store_num units     cost  dollars item_description
1   2021-04-30      7005   116   699.36  1564.07          1564.07
2   2021-04-30      7005     1    14.29    24.99            24.99
3   2021-04-30      7005     6    34.02    54.00            54.00
4   2021-04-30      7005     2     0.90     9.98             9.98
5   2021-04-30      7005     1    11.47    24.99            24.99
6   2021-04-30      7005     1     1.28     2.99             2.99
7   2021-04-30      7005     2     2.16     7.98             7.98
8   2021-04-30      7748     1     1.89     4.99             4.99
9   2021-04-30      7748    46   165.81   360.11           360.11
10  2021-04-30        NA  2539 16250.83 37465.88         37465.88
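To see what parse_number does on its own, here is a small sketch that calls it on a plain character vector. The vector x is made up for illustration, and the example assumes readr's default locale, in which "," is the grouping mark.

```r
library(readr)

# Illustrative values: currency symbol, thousands separator, trailing space.
x <- c("$1,564.07 ", "$0.90", "2,539")

# parse_number() skips the non-numeric prefix and suffix and ignores
# the grouping mark, returning a double vector.
result <- parse_number(x)
print(result)
is.numeric(result)
```

Because the result is already numeric, no separate as.numeric step is needed before doing arithmetic on the columns.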