Removing dollar signs and commas from strings in R



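From all the answers I've seen, this seems to be a common problem in other languages, but I'm running into it with an R data frame. I have some data that was imported with dollar signs and commas, and I can't seem to get them out. I've tried every str_sub type of thing I can think of, but nothing removes them. They just seem to be ignored.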

test.df <- structure(
  list(
    week_ending = structure(c(18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747, 18747), class = "Date"),
    store_num = c(7005, 7005, 7005, 7005, 7005, 7005, 7005, 7748, 7748, NA),
    units = c("116", "1", "6", "2", "1", "1", "2", "1", "46", "2,539"),
    cost = c("$699.36", "$14.29", "$34.02", "$0.90", "$11.47", "$1.28", "$2.16", "$1.89", "$165.81", "$16,250.83 "),
    dollars = c("$1,564.07 ", "$24.99", "$54.00", "$9.98", "$24.99", "$2.99", "$7.98", "$4.99", "$360.11", "$37,465.88 "),
    item_description = c("$1,564.07 ", "$24.99", "$54.00", "$9.98", "$24.99", "$2.99", "$7.98", "$4.99", "$360.11", "$37,465.88 ")
  ),
  row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 702L, 703L, 704L),
  class = "data.frame"
)

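You can use gsub to replace or remove parts of a string. Here the pattern [$,] is a character class that matches a literal $ or a comma, and every match is replaced with "" (the empty string).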

gsub("[$,]", "", test.df$cost)
# [1] "699.36"    "14.29"     "34.02"     "0.90"      "11.47"     "1.28"     
# [7] "2.16"      "1.89"      "165.81"    "16250.83 "
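Note that the result above is still a character vector, and the last element keeps its trailing space. A minimal base R sketch of one way to finish the conversion follows; `to_number` and `money_cols` are illustrative names that do not appear in the original post, and the small data frame is made up from a few of the question's values.

```r
# Sketch only: strip "$" and ",", trim whitespace, then convert to numeric.
# `to_number` is a hypothetical helper name, not part of base R.
to_number <- function(x) {
  as.numeric(trimws(gsub("[$,]", "", x)))
}

cost <- c("$699.36", "$0.90", "$16,250.83 ")
to_number(cost)

# Applying the same helper to several columns of a data frame at once.
df <- data.frame(
  units = c("116", "2,539"),
  cost  = c("$699.36", "$16,250.83 "),
  stringsAsFactors = FALSE
)
money_cols <- c("units", "cost")  # hypothetical: the columns to clean
df[money_cols] <- lapply(df[money_cols], to_number)
str(df)
```

Any value that still contains other non-numeric characters after the substitution would come back as NA with a warning, so it is worth checking the result with anyNA().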

Use parse_number (from readr, which is loaded with the tidyverse):

library(tidyverse)

test.df %>%
  mutate(
    across(
      .cols = c("units", "cost", "dollars", "item_description"),
      .fns = parse_number
    )
  )

   week_ending store_num units     cost  dollars item_description
1   2021-04-30      7005   116   699.36  1564.07          1564.07
2   2021-04-30      7005     1    14.29    24.99            24.99
3   2021-04-30      7005     6    34.02    54.00            54.00
4   2021-04-30      7005     2     0.90     9.98             9.98
5   2021-04-30      7005     1    11.47    24.99            24.99
6   2021-04-30      7005     1     1.28     2.99             2.99
7   2021-04-30      7005     2     2.16     7.98             7.98
8   2021-04-30      7748     1     1.89     4.99             4.99
9   2021-04-30      7748    46   165.81   360.11           360.11
10  2021-04-30        NA  2539 16250.83 37465.88         37465.88
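To see what parse_number does on its own, here is a small sketch on a plain character vector. As I understand it, it drops non-numeric characters before and after the first number and treats "," as the grouping mark under the default locale; the vectors `x` and `eu` are made-up examples, and the second call assumes you want European-style marks.

```r
library(readr)

# Default locale: "," is the grouping mark, "." the decimal mark.
x <- c("$1,564.07 ", "2,539", "$0.90")
parse_number(x)

# Assumption for illustration: European-style input, where "." groups
# thousands and "," marks the decimal.
eu <- "1.234,56"
parse_number(eu, locale = locale(decimal_mark = ",", grouping_mark = "."))
```

If you only need this one function, loading readr alone is enough; library(tidyverse) just attaches it along with dplyr.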
