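I have a data frame in which some of the column names are unclear unless you look at the other columns. Take the column "blue1", for example: it means that the blue chair by designer mal costs 5 dollars.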
df1 <- data.frame(designer = c("mal", "didi", "yels", "don"),
trashcan = c(1:4),
chair = c(1:4), blue1 = c(5:8), yellow2 = c(6:9), orange3 = c(11:14),
bedframe = c(5:8), blue4 = c(6:9), yellow5 = c(11:14), orange6 = c(16:19))
# designer trashcan chair blue1 yellow2 orange3 bedframe blue4 yellow5 orange6
# 1 mal 1 1 5 6 11 5 6 11 16
# 2 didi 2 2 6 7 12 6 7 12 17
# 3 yels 3 3 7 8 13 7 8 13 18
# 4 don 4 4 8 9 14 8 9 14 19
# would like to get
# designer trashcan chair blue1_chair yellow2_chair orange3_chair bedframe blue4_bedframe yellow5_bedframe orange6_bedframe
# 1 mal 1 1 5 6 11 5 6 11 16
# 2 didi 2 2 6 7 12 6 7 12 17
# 3 yels 3 3 7 8 13 7 8 13 18
# 4 don 4 4 8 9 14 8 9 14 19
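Things to know:
- A piece of furniture always has either 0 or 3 other colors (trashcan has 0 other colors)
- The three colors always come right after the piece of furniture
- The numbers after the colors can be random
Any suggestions?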
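We can do this in two steps with rename_with. The first call uses matches to select the names that start with "blue", "yellow", or "orange" and end in a digit from 0 to 3; the second selects the same prefixes followed by a digit from 4 to 6. We then paste "_chair" and "_bedframe" onto them, respectively. Note that this relies on the digit ranges in the example data.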
library(dplyr)
library(stringr)
df1 <- df1 %>%
rename_with(~ str_c(.x, "_chair"), matches("^(blue|yellow|orange)[0-3]$")) %>%
rename_with(~ str_c(.x, "_bedframe"),
matches("^(blue|yellow|orange)[4-6]$"))
-output
df1
designer trashcan chair blue1_chair yellow2_chair orange3_chair bedframe blue4_bedframe yellow5_bedframe orange6_bedframe
1 mal 1 1 5 6 11 5 6 11 16
2 didi 2 2 6 7 12 6 7 12 17
3 yels 3 3 7 8 13 7 8 13 18
4 don 4 4 8 9 14 8 9 14 19
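To see which columns each pattern picks up, the two regular expressions can be tried on the column names directly. This is a rough sketch that uses base grep() as a stand-in for matches() (an assumption for illustration: matches() ignores case by default, which makes no difference for these lowercase names). nms is a hypothetical copy of the original column names.

```r
# Hypothetical copy of the original column names, for illustration only
nms <- c("designer", "trashcan", "chair", "blue1", "yellow2", "orange3",
         "bedframe", "blue4", "yellow5", "orange6")

# Names the first rename_with() call would select (suffix "_chair")
chair_cols <- grep("^(blue|yellow|orange)[0-3]$", nms, value = TRUE)

# Names the second rename_with() call would select (suffix "_bedframe")
bedframe_cols <- grep("^(blue|yellow|orange)[4-6]$", nms, value = TRUE)

chair_cols
bedframe_cols
```

If the digits in the real data do not fall into these ranges, the patterns would need to be adjusted accordingly.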
Or another option with base R
nm1 <- names(df1)[3:ncol(df1)]
i1 <- !grepl('\\d+$', nm1)
i2 <- cumsum(i1)
names(df1)[-(1:2)] <- ave(nm1, i2, FUN = function(x)
replace(x, -1, paste0(x[-1], "_", x[1])))
-output
> df1
designer trashcan chair blue1_chair yellow2_chair orange3_chair bedframe blue4_bedframe yellow5_bedframe orange6_bedframe
1 mal 1 1 5 6 11 5 6 11 16
2 didi 2 2 6 7 12 6 7 12 17
3 yels 3 3 7 8 13 7 8 13 18
4 don 4 4 8 9 14 8 9 14 19
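The base R version works on position rather than on the specific digits, so it may help to look at the intermediate values. A small sketch, assuming the original column names from chair onward (nm1 is typed out here instead of being read from df1, and new_names is a hypothetical name for the result):

```r
# Column names from the third column onward, typed out for illustration
nm1 <- c("chair", "blue1", "yellow2", "orange3",
         "bedframe", "blue4", "yellow5", "orange6")

# TRUE where the name does not end in digits, i.e. a piece of furniture
i1 <- !grepl("\\d+$", nm1)

# Running count of furniture names: each furniture starts a new group
i2 <- cumsum(i1)

# Inspect the groups: the first element of each is the furniture name
split(nm1, i2)

# Within each group, keep the first name and append "_<furniture>" to the rest
new_names <- ave(nm1, i2, FUN = function(x)
  replace(x, -1, paste0(x[-1], "_", x[1])))

new_names
```

Because the grouping only depends on which names end in digits, this sketch does not care what the digits actually are.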