I have a dataset like this:
a <- data.frame(x = c(1:5), y = c(0:4), z = c(2:6))
  x y z
1 1 0 2
2 2 1 3
3 3 2 4
4 4 3 5
5 5 4 6
I would like to get a dataset like this:
  x y z y-x z-y
1 1 0 2  -1   2
2 2 1 3  -1   2
3 3 2 4  -1   2
4 4 3 5  -1   2
5 5 4 6  -1   2
When I use:
a <- a %>% mutate(across(x:z, ~ . - lag(.)))
I get:
   x  y  z
1 NA NA NA
2  1  1  1
3  1  1  1
4  1  1  1
5  1  1  1
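In other words, mutate subtracts within each column (every row minus the previous row), whereas I need to subtract one column from another. How can I solve this?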
I wouldn't use dplyr for this. I would just use base R:
diff_cols = your_data[-1] - your_data[-ncol(your_data)]
names(diff_cols) = paste0(
  names(your_data)[-1],
  "-",
  names(your_data)[-ncol(your_data)]
)
cbind(your_data, diff_cols)
#   x y z y-x z-y
# 1 1 0 2  -1   2
# 2 2 1 3  -1   2
# 3 3 2 4  -1   2
# 4 4 3 5  -1   2
# 5 5 4 6  -1   2
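If this is needed in more than one place, the base R steps above can be wrapped in a small function. The following is only a sketch: the name `add_adjacent_diffs` and its `sep` argument are made up for illustration, and it assumes every column is numeric.

```r
# Hypothetical helper (not part of any package): append the difference
# between each column and the column immediately before it.
add_adjacent_diffs <- function(data, sep = "-") {
  stopifnot(is.data.frame(data), ncol(data) >= 2)
  later   <- data[-1]            # columns 2..n
  earlier <- data[-ncol(data)]   # columns 1..(n-1)
  diffs   <- later - earlier     # element-wise, matched by position
  names(diffs) <- paste0(names(later), sep, names(earlier))
  cbind(data, diffs)             # cbind keeps names such as "y-x" as they are
}

a <- data.frame(x = 1:5, y = 0:4, z = 2:6)
res <- add_adjacent_diffs(a)
```

Because the columns are paired by position, the same call works for any number of columns.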
You could use
library(dplyr)
df %>%
  mutate(across(x:y,
                ~ . - df[[names(df)[which(names(df) == cur_column()) + 1]]],
                .names = "{.col}-{names(df)[which(names(df) == .col) + 1]}")
  )
This returns
  x y z x-y y-z
1 1 0 2   1  -2
2 2 1 3   1  -2
3 3 2 4   1  -2
4 4 3 5   1  -2
5 5 4 6   1  -2
Warning message:
Problem while computing `..1 = across(...)`.
ℹ longer object length is not a multiple of shorter object length
but it throws a warning that I could not get rid of. 🤔
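The warning most likely comes from the `.names` specification: there `.col` holds all selected column names at once, so `names(df) == .col` compares a vector of length 3 with one of length 2. Below is a minimal sketch that looks positions up with `match()` instead and subtracts the previous column, so the result matches the requested `y-x` and `z-y`. It assumes an ungrouped data frame named `df`; the helper vector `cols` is introduced here only for illustration.

```r
library(dplyr)

df <- data.frame(x = 1:5, y = 0:4, z = 2:6)
cols <- names(df)  # illustrative helper: column names in their original order

res <- df %>%
  mutate(across(
    y:z,
    # subtract the column that sits directly before the current one
    ~ .x - df[[cols[match(cur_column(), cols) - 1]]],
    # match() is vectorised over .col, so there is no recycling of unequal lengths
    .names = "{.col}-{cols[match(.col, cols) - 1]}"
  ))
```

Since the lambda reads the neighbouring column from `df` directly, this sketch would not respect groups if `df` were grouped.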
We can use across2 from the dplyover package:
library(dplyover)
a %>%
  mutate(across2(y:z, x:y, `-`))
  x y z y_x z_y
1 1 0 2  -1   2
2 2 1 3  -1   2
3 3 2 4  -1   2
4 4 3 5  -1   2
5 5 4 6  -1   2
If the column names should be joined with - instead of _,
a %>%
  mutate(across2(y:z, x:y, `-`, .names = "{xcol}-{ycol}"))
  x y z y-x z-y
1 1 0 2  -1   2
2 2 1 3  -1   2
3 3 2 4  -1   2
4 4 3 5  -1   2
5 5 4 6  -1   2
Or with dplyr alone, using two across calls:
library(dplyr)
a %>%
  mutate(across(y:z, .names = "{.col}-{names(a)[match(.col, names(a))-1]}") -
           across(x:y))
which gives
  x y z y-x z-y
1 1 0 2  -1   2
2 2 1 3  -1   2
3 3 2 4  -1   2
4 4 3 5  -1   2
5 5 4 6  -1   2
With dplyr you can do it like this:
library(dplyr, warn.conflicts = FALSE)
df1 <- data.frame(x = c(1:5), y = c(0:4), z = c(2:6))
df1 |>
  mutate(`y-x` = y - x,
         `z-y` = z - y)
#>   x y z y-x z-y
#> 1 1 0 2  -1   2
#> 2 2 1 3  -1   2
#> 3 3 2 4  -1   2
#> 4 4 3 5  -1   2
#> 5 5 4 6  -1   2
Created on 2022-12-27 with reprex v2.0.2
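Here is a tidyr::pivot_longer + dplyr approach. The same code should work for any number of columns.
library(dplyr)
library(tidyr)
# reshape to long format: one row per (row, column name, value)
df1 <- data.frame(x = c(1:5), y = c(0:4), z = c(2:6)) %>%
  mutate(row = row_number()) %>%
  pivot_longer(-row)
# within each original row, subtract the previous column's value, then reshape back
bind_rows(df1,
          df1 %>%
            group_by(row) %>%
            mutate(name = paste0(name, "-", lag(name)), value = value - lag(value)) %>%
            ungroup() %>% filter(!is.na(value))) %>%
  pivot_wider(names_from = name, values_from = value)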
Result
# A tibble: 5 × 6
    row     x     y     z `y-x` `z-y`
  <int> <int> <int> <int> <int> <int>
1     1     1     0     2    -1     2
2     2     2     1     3    -1     2
3     3     3     2     4    -1     2
4     4     4     3     5    -1     2
5     5     5     4     6    -1     2