R - group by, then spread to add new columns



My data looks like:

patientid <- c(100,101,101,101,102,102)
weight <- c(1,1,2,3,1,2)
height <- c(0,6,0,0,0,1)
bmi <- c(0,5,0,0,0,1)

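I want to group by patient id so that the data frame has only one row per patient.
The other rows should then become additional columns (named by appending a number), so the data frame would be patientid, weight1, height1, bmi1, weight2, height2, bmi2, and so on. The number of columns would correspond to the number of times a patient id is repeated.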

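I think group_by and spread are the key functions, but I can't work out how to use them. In this example, the row for patient id 100 would have values only in the weight1, height1, and bmi1 columns; patient 101 would have values in weight1, height1, bmi1, weight2, height2, bmi2, weight3, height3, and bmi3; and patient 102 would have values in weight1, height1, bmi1, weight2, height2, and bmi2.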

A base R option using ave + reshape

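# build the data frame from the question's vectors
df <- data.frame(patientid, weight, height, bmi)

reshape(
  transform(
    df,
    # q numbers the rows 1, 2, ... within each patientid
    q = ave(patientid, patientid, FUN = seq_along)
  ),
  direction = "wide",
  idvar = "patientid",
  timevar = "q"
)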

  patientid weight.1 height.1 bmi.1 weight.2 height.2 bmi.2 weight.3 height.3 bmi.3
1       100        1        0     0       NA       NA    NA       NA       NA    NA
2       101        1        6     5        2        0     0        3        0     0
5       102        1        0     0        2        1     1       NA       NA    NA
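
To see what the helper column q holds, the ave() call can be run on its own. A minimal sketch using only the question's patientid vector:

```r
# Sample ids from the question
patientid <- c(100, 101, 101, 101, 102, 102)

# ave() applies FUN within each group defined by the second argument;
# seq_along numbers the rows 1, 2, ... inside each patient
q <- ave(patientid, patientid, FUN = seq_along)
q
# [1] 1 1 2 3 1 2
```

reshape() then uses this counter as timevar, which is where the .1, .2, .3 suffixes come from.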

Perhaps we can use pivot_wider after creating a sequence column by 'patientid'

library(tidyr)
library(data.table)  # provides rowid()
library(dplyr)
df1 %>%
  # rn numbers the rows 1, 2, ... within each patientid
  mutate(rn = rowid(patientid)) %>%
  pivot_wider(names_from = rn, values_from = c(weight, height, bmi),
              names_sep = "")
Output:

# A tibble: 3 x 10
  patientid weight1 weight2 weight3 height1 height2 height3  bmi1  bmi2  bmi3
      <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl> <dbl> <dbl>
1       100       1      NA      NA       0      NA      NA     0    NA    NA
2       101       1       2       3       6       0       0     5     0     0
3       102       1       2      NA       0       1      NA     0     1    NA

Data:

df1 <- data.frame(patientid, weight, height, bmi)
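
Since the question asks about group_by, here is a hedged variant of the same idea that uses dplyr's group_by + row_number() instead of data.table::rowid(). It also passes names_vary = "slowest", which is an assumption about the installed package version (the argument needs tidyr 1.2.0 or later) and is meant to give the column order the question describes. The name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
df1 <- data.frame(
  patientid = c(100, 101, 101, 101, 102, 102),
  weight    = c(1, 1, 2, 3, 1, 2),
  height    = c(0, 6, 0, 0, 0, 1),
  bmi       = c(0, 5, 0, 0, 0, 1)
)

wide <- df1 %>%
  group_by(patientid) %>%
  mutate(rn = row_number()) %>%   # 1, 2, ... within each patient
  ungroup() %>%
  pivot_wider(
    names_from  = rn,
    values_from = c(weight, height, bmi),
    names_sep   = "",
    names_vary  = "slowest"       # assumes tidyr >= 1.2.0
  )

names(wide)
```

With names_vary = "slowest" the columns should come out grouped per measurement (weight1, height1, bmi1, weight2, ...) rather than per variable as in the output above.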

group_by (from dplyr) and spread (from tidyr) are part of the tidyverse, I think.

I reshaped your data with base reshape, using weight as the measurement id.


patientid <- c(100,101,101,101,102,102)
weight <- c(1,1,2,3,1,2)
height <- c(0,6,0,0,0,1)
bmi <- c(0,5,0,0,0,1)
cat("data\n")
# n duplicates weight so it can serve as the within-patient measurement id
df <- data.frame(patientid = patientid,
                 n = weight,
                 weight = weight,
                 height = height,
                 bmi = bmi)
df
cat("reshaped to wide format\n")
reshape(data = df,
        idvar = "patientid",
        timevar = "n",
        # v.names = c("weight", "height", "bmi"),  # optional; defaults to all other columns
        direction = "wide")
# ?reshape
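
Using weight as the time variable works here only because weight happens to run 1, 2, 3 within each patient in the sample data. A quick sketch of how that assumption might be checked before relying on it (the names `counter` and `weight_is_counter` are illustrative, not from the answer):

```r
patientid <- c(100, 101, 101, 101, 102, 102)
weight    <- c(1, 1, 2, 3, 1, 2)

# Row counter within each patient: 1, 2, ... per id
counter <- ave(patientid, patientid, FUN = seq_along)

# TRUE only if weight coincides with the per-patient counter
weight_is_counter <- all(weight == counter)
weight_is_counter
# [1] TRUE
```

If this comes back FALSE for real data, a dedicated counter column is likely the safer choice for timevar.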
