R - Pivot from long to wide using tidyr pivot_wider?



I have repeated glucose measurements in long format, as shown below:

mydata <- 
structure(list(
ID = c(4, 12, 24, 24, 24, 24, 24, 43, 50, 51, 52, 61, 67, 81, 82, 83, 88, 93, 93, 94, 100, 103, 105, 106, 107, 115, 117, 130, 130, 130, 130, 130, 130, 132, 136, 157, 173, 180, 194, 196, 230, 244, 245, 269, 288, 304, 316, 318, 334, 338, 338, 367, 378, 380), 
date = structure(c(15330, 15476, 17641, 17664, 17664, 17670, 17673, 18696, 18194, 16036, 16428, 16210, 16211, 17667, 16329, 17961, 18535, 16834, 18088, 18571, 16449, 18213, 18003, 17976, 16862, 17842, 18019, 17339, 18513, 18629, 18699, 18700, 18700, 18423, 17184, 17487, 16736, 18780, 16876, 16895, 17163, 17443, 18291, 18493, 18213, 17947, 18452, 17919, 18129, 18152, 18794, 18507, 18640, 18654), 
class = "Date"), 
name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
.Label = "gluc", 
class = "factor"), 
value = c(5.6, 5.5, 6.5, 7.6, 7.7, 7.8, 7.4, 4.3, 4.7, 5.1, 4.3, 5.2, 5.1, 5.8, 10, 5.2, 8.7, 4.5, 6.1, 4.6, 6, 5.8, 5.9, 5.5, 5.3, 5.9, 10.1, 6.4, 21.2, 5.1, 5.9, 7.4, NA, 8, 9.5, 4.6, 7, 8.1, 5.5, 7, 5, 6.2, 4.9, 4.8, 8.3, 6, 5.5, 6.8, 6.1, 4.8, 6.3, 5.7, 6.2, 13.7)), 
row.names = c(NA, -54L), 
class = c("tbl_df", "tbl", "data.frame"))

head(mydata)
# A tibble: 6 x 4
ID date       name  value
<dbl> <date>     <fct> <dbl>
1     4 2011-12-22 gluc    5.6
2    12 2012-05-16 gluc    5.5
3    24 2018-04-20 gluc    6.5
4    24 2018-05-13 gluc    7.6
5    24 2018-05-13 gluc    7.7
6    24 2018-05-19 gluc    7.8

I am trying to convert this to wide format. I tried:

# First try
lab_gluc_wide <- 
pivot_wider(
data=mydata, 
names_from=name, 
values_from=value, 
id_cols=c(ID, date))
# Second try
lab_gluc_wide <- 
pivot_wider(
data=mydata, 
names_from=name, 
values_from=c(value, date), 
id_cols=ID)

But both produce warning messages:

1: Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 
2: Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 

What I am looking for is one row per patient, with multiple columns for each glucose measurement/date.
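
As a quick check before reshaping, the duplicates that the warning refers to can be located by counting rows per ID and date. This is a minimal sketch on the first six rows shown in the head() output above; the names toy and dupes are only for illustration.

```r
library(dplyr)
library(tibble)

# First six rows of mydata, copied from the head() output (illustrative subset)
toy <- tibble(
  ID    = c(4, 12, 24, 24, 24, 24),
  date  = as.Date(c("2011-12-22", "2012-05-16", "2018-04-20",
                    "2018-05-13", "2018-05-13", "2018-05-19")),
  name  = factor("gluc"),
  value = c(5.6, 5.5, 6.5, 7.6, 7.7, 7.8)
)

# Any ID/date/name combination that occurs more than once cannot be
# placed in a single cell by pivot_wider()
dupes <- toy %>%
  count(ID, date, name) %>%
  filter(n > 1)

dupes
```

Any row returned here is a combination that pivot_wider() would have to put into a list-column unless an extra key or a summary function is supplied.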

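The problem is that your rows are not uniquely identified. Some patients have several measurements, and some (for example ID 24 on 2018-05-13) even have two on the same date, so neither ID alone nor ID plus date works as a key. You need a running measurement number within each patient to make the new column names unique, and when you reshape to wide format you also need to either reshape the date column or drop it. In my example I drop the date column.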

library(tidyverse)

mydata %>%
  group_by(ID) %>%
  mutate(ID_ID = row_number()) %>%  # running measurement number within each patient
  ungroup() %>%
  pivot_wider(id_cols = ID,
              names_from = c(name, ID_ID),
              values_from = value)

This gives:

# A tibble: 43 x 7
ID gluc_1 gluc_2 gluc_3 gluc_4 gluc_5 gluc_6
<dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
1     4    5.6   NA     NA     NA     NA       NA
2    12    5.5   NA     NA     NA     NA       NA
3    24    6.5    7.6    7.7    7.8    7.4     NA
4    43    4.3   NA     NA     NA     NA       NA
5    50    4.7   NA     NA     NA     NA       NA
6    51    5.1   NA     NA     NA     NA       NA
7    52    4.3   NA     NA     NA     NA       NA
8    61    5.2   NA     NA     NA     NA       NA
9    67    5.1   NA     NA     NA     NA       NA
10    81    5.8   NA     NA     NA     NA       NA
# ... with 33 more rows
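
If the dates should be kept as well, one possible variant is to pass both value and date to values_from, so that every measurement gets a value column and a matching date column. This is only a sketch on the first six rows from the question; the names toy, wide and visit are assumptions for illustration, and rows are sorted by date first so that the numbering is chronological.

```r
library(dplyr)
library(tidyr)

# First six rows of mydata, copied from the head() output (illustrative subset)
toy <- tibble(
  ID    = c(4, 12, 24, 24, 24, 24),
  date  = as.Date(c("2011-12-22", "2012-05-16", "2018-04-20",
                    "2018-05-13", "2018-05-13", "2018-05-19")),
  name  = factor("gluc"),
  value = c(5.6, 5.5, 6.5, 7.6, 7.7, 7.8)
)

wide <- toy %>%
  arrange(ID, date) %>%
  group_by(ID) %>%
  mutate(visit = row_number()) %>%   # 1, 2, 3, ... within each patient
  ungroup() %>%
  pivot_wider(
    id_cols     = ID,
    names_from  = c(name, visit),
    values_from = c(value, date)     # spread both measurements and dates
  )

names(wide)
```

With more than one column in values_from, pivot_wider() prefixes each new column with the source column name, so the result has columns such as value_gluc_1 and date_gluc_1, still with one row per patient.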
