Reorganize data frame elements based on row contents in R



I have this dataset:

df <- structure(list(V1 = c("B1D01", "B1D01", "B1D01", "B1D01", "B1D01", 
"B1D01", "U0155"), V2 = c("U0155", "U0155", "U0155", "U0155", 
"U0155", "U0155", "U3003"), V3 = c("U3003", "U3003", "C1B00", 
"U3003", "U3003", "U3003", "C1B00"), V4 = c("C1B00", "C1B00", 
"U0073", "C1B00", "C1B00", "C1B00", "P037D"), V5 = c("P037D", 
"P037D", NA, "P037D", "P037D", "P037D", "P0616"), V6 = c("P0616", 
"P0616", NA, "P0616", "P0616", "P0616", "P0562"), V7 = c("P0562", 
"P0562", NA, "P0562", "P0562", "P0562", "U0073"), V8 = c("U0073", 
"U0073", NA, "U0073", "U0073", "U0073", NA)), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7", "V8"), row.names = 1719:1725, class = "data.frame")

When I run print(df), I get:

        V1    V2    V3    V4    V5    V6    V7    V8
1719 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1720 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1721 B1D01 U0155 C1B00 U0073  <NA>  <NA>  <NA>  <NA>
1722 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1723 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1724 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1725 U0155 U3003 C1B00 P037D P0616 P0562 U0073  <NA>

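As you can see, the codes are not consistently aligned to columns. For example, U3003 appears mostly in V3, but it also shows up in V2 (in the last row).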
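To make the misalignment easier to see, the following is a minimal base R sketch that counts how often each code occurs in each column. The data is rebuilt row by row so the snippet runs on its own; the helper names `full`, `m`, and `counts` are illustrative and not part of the original question.

```r
# Rebuild the example data from the question (same values, written row by row).
full <- c("B1D01", "U0155", "U3003", "C1B00", "P037D", "P0616", "P0562", "U0073")
m <- rbind(
  full, full,
  c("B1D01", "U0155", "C1B00", "U0073", NA, NA, NA, NA),
  full, full, full,
  c(full[-1], NA)
)
dimnames(m) <- list(1719:1725, paste0("V", 1:8))
df <- as.data.frame(m, stringsAsFactors = FALSE)

# Cross-tabulate code against the column it was found in.
# table() drops NA cells by default, so only real codes are counted.
m <- as.matrix(df)
counts <- table(code = as.vector(m), column = colnames(m)[col(m)])
print(counts)
```

A code that always sat in the same column would have a single non-zero entry in its row of `counts`; here U3003 has entries under both V2 and V3.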

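I would like to reorganize this data frame under the following conditions:

  • Each code should sit in a single column of its own
  • The name of each column should be the name of the code
  • If there are more than 8 distinct codes, the number of columns may grow to match the number of codes
  • Cell values may keep the name of the code
  • If a code is not present in a row, NA must be shown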

Note that my original data frame has many more rows than this small example, which was extracted from the original data.

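The best way I have found is to "massage" the data frame: pivot it to a longer form, then pivot it back to a wide form with one column per code: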

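library(tidyverse)

df %>% 
  rownames_to_column() %>%                            # keep the original row names as a key column
  pivot_longer(-rowname, values_drop_na = TRUE) %>%   # one row per (rowname, code), NAs dropped
  # id_cols is named explicitly; newer tidyr versions no longer accept it by position
  pivot_wider(id_cols = rowname, names_from = value, values_from = value)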
#> # A tibble: 7 x 9
#>   rowname B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#>   <chr>   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1719    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 2 1720    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 3 1721    B1D01 U0155 <NA>  C1B00 <NA>  <NA>  <NA>  U0073
#> 4 1722    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 5 1723    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 6 1724    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 7 1725    <NA>  U0155 U3003 C1B00 P037D P0616 P0562 U0073

Created on 2020-04-03 by the reprex package (v0.3.0)
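For readers who prefer to avoid the tidyverse dependency, the same reshaping can be sketched in base R. This is an alternative written for illustration, not part of the original answer; it assumes every cell is either a code string or NA, and the names `full`, `m`, `codes`, `cols`, and `wide` are hypothetical.

```r
# Rebuild the example data from the question (same values, written row by row).
full <- c("B1D01", "U0155", "U3003", "C1B00", "P037D", "P0616", "P0562", "U0073")
m <- rbind(
  full, full,
  c("B1D01", "U0155", "C1B00", "U0073", NA, NA, NA, NA),
  full, full, full,
  c(full[-1], NA)
)
dimnames(m) <- list(1719:1725, paste0("V", 1:8))
df <- as.data.frame(m, stringsAsFactors = FALSE)

m <- as.matrix(df)

# Distinct codes in order of first appearance, reading row by row.
codes <- unique(as.vector(t(m)))
codes <- codes[!is.na(codes)]

# For each code, build one column: the code itself where the row
# contains it, NA otherwise.
cols <- lapply(codes, function(code) {
  has_code <- rowSums(m == code, na.rm = TRUE) > 0
  ifelse(has_code, code, NA_character_)
})
names(cols) <- codes

wide <- data.frame(cols, row.names = rownames(df),
                   check.names = FALSE, stringsAsFactors = FALSE)
print(wide)
```

Unlike the pivot approach, this keeps the original row names as row names rather than moving them into a `rowname` column, and the number of columns follows the number of distinct codes automatically.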
