Replace integers with strings in a data frame column that is a list of integer vectors (not just single integers) in R



I have a data frame in which one column is actually a list of integer vectors (not just single integers).

# make example dataframe
starting_dataframe <- 
  data.frame(first_names = c("Megan", 
                             "Abby", 
                             "Alyssa", 
                             "Alex", 
                             "Heather"))
starting_dataframe$player_indices <- 
  list(as.integer(1), 
       as.integer(c(2, 5)), 
       as.integer(3), 
       as.integer(4), 
       as.integer(c(6, 7)))

I want to replace the integers with strings, based on a second data frame that serves as an index.

# make concordance dataframe
example_concord <- 
  data.frame(last_names = c("Rapinoe", 
                            "Wambach", 
                            "Naeher", 
                            "Morgan", 
                            "Dahlkemper", 
                            "Mitts", 
                            "O'Reilly"), 
              player_ids = as.integer(c(1,2,3,4,5,6,7)))

The desired result looks like this:

# make dataframe of desired result
desired_result <- 
  data.frame(first_names = c("Megan", 
                             "Abby", 
                             "Alyssa", 
                             "Alex", 
                             "Heather"))
desired_result$player_indices <- 
  list(c("Rapinoe"), 
       c("Wambach", "Dahlkemper"), 
       c("Naeher"), 
       c("Morgan"), 
       c("Mitts", "O'Reilly"))

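I can't for the life of me figure out how to do this, and I haven't been able to find a similar case on Stack Overflow. How can I do it? I don't mind a dplyr-specific solution.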

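I suggest creating a "lookup dictionary" (a named vector) and using lapply to index into it with each element of the list column: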

example_concord_idx <- setNames(as.character(example_concord$last_names),
                                example_concord$player_ids)
example_concord_idx
#            1            2            3            4            5            6 
#    "Rapinoe"    "Wambach"     "Naeher"     "Morgan" "Dahlkemper"      "Mitts" 
#            7 
#   "O'Reilly" 
starting_dataframe$result <- 
  lapply(starting_dataframe$player_indices,
         function(a) example_concord_idx[a])
starting_dataframe
#   first_names player_indices              result
# 1       Megan              1             Rapinoe
# 2        Abby           2, 5 Wambach, Dahlkemper
# 3      Alyssa              3              Naeher
# 4        Alex              4              Morgan
# 5     Heather           6, 7     Mitts, O'Reilly
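
Note that `example_concord_idx[a]` indexes with integers, so it selects by position. That gives the right answer here because `player_ids` happens to be 1 through 7 in order. As a minimal sketch, assuming a hypothetical concordance whose IDs are neither contiguous nor sorted (the `concord` and `players` objects below are made up for illustration), indexing by name with `as.character()` keeps the lookup correct:

```r
# Hypothetical concordance: IDs are not 1..n and not in order
concord <- data.frame(
  last_names = c("Rapinoe", "Wambach", "Dahlkemper"),
  player_ids = c(15L, 20L, 4L),
  stringsAsFactors = FALSE
)

# Hypothetical data frame with a list column of IDs
players <- data.frame(first_names = c("Megan", "Abby"),
                      stringsAsFactors = FALSE)
players$player_indices <- list(15L, c(20L, 4L))

# Name the lookup vector by ID (names are stored as character)
lookup <- setNames(concord$last_names, concord$player_ids)

# Index by name rather than by position; unname() drops the ID labels
players$result <- lapply(players$player_indices,
                         function(ids) unname(lookup[as.character(ids)]))
```

With positional indexing, `lookup[15L]` would return `NA` in this example, because the vector has only three elements.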

(Code golf?)

Map(`[`, list(example_concord_idx), starting_dataframe$player_indices)
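
The one-liner works because `Map()` recycles the length-one `list(example_concord_idx)` against every element of the list column and calls `` `[` `` on each pair. A small sketch with a made-up lookup vector (`lookup` and `idx` are illustrative names only) shows that it should be equivalent to the `lapply()` form:

```r
# Toy lookup vector and list of index vectors (illustrative values only)
lookup <- c(a = "x", b = "y", c = "z")
idx <- list(1L, c(2L, 3L))

# Map() recycles list(lookup) so each call is `[`(lookup, idx[[i]])
via_map <- Map(`[`, list(lookup), idx)

# The same operation spelled out with lapply()
via_lapply <- lapply(idx, function(i) lookup[i])
```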

For tidyverse fans, I adapted the second half of r2evans's accepted answer to use map() and %>%:

require(tidyverse)
starting_dataframe <- 
  starting_dataframe %>% 
  mutate(
    result = map(.x = player_indices, .f = function(a) example_concord_idx[a])
  )

It definitely won't win at code golf, though!

Another approach is to unlist the list column and then relist it after modifying its contents:

df1$player_indices <- relist(df2$last_names[unlist(df1$player_indices)], df1$player_indices)
df1
#>   first_names      player_indices
#> 1       Megan             Rapinoe
#> 2        Abby Wambach, Dahlkemper
#> 3      Alyssa              Naeher
#> 4        Alex              Morgan
#> 5     Heather     Mitts, O'Reilly
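
To make the unlist/relist round trip easier to follow, here is a step-by-step sketch on a small made-up list (`ids` and `last_names` are illustrative, not the data used above). `unlist()` flattens the list column into one vector, the replacement happens on that flat vector, and `relist()` uses the original list as a skeleton to restore the grouping:

```r
# Illustrative list column and lookup vector
ids <- list(1L, c(2L, 5L), 3L)
last_names <- c("Rapinoe", "Wambach", "Naeher", "Morgan", "Dahlkemper")

# Step 1: flatten the list into a single integer vector
flat <- unlist(ids)

# Step 2: replace every ID in one vectorised lookup
replaced <- last_names[flat]

# Step 3: restore the original grouping, using `ids` as the skeleton
result <- relist(replaced, skeleton = ids)
```

Like the one-liner above, this looks names up by position, so it assumes the row order of the lookup data frame matches the IDs.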

Data

## initial data.frame w/ list-column
df1 <- data.frame(first_names = c("Megan", "Abby", "Alyssa", "Alex", "Heather"), stringsAsFactors = FALSE)
df1$player_indices <- list(1, c(2,5), 3, 4, c(6,7))
## lookup data.frame
df2 <- data.frame(last_names = c("Rapinoe", "Wambach", "Naeher", "Morgan", "Dahlkemper", 
        "Mitts", "O'Reilly"), stringsAsFactors = FALSE)

Note: I set stringsAsFactors = FALSE so that the data.frame columns are created as character columns, but this approach also works with factor columns.
