R - Indicate which corresponding columns have a TRUE indicator



I have the following dataset:

df<-data.frame(
identifer=c(1,2,3,4),
DF=c("Tablet","Powder","Suspension","System"),
DF_source1=c("Capsule","Powder,Metered","Tablet",NA),
DF_source2=c(NA,NA,"Tablet",NA),
DF_source3=c("Tablet, Extended Release","Liquid","Tablet",NA),
Route_source1=c("Oral","INHALATION","Oral",NA),
Route_source2=c(NA,"TOPICAL","Oral",NA),
Route_source3=c("Oral","IRRIGATION","oral",NA))

I want to know which DF_source columns match DF, and which associated route I should select.

I would like the output to look like this:

df_out<-data.frame(
identifer=c(1,2,3,4),
DF=c("Tablet","Powder","Suspension","System"),
DF_match=c("Tablet, Extended Release","Powder,Metered;Powder",NA,NA),
Route_match=c("Oral","INHALATION;TOPICAL",NA,NA),
DF_match_count=c(1,2,0,0),
DF_route_count=c(1,2,0,0))

I tried this, but I'm not sure how to extract the values for DF_match and Route_match:

df %>% mutate_at(vars(matches("(DF_source)")),
  list(string_detect = ~str_detect(tolower(DF), tolower(str_replace_all(., "/|,(\\s)?|(?<!,)\\s", "|")))))

Any help would be greatly appreciated, thanks!

I'm not entirely sure this is what you had in mind, but hopefully it helps.

Your final result doesn't seem to match your sample data (for example, TOPICAL is missing).

This might be easier with the data in a tidier form, using pivot_longer.
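
As a rough illustration of that reshaping step, the sketch below runs pivot_longer on a small made-up data frame. The name `toy` and its values are assumptions for this example only. The special ".value" entry in names_to keeps the part of each column name before the trailing digit as the output column name, so each DF_source/Route_source pair ends up on the same row.

```r
library(tidyr)

# Hypothetical two-row toy data, used only for this illustration
toy <- data.frame(
  id = c(1, 2),
  DF_source1 = c("Capsule", "Powder"),
  DF_source2 = c("Tablet", NA),
  Route_source1 = c("Oral", "INHALATION"),
  Route_source2 = c("Oral", "TOPICAL"),
  stringsAsFactors = FALSE
)

# Split each name into a stem (".value") and a trailing digit ("number")
long <- pivot_longer(
  toy,
  cols = -id,
  names_to = c(".value", "number"),
  names_pattern = "(\\w+)(\\d+)"
)

# One row per id and source number, with DF_source and Route_source columns
long
```

Rows whose source is NA are kept by default, and a later filter step can drop them.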

Edit: If the columns are factors, convert them to character before using filter and str_detect.

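library(tidyverse)
library(stringr)
df %>%
  # Convert any factor columns to character before string matching
  mutate_if(is.factor, as.character) %>%
  # One row per identifer and source number, with DF_source and Route_source columns
  pivot_longer(cols = -c(identifer, DF), names_to = c(".value", "number"), names_pattern = "(\\w+)(\\d+)") %>%
  # Keep only the sources whose dosage form contains DF
  filter(str_detect(DF_source, DF)) %>%
  group_by(identifer) %>%
  summarise(DF_match = paste(DF_source, collapse = ';'),
            Route_match = paste(Route_source, collapse = ';'),
            match_count = n()) %>%
  # Bring back the identifers that had no match at all
  right_join(df[,c("identifer", "DF")], by = "identifer") %>%
  select(c(identifer, DF, DF_match, Route_match, match_count))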
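
As a side note on the factor conversion: mutate_if still works, but in dplyr 1.0 and later the scoped verbs are superseded by across(). A minimal sketch of the equivalent call follows, using a made-up data frame (`toy` is a hypothetical name, not part of the question's data). Factor columns typically appear when the data were built with data.frame() before R 4.0.0, where stringsAsFactors defaulted to TRUE.

```r
library(dplyr)

# Hypothetical toy frame with factor columns, for illustration only
toy <- data.frame(
  DF = factor(c("Tablet", "Powder")),
  DF_source = factor(c("Tablet, Extended Release", "Liquid")),
  n = c(1, 2)
)

# across() + where() converts only the factor columns; numeric columns are untouched
converted <- toy %>%
  mutate(across(where(is.factor), as.character))

sapply(converted, class)
```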

Output

# A tibble: 4 x 5
  identifer DF         DF_match                 Route_match        match_count
      <dbl> <chr>      <chr>                    <chr>                    <int>
1         1 Tablet     Tablet, Extended Release Oral                         1
2         2 Powder     Powder,Metered;Powder    INHALATION;TOPICAL           2
3         3 Suspension NA                       NA                          NA
4         4 System     NA                       NA                          NA
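
The desired output in the question has a count of 0 rather than NA for rows without a match. One possible way to get there is to replace the missing counts after the join with coalesce(). The sketch below assumes a result shaped like the output above. The names `res` and `res_zero` are hypothetical, and the values are copied from that output rather than recomputed.

```r
library(dplyr)

# Result-shaped data copied from the printed output (not recomputed here)
res <- data.frame(
  identifer = c(1, 2, 3, 4),
  DF = c("Tablet", "Powder", "Suspension", "System"),
  DF_match = c("Tablet, Extended Release", "Powder,Metered;Powder", NA, NA),
  Route_match = c("Oral", "INHALATION;TOPICAL", NA, NA),
  match_count = c(1L, 2L, NA, NA),
  stringsAsFactors = FALSE
)

# Replace missing counts with 0; the match columns stay NA for unmatched rows
res_zero <- res %>%
  mutate(match_count = coalesce(match_count, 0L))

res_zero
```

In the full pipeline, the same mutate() call could be appended after the final select().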
