I have been able to merge data frames like this:
df1 <- read.table(text="
col1 col2 colx
A 5 hh
B 3 jj
C 6 kk
E 7 mm", header=TRUE, stringsAsFactors=FALSE)
df2 <- read.table(text="
col3 col4 coly
A 5 be
B 3 to
C 6 go
E 7 yo
", header=T, stringsAsFactors=FALSE)
library(dplyr)
full_join(df1, df2, by = c('col1' = 'col3', 'col2' = 'col4'))
This gives me:
col1 col2 colx coly
1 A 5 hh be
2 B 3 jj to
3 C 6 kk go
4 E 7 mm yo
But now I need to merge df1 and df3, matching on something like "A" %in% "A|B":
df3 <- read.table(text="
col3 col4 coly
'A | B' 5 be
'B | C' 3 to
C 6 go
E 7 yo
", header=T, stringsAsFactors=FALSE)
Is this possible?
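We can use regex_full_join after removing the spaces before and after the |, so that each col3 value becomes a regex alternation such as A|B.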
library(dplyr)
library(fuzzyjoin)
library(stringr)
df3 %>%
mutate(col3 = str_remove_all(col3, "\\s+")) %>%  # "A | B" -> "A|B"; the backslash must be doubled in an R string
regex_full_join(df1, ., by = c('col1' = 'col3', 'col2' = 'col4'))
-output
# col1 col2 colx col3 col4 coly
#1 A 5 hh A|B 5 be
#2 B 3 jj B|C 3 to
#3 C 6 kk C 6 go
#4 E 7 mm E 7 yo
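
One caveat, offered as a sketch rather than a definitive statement about fuzzyjoin: if the join tests each pattern with an unanchored regex match (as str_detect does), then "A|B" would also match a key such as "AB". The base R snippet below illustrates the difference between the cleaned patterns and anchored versions of them. The vectors patterns and keys are illustrative copies of df3$col3 and df1$col1, and the extra key "AB" is a made-up value added only to show the effect.

```r
# Base R only; 'patterns' and 'keys' mirror df3$col3 and df1$col1 from above.
patterns <- c("A | B", "B | C", "C", "E")
keys <- c("A", "B", "C", "E", "AB")  # "AB" is a hypothetical extra key

# Remove whitespace so "A | B" becomes the regex alternation "A|B".
cleaned <- gsub("\\s+", "", patterns)

# Unanchored: a pattern matches any key that merely contains an alternative.
loose <- sapply(cleaned, function(p) grepl(p, keys))

# Anchored: wrap in ^(...)$ so the whole key must equal one alternative.
anchored <- paste0("^(", cleaned, ")$")
strict <- sapply(anchored, function(p) grepl(p, keys))

# Rows are keys, columns are patterns.
rownames(loose) <- rownames(strict) <- keys
print(loose)
print(strict)
```

If partial matches are a concern in the real data, the same anchoring could be applied to col3 inside the mutate() step before the join, for example with paste0("^(", col3, ")$").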