How to bind data.frames in R using the ids in a specific column



I have two data frames:

df1

    Taxa        Env  Correlation
1  C1161         pH  -0.209916044
2  C1161         pH   0.101338976
3  C1161       Temp  -0.228451375
4  C1161       Temp  -0.218456646
5  C1161         TS   0.255112839
6   C26         NH4   0.379192859
7   C26        Prot   0.327016026
8   C26        Prot   0.602990615
9   C26       Carbo  -0.102919129
10  C26       Carbo   0.481216962
11 C1815         pH  -0.403348271
12 C1815         pH   0.126527189
13 C1815       Temp  -0.125038666
14 C1815       Temp  -0.343674237

df2

        Domain                Phylum
C1161 Bacteria        Actinobacteria
C1714 Bacteria        Actinobacteria
C26   Bacteria         Bacteroidetes
C895  Bacteria            Firmicutes
C1020 Bacteria            Firmicutes
C1815 Bacteria unclassified_Bacteria
C26   Bacteria            Firmicutes
C1620 Bacteria            Firmicutes
C822  Bacteria            Firmicutes

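I want to combine the two data frames by matching the ids in the Taxa column of df1 against the row names of df2.

My problem is that I cannot use row names for df1, because an id in the Taxa column can appear more than twice.

I would like a result like this: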

    Taxa        Env  Correlation    Domain          Phylum
1  C1161         pH  -0.209916044   Bacteria        Actinobacteria
2  C1161         pH   0.101338976   Bacteria        Actinobacteria
3  C1161       Temp  -0.228451375   Bacteria        Actinobacteria
4  C1161       Temp  -0.218456646   Bacteria        Actinobacteria
5  C1161         TS   0.255112839   Bacteria        Actinobacteria
6   C26         NH4   0.379192859   Bacteria            Firmicutes
7   C26        Prot   0.327016026   Bacteria            Firmicutes
8   C26        Prot   0.602990615   Bacteria            Firmicutes
9   C26       Carbo  -0.102919129   Bacteria            Firmicutes
10  C26       Carbo   0.481216962   Bacteria            Firmicutes
11 C1815         pH  -0.403348271   Bacteria unclassified_Bacteria
12 C1815         pH   0.126527189   Bacteria unclassified_Bacteria
13 C1815       Temp  -0.125038666   Bacteria unclassified_Bacteria
14 C1815       Temp  -0.343674237   Bacteria unclassified_Bacteria

I tried:

cbind(df1$Taxa, df2)
merge(rownames(df2), df1, by = "Taxa")

Thanks

With tibble and dplyr:

library(tibble)
library(dplyr)
left_join(df1, rownames_to_column(df2, "Taxa"), by = "Taxa")

In base R:

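# Copy the row names of df2 into a regular column so they can serve as the join key
df2$Taxa <- rownames(df2)
# all.x = TRUE keeps every row of df1 (a left join)
merge(df1, df2, by = "Taxa", all.x = TRUE)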
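
Note that merge() sorts its result by the key column by default, so the rows may come back in a different order than in df1. If the original order matters, one alternative is a plain lookup with match(). The following is only a minimal sketch on toy data (a small, hand-made subset of the tables in the question, with the names df1, df2, idx and result chosen for illustration), not the only way to do it:

```r
# Toy data: a small subset of the tables from the question (illustrative only)
df1 <- data.frame(
  Taxa = c("C1161", "C26", "C26", "C1815"),
  Env = c("pH", "NH4", "Prot", "Temp"),
  Correlation = c(-0.209916044, 0.379192859, 0.327016026, -0.125038666),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Domain = c("Bacteria", "Bacteria", "Bacteria"),
  Phylum = c("Actinobacteria", "unclassified_Bacteria", "Firmicutes"),
  row.names = c("C1161", "C1815", "C26"),
  stringsAsFactors = FALSE
)

# For each Taxa id, find the position of the matching row name in df2
# (ids with no match give NA, which produces a row of NAs below)
idx <- match(df1$Taxa, rownames(df2))

# Pick the matching df2 rows in df1's order and bind them column-wise
result <- cbind(df1, df2[idx, , drop = FALSE])
rownames(result) <- NULL
print(result)
```

Because the lookup is done row by row, repeated ids in Taxa are not a problem and the result has exactly as many rows as df1.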

Note: I removed the first instance of C26 from df2 to match the output you provided.

    Taxa   Env Correlation   Domain                Phylum
1  C1161    pH  -0.2099160 Bacteria        Actinobacteria
2  C1161    pH   0.1013390 Bacteria        Actinobacteria
3  C1161  Temp  -0.2284514 Bacteria        Actinobacteria
4  C1161  Temp  -0.2184566 Bacteria        Actinobacteria
5  C1161    TS   0.2551128 Bacteria        Actinobacteria
6    C26   NH4   0.3791929 Bacteria            Firmicutes
7    C26  Prot   0.3270160 Bacteria            Firmicutes
8    C26  Prot   0.6029906 Bacteria            Firmicutes
9    C26 Carbo  -0.1029191 Bacteria            Firmicutes
10   C26 Carbo   0.4812170 Bacteria            Firmicutes
11 C1815    pH  -0.4033483 Bacteria unclassified_Bacteria
12 C1815    pH   0.1265272 Bacteria unclassified_Bacteria
13 C1815  Temp  -0.1250387 Bacteria unclassified_Bacteria
14 C1815  Temp  -0.3436742 Bacteria unclassified_Bacteria
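
On the duplicated C26 entry: a data frame in R cannot have two rows with the same row name, so the ids would have to live in an ordinary column for the duplicate to exist at all. Assuming that is the case, the removal mentioned in the note above could be sketched as follows. The table lookup and its Taxa column are hypothetical stand-ins for df2, and keeping the last occurrence is an assumption chosen to match the desired output (Firmicutes for C26):

```r
# Hypothetical lookup table with the ids stored in a column instead of row names
lookup <- data.frame(
  Taxa = c("C1161", "C26", "C1815", "C26"),
  Phylum = c("Actinobacteria", "Bacteroidetes", "unclassified_Bacteria", "Firmicutes"),
  stringsAsFactors = FALSE
)

# duplicated(..., fromLast = TRUE) flags every occurrence of an id except the last,
# so negating it keeps only the last row for each id
lookup_unique <- lookup[!duplicated(lookup$Taxa, fromLast = TRUE), ]
print(lookup_unique)
```

With unique ids in the lookup table, a left join returns exactly one output row per row of df1 instead of duplicating the C26 rows.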
