rbind two data tables and create a new column in R



I have two datasets representing different groups:

student_details <- c("John", "Henrick", "Maria", "Lucas", "Ali")
student_class <- c("High School", "College", "Preschool", "High School", "college")
df1 <- data.frame(student_details, student_class)

# another data frame

Student_details<-c("Bracy","Evin")
Student_class<-c("High school","College")
Student_rank<-c("A","A+")
df2<-data.frame(Student_class,Student_details,Student_rank)
df2

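I need to rbind df1 and df2 even though their lengths are unequal (df2 has an extra column), and create a third column at the end called "dataset" indicating which dataset each row came from: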

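You can use the rbindlist() function from the data.table package to do this.

The column names in the two data frames must be identical, because you want to bind by column name.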

#convert uppercase letters in column names to lower case. 
names(df2) <- tolower(names(df2))

Next, bind them together:

library(data.table)
final_df <- rbindlist(list(df1, df2), use.names = TRUE, fill = TRUE, idcol = "dataset")
final_df 

Output:

   dataset student_details student_class student_rank
1:       1            John   High School         <NA>
2:       1         Henrick       College         <NA>
3:       1           Maria     Preschool         <NA>
4:       1           Lucas   High School         <NA>
5:       1             Ali       college         <NA>
6:       2           Bracy   High school            A
7:       2            Evin       College           A+
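Note that idcol puts the "dataset" column first and numbers the inputs 1 and 2. If you want the column at the end, as the question asks, and labelled by name, the following is a minimal sketch. It assumes the same df1 and df2 as above; the labels "df1"/"df2" and the name labelled_df are only illustrative.

```r
library(data.table)

df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class = c("High School", "College", "Preschool", "High School", "college")
)
df2 <- data.frame(
  student_class = c("High school", "College"),
  student_details = c("Bracy", "Evin"),
  student_rank = c("A", "A+")
)

# Naming the list elements makes idcol hold "df1"/"df2" instead of 1/2
labelled_df <- rbindlist(list(df1 = df1, df2 = df2),
                         use.names = TRUE, fill = TRUE, idcol = "dataset")

# Move the id column from the front to the end (reorders by reference)
setcolorder(labelled_df, c(setdiff(names(labelled_df), "dataset"), "dataset"))
labelled_df
```

setcolorder() changes the column order in place, so no reassignment is needed.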

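I assume the column names student_details and student_class are the same in both data frames. You can use bind_rows, which is more flexible than rbind. It fills the missing column with NA values.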

student_details <- c("John", "Henrick", "Maria", "Lucas", "Ali")
student_class <- c("High School", "College", "Preschool", "High School", "college")
df1 <- data.frame(student_details, student_class)

student_details<-c("Bracy","Evin")
student_class<-c("High school","College")
student_rank<-c("A","A+")
df2<-data.frame(student_details,student_class,student_rank)
library(dplyr)
df_full<-bind_rows(df1,df2)
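The call above does not yet add the "dataset" column the question asks for. A possible way to get it with dplyr is the .id argument of bind_rows, sketched below. The labels "df1"/"df2" are illustrative, and relocate() assumes dplyr 1.0.0 or later.

```r
library(dplyr)

df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class = c("High School", "College", "Preschool", "High School", "college")
)
df2 <- data.frame(
  student_details = c("Bracy", "Evin"),
  student_class = c("High school", "College"),
  student_rank = c("A", "A+")
)

# .id adds a column holding the name of the list element each row came from
df_full <- bind_rows(list(df1 = df1, df2 = df2), .id = "dataset")

# .id puts the new column first; move it to the end
df_full <- relocate(df_full, dataset, .after = last_col())
df_full
```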

For these particular df1 and df2, we can try merge from base R

> merge(df1, df2, all = TRUE, sort = FALSE)
  student_details student_class student_rank
1            John   High School         <NA>
2         Henrick       College         <NA>
3           Maria     Preschool         <NA>
4           Lucas   High School         <NA>
5             Ali       college         <NA>
6           Bracy   High school            A
7            Evin       College           A+
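merge with all = TRUE is a full join on the shared columns, so it only behaves like a row bind here because no row of df1 matches a row of df2; matching rows would be combined into one. It also does not add the "dataset" column. If you want to stay in base R, one possible sketch is to pad each data frame with the columns it lacks and then call rbind. The helper name add_missing_cols and the labels "df1"/"df2" are hypothetical, chosen only for this example.

```r
df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class = c("High School", "College", "Preschool", "High School", "college")
)
df2 <- data.frame(
  student_class = c("High school", "College"),
  student_details = c("Bracy", "Evin"),
  student_rank = c("A", "A+")
)

# Hypothetical helper: add any column in `cols` that `df` lacks, filled with NA,
# then return the columns in the order given by `cols`
add_missing_cols <- function(df, cols) {
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- NA
  }
  df[cols]
}

all_cols <- union(names(df1), names(df2))
combined <- rbind(add_missing_cols(df1, all_cols),
                  add_missing_cols(df2, all_cols))

# Record the source of each row in a final column
combined$dataset <- rep(c("df1", "df2"), times = c(nrow(df1), nrow(df2)))
combined
```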

But the data.table option using rbindlist should work in the general case (see @Flap's answer).
