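I have a data frame with 3686 rows and 34 columns. When I save this data.frame with write.csv2(data, file = "folder/data.csv2") and load it back into R with read.csv2("folder/data.csv2"), it has the same number of rows (3686). However, when I ask for the number of species (a factor) with unique(data$Species), the data frame in the Environment has 708 levels, while the one I imported has only 554 levels.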
str(imported_dataframe$Species)
Output: Factor w/ 554 levels
str(Data_in_Environment$Species)
Output: Factor w/ 708 levels
Can anyone help me?
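The levels attribute is lost when writing to CSV: the file stores only the values that actually occur in the rows, so levels without any rows are not written. You can export the levels separately and set them on the data.frame after import.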
# Species is a factor with three levels
all_levels <- levels(iris$Species)
all_levels
# [1] "setosa" "versicolor" "virginica"
# export table where not all levels are present
write.csv2(head(iris), file = "iris_tmp.csv", row.names = FALSE)
# also export complete list of levels
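# one level per line, so level names containing spaces survive the round trip
writeLines(all_levels, "iris_levels_tmp.txt")
# import both levels and data
all_levs <- readLines("iris_levels_tmp.txt")
# stringsAsFactors = TRUE is needed in R >= 4.0.0, where the default is FALSE
iris6 <- read.csv2("iris_tmp.csv", stringsAsFactors = TRUE)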
# unrepresented levels have been lost
levels(iris6$Species)
# [1] "setosa"
# define Species as factor with all levels
iris6$Species <- factor(iris6$Species, levels = all_levs)
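To check whether the gap between the two level counts comes only from unused levels, it may help to compare the levels attribute with the values actually present in the rows. The following is a minimal sketch on a made-up data frame (df, df_in, and the temporary file are hypothetical names, not from the question):

```r
# Hypothetical example data: level "c" is declared but never occurs in the rows
df <- data.frame(
  id = 1:4,
  Species = factor(c("a", "b", "a", "b"), levels = c("a", "b", "c"))
)

nlevels(df$Species)                       # levels stored in the attribute: 3
length(unique(as.character(df$Species)))  # values actually present: 2

# Round trip through a semicolon-separated CSV in a temporary file
tmp <- tempfile(fileext = ".csv")
write.csv2(df, file = tmp, row.names = FALSE)
df_in <- read.csv2(tmp, stringsAsFactors = TRUE)

nlevels(df_in$Species)                    # only the present values remain: 2

# The re-imported levels match the original levels after dropping unused ones
identical(levels(droplevels(df$Species)), levels(df_in$Species))  # TRUE
```

If the same comparison on the real data gives 554 present values against 708 levels, the rows themselves survived the round trip and only the unused levels are missing.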
Or you can use save/load to export R data objects.
iris5 <- head(iris, n = 5)
save("iris5", file = "iris5.rda")
# load back iris5
load(file = "iris5.rda")
levels(iris5$Species)
# [1] "setosa" "versicolor" "virginica"
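A closely related option in base R is saveRDS/readRDS, which stores a single object and lets you choose the name it is assigned to on reading. This is a sketch of that variant rather than part of the answer above; iris5_restored and the temporary file are hypothetical names:

```r
# Same five rows as above; Species still carries all three levels
iris5 <- head(iris, n = 5)

# Write one object to a temporary .rds file
rds_file <- tempfile(fileext = ".rds")
saveRDS(iris5, file = rds_file)

# readRDS returns the object, so it can be assigned to any name
iris5_restored <- readRDS(rds_file)

levels(iris5_restored$Species)
# [1] "setosa"     "versicolor" "virginica"
```

Unlike load, readRDS does not overwrite an existing object of the same name in the workspace, which can be convenient when comparing the stored and the current version of a data frame.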
Alternatively, you can use the csvy library and export a CSV file with a YAML header that contains the factor levels:
# library load
library(csvy)
library(dplyr)
# relevel factors: make "virginica" the first (reference) level
iris_releveled = iris %>% mutate(Species = relevel(Species, ref = "virginica"))
# write csv file
write.csv2(iris_releveled,"iris_releveled.csv")
# load exported dataset
iris_relevel_loaded = read.csv2("iris_releveled.csv", stringsAsFactors = TRUE)
# now the custom level order is lost (levels come back in alphabetical order)
iris_relevel_loaded$Species %>% levels()
# write CSVy file from dataset with releveled factors
write_csvy(iris_releveled, file = "iris_releveled.csvy")
# read CSVy file with original factor levels
iris_relevel_loaded = read_csvy("iris_releveled.csvy", stringsAsFactors = TRUE)
# now factor levels are kept
iris_relevel_loaded$Species %>% levels()