How to divide multiple columns by a single column in an R data frame

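I have been searching for an answer, but have not yet come up with a solution.

I am trying to divide many (~60) columns of my data frame (species counts) by a single column in the same data frame (units of sampling effort).

I was able to come up with the solution below, but it is messier than I would like. As written, I could accidentally run the last line of code twice and corrupt my values by dividing twice.

Below is a short example demonstrating the solution I am using. Any suggestions for something cleaner?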

#short data.frame with some count data
#Hours is the sampling effort

counts=data.frame(sp1=sample(1:10,10),sp2=sample(1:10,10),
         sp3=sample(1:10,10),sp4=sample(1:10,10),
         Hours=rnorm(10,4,1))

#get my 'species' names
names=colnames(counts)[1:4]
#This seems messy: and if I run the second line twice, I will screw up my values. I want to divide all 'sp' columns by the single 'Hours' column
rates=counts
rates[names]=rates[,names]/rates[,'Hours']

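N.B.: I have been using %>%, so if anyone has a solution where I can just transform the 'counts' data frame without creating a new data frame, that would be swell!

N.B. 2: I suspect one of Hadley's functions may have what I need (e.g. mutate_each?), but I have not been able to figure it out.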

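I really don't see what's wrong with your base R approach; it's very clean. If you are worried about accidentally running the second line more than once without rerunning the first, just refer to the original counts columns on the right-hand side. I would make a couple of small tweaks, like this: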

rates = counts
rates[names] = counts[names] / counts[["Hours"]]

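Using [ and [[ guarantees the data types regardless of the length of names: [ with a vector of column names always returns a data frame, and [[ always returns a single column as a vector.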
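To make the point about [ and [[ concrete, here is a minimal sketch on a made-up data frame. The names toy, one_name, sp_names and toy_rates, and the fixed values, are assumptions for illustration and are not part of the question. It shows that single-bracket indexing by name keeps a data frame even for one column, and that the assignment can be rerun safely because its right-hand side only reads from the untouched original.

```r
# Hypothetical toy data with fixed values so the result is easy to check by hand
toy <- data.frame(sp1 = c(2, 4, 6), sp2 = c(1, 3, 5), Hours = c(2, 2, 4))

one_name <- "sp1"
is.data.frame(toy[one_name])     # `[` with a name keeps a data frame
is.data.frame(toy[, one_name])   # `[ , ]` drops a single column to a vector
is.numeric(toy[["Hours"]])       # `[[` returns the column itself as a vector

sp_names <- c("sp1", "sp2")
toy_rates <- toy
# The right-hand side reads only from `toy`, so repeating this line
# gives the same result instead of dividing twice
toy_rates[sp_names] <- toy[sp_names] / toy[["Hours"]]
toy_rates[sp_names] <- toy[sp_names] / toy[["Hours"]]
toy_rates
```

Dividing a data frame by a vector whose length equals the number of rows divides each column elementwise by that vector, which is what makes the one-line assignment work.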

I do like dplyr, but it seems messier for this:

# This works if you want everything except the Hours column
rates = counts %>% mutate_each(funs(./Hours), vars = -Hours)
# This sort of works if you want to use the names vector
rates = counts %>% mutate_at(funs(./Hours), .cols = names)
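For readers on a recent dplyr (1.0.0 or later), mutate_each() and funs() have since been deprecated, and the same idea is usually written with across(). The sketch below is not from the original answer; it assumes a made-up data frame (toy, sp_names, rates_all and rates_named are illustrative names) and may need adjusting for your dplyr version.

```r
library(dplyr)

# Hypothetical toy data with fixed values so the result is easy to check by hand
toy <- data.frame(sp1 = c(2, 4, 6), sp2 = c(1, 3, 5), Hours = c(2, 2, 4))
sp_names <- c("sp1", "sp2")

# Divide every column except Hours by Hours
rates_all <- toy %>% mutate(across(-Hours, ~ .x / Hours))

# Divide only the columns listed in a character vector
rates_named <- toy %>% mutate(across(all_of(sp_names), ~ .x / Hours))

rates_named
```

Because Hours is excluded from the selection in both calls, it is left unchanged and every species column is divided by the original effort values. Assigning the result back to the same name would overwrite the counts, so rerunning that pipeline would divide twice; writing to a new name, as here, avoids that.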
