This is a small sample of my dataset:
Genes Cell_Line Value
myc A1 233
myc A2 213
myc A3 541
erk A1 123
erk A2 245
erk A3 123
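I want to take the first column and turn its unique gene names into columns, with the corresponding values underneath.
Example of the desired output: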
Cell_Line myc erk
A1 233 123
A2 213 245
A3 541 123
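I have been trying to find my way around the dplyr package and its grouping functions, but I am not sure whether that approach will work.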
# Load the reshape2 package
library(reshape2)
# Load the dataset into a data frame
data <- data.frame(
  Genes = c("myc", "myc", "myc", "erk", "erk", "erk"),
  Cell_Line = c("A1", "A2", "A3", "A1", "A2", "A3"),
  Value = c(233, 213, 541, 123, 245, 123)
)
# Use the dcast() function to pivot the data frame and reshape it
new_data <- dcast(data, Cell_Line ~ Genes, value.var = "Value")
# Print the new dataset
print(new_data)
Result:
Cell_Line erk myc
1 A1 123 233
2 A2 245 213
3 A3 123 541
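Note that dcast() puts the gene columns in alphabetical order (erk before myc), while the desired output lists myc first. A minimal sketch of one way to restore the order in which the genes first appear, assuming the same sample data as above; the name `reordered` is only illustrative:

```r
library(reshape2)

# Same sample data as in the answer above
data <- data.frame(
  Genes = c("myc", "myc", "myc", "erk", "erk", "erk"),
  Cell_Line = c("A1", "A2", "A3", "A1", "A2", "A3"),
  Value = c(233, 213, 541, 123, 245, 123)
)

new_data <- dcast(data, Cell_Line ~ Genes, value.var = "Value")

# Gene names in order of first appearance; as.character() guards against
# Genes being stored as a factor (the default in R versions before 4.0)
gene_order <- as.character(unique(data$Genes))

# Reorder the columns: Cell_Line first, then the genes in their original order
reordered <- new_data[, c("Cell_Line", gene_order)]
print(reordered)
```

This only changes the column order; the values are the same as in the dcast() result.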
The tidyverse way:
# Use tidyr function to pivot the data frame
library(tidyr)
new_data <- data %>%
  pivot_wider(names_from = Genes, values_from = Value)
# Print the new dataset
print(new_data)
Cell_Line myc erk
<chr> <dbl> <dbl>
1 A1 233 123
2 A2 213 245
3 A3 541 123
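The sample has exactly one value per gene and cell line. If the full dataset contains repeated Genes/Cell_Line pairs, pivot_wider() would return list-columns with a warning unless it is told how to combine them. A minimal sketch, assuming that averaging the duplicates is acceptable; `dup_data` and its values are made up for illustration and are not from the original dataset:

```r
library(tidyr)

# Hypothetical data: the myc/A1 pair is measured twice, and erk/A1 is missing
dup_data <- data.frame(
  Genes = c("myc", "myc", "myc", "erk"),
  Cell_Line = c("A1", "A1", "A2", "A2"),
  Value = c(200, 300, 213, 245)
)

# values_fn collapses duplicate Genes/Cell_Line pairs into a single value;
# combinations that never occur are filled with NA
wide <- pivot_wider(
  dup_data,
  names_from = Genes,
  values_from = Value,
  values_fn = mean
)
print(wide)
```

With reshape2, the equivalent is the fun.aggregate argument of dcast(), for example fun.aggregate = mean.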