Convert the result (list) of the by() function to a data frame in R (edited)

I have data like this:

   is_severe encoding      sn_id
6          1        1 chr1 17689
7          0        2 chr1 17689
8          1        1 chr1 17689
9          1        2 chr1 69511
10         1        2 chr1 69511
11         1        1 chr1 69511
12         0        1 chr1 69511

I run a statistical test on each "group", where the groups are defined by the values of the sn_id column.

This is the function for the statistical test:

catt <- function(y, x, score = c(0, 1, 2)) {
  # Drop observations where either y or x is missing
  miss <- unique(c(which(is.na(y)), which(is.na(x))))
  n.miss <- length(miss)
  if (n.miss > 0) {
    y <- y[-miss]
    x <- x[-miss]
  }
  if (!all((y == 0) | (y == 1)))
    stop("y should be only 0 or 1.")
  if (!all((x == 0) | (x == 1) | (x == 2)))
    stop("x should be only 0, 1 or 2.")
  ca <- x[y == 1]  # x values where y == 1
  co <- x[y == 0]  # x values where y == 0
  htca <- table(ca)
  htco <- table(co)
  # 2 x 3 table of counts: row 1 holds the y == 1 group, row 2 the y == 0 group
  A <- matrix(0, 2, 3)
  colnames(A) <- c(0, 1, 2)
  rownames(A) <- c(0, 1)
  A[1, names(htca)] <- htca
  A[2, names(htco)] <- htco
  ptt <- prop.trend.test(A[1, ], colSums(A), score = score)
  p.value <- as.numeric(ptt$p.value)
  res <- p.value
  return(res)
}

I use the by function to run it on each snp_id group:

send=by(merged_df_normal,merged_df_normal$snp_id, function (merged_df_normal) {catt(merged_df_normal$is_sever_int,merged_df_normal$encoding)})

and get results like these:

merged_df_normal$snp_id: chr11441806
[1] 0.6274769
--------------------------------------------------------------------- 
merged_df_normal$snp_id: chr1144192891
[1] NA

I want to convert this to a data frame that looks like this:

snp_id                     pvalue
chr11441806                  0.6274769
chr1144192891                 NA

I tried this:

do.call(rbind, list(send))

It returned a matrix that looks like this:

chr11441806      chr1144192891         
0.6274769          NA

After accepting the answer, I had to edit the function:

catt_2 <- function(y, x, score = c(0, 1, 2)) {
  # Drop observations where either y or x is missing
  miss <- unique(c(which(is.na(y)), which(is.na(x))))
  n.miss <- length(miss)
  if (n.miss > 0) {
    y <- y[-miss]
    x <- x[-miss]
  }
  if (!all((y == 0) | (y == 1)))
    stop("y should be only 0 or 1.")
  if (!all((x == 0) | (x == 1) | (x == 2)))
    stop("x should be only 0, 1 or 2.")
  ca <- x[y == 1]  # x values where y == 1
  co <- x[y == 0]  # x values where y == 0
  htca <- table(ca)
  htco <- table(co)
  # 2 x 3 table of counts: row 1 holds the y == 1 group, row 2 the y == 0 group
  A <- matrix(0, 2, 3)
  colnames(A) <- c(0, 1, 2)
  rownames(A) <- c(0, 1)
  A[1, names(htca)] <- htca
  A[2, names(htco)] <- htco
  ptt <- prop.trend.test(A[1, ], colSums(A), score = score)
  # Return both the test statistic and the p-value
  res <- list(
    chisq = as.numeric(ptt$statistic),
    p.value = as.numeric(ptt$p.value)
  )
  return(res)
}

Now the result is:

send=by(merged_df_normal,merged_df_normal$snp_id, function (merged_df_normal) {catt_2(merged_df_normal$is_sever,merged_df_normal$encoding)})

merged_df_normal$snp_id: chr11007252
$chisq
[1] NA
$p.value
[1] NA
------------------------------------------------------------------------ 
merged_df_normal$snp_id: chr1100731820
$chisq
[1] 0.9111779
$p.value
[1] 0.3398021

I would like it to be:

snp_id                     pvalue                chisq         
chr11441806                  0.6274769       0.9111779
chr1144192891                 NA              NA

The answer:

library(data.table)
setDT(merged_df_normal)
merged_df_normal[,.(p.value=catt(is_sever,encoding)),snp_id]

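This works very well when I only need the p-value, but is there a way to edit the answer above so that it also adds a new chisq column? Thank you for your help with the previous answer.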

I believe you can apply catt() to each sn_id group. Assuming your original data is called df, you can do the following:

library(data.table)
setDT(df)
df[,.(p.value=catt(is_severe,encoding)),sn_id]
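If you would rather keep the by() call, its result can also be reshaped in base R. The following is only a minimal sketch: toy, score and one_number() are made-up stand-ins (not your data and not catt()), and it assumes the function returns a single number per group, as the original catt() does.

```r
# Toy data: the column name snp_id follows the question, the values are invented.
toy <- data.frame(
  snp_id = c("snpA", "snpA", "snpB", "snpB", "snpB"),
  score  = c(0.2, 0.4, 0.1, 0.5, 0.9),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for catt(): any function returning one number per group.
one_number <- function(d) mean(d$score)

send <- by(toy, toy$snp_id, one_number)

# names() recovers the group labels; as.vector() strips the "by" attributes
# and leaves the plain numeric values in the same order.
result <- data.frame(
  snp_id = names(send),
  pvalue = as.vector(send),
  stringsAsFactors = FALSE
)
print(result)
```

This gives one row per group, which is the layout asked for in the question.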

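You will need to adjust your function so that it handles sn_id groups that do not have enough data; in your example data frame, catt() runs without error only for sn_id == chr1 69511.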
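One possible way to make that adjustment, without touching the body of catt(), is to wrap it so that an error becomes NA. This is just a sketch: na_on_error() and strict_test() are hypothetical names introduced for illustration, and strict_test() merely imitates a function that fails on small groups.

```r
# Hypothetical helper: wrap a test function so that any error yields NA
# instead of stopping the whole grouped computation.
na_on_error <- function(f) {
  function(...) tryCatch(f(...), error = function(e) NA_real_)
}

# Stand-in for catt(): fails when a group has fewer than 3 observations.
strict_test <- function(y, x) {
  if (length(y) < 3) stop("not enough data in this group")
  mean(y) - mean(x)
}

safe_test <- na_on_error(strict_test)

safe_test(c(1, 0, 1), c(1, 2, 1))  # enough data: the value is returned as usual
safe_test(1, 2)                    # too little data: NA instead of an error
```

Assuming catt() is defined as in the question, the same wrapper could be used inside the grouped call, for example na_on_error(catt)(is_severe, encoding). Note that this hides every error, not only the ones caused by small groups.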

In general, though, the output looks like this, with one row in the frame for each sn_id value:

        sn_id   p.value
       <char>     <num>
1: chr1 69511 0.2482131
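Regarding the edit (getting chisq as well as the p-value): when the function returns a named list, as catt_2() does, each list element can become its own column. Below is a base R sketch of that idea. The data and two_numbers() are invented stand-ins, so the numbers it produces are not real test statistics.

```r
# Toy data: column names follow the question, the values are invented.
toy <- data.frame(
  snp_id   = c("snpA", "snpA", "snpA", "snpB", "snpB", "snpB"),
  is_sever = c(1, 0, 1, 0, 0, 1),
  encoding = c(2, 0, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for catt_2(): returns a named list of two numbers.
two_numbers <- function(y, x) {
  list(chisq = sum((y - x)^2), p.value = mean(y == x))
}

# One single-row data frame per group; the list elements become columns.
rows <- lapply(split(toy, toy$snp_id), function(d) {
  data.frame(
    snp_id = d$snp_id[1],
    two_numbers(d$is_sever, d$encoding),
    stringsAsFactors = FALSE
  )
})

# Stack the per-group rows and drop the group names used as row names.
result <- do.call(rbind, rows)
rownames(result) <- NULL
print(result)
```

With data.table, j may return a list, so merged_df_normal[, catt_2(is_sever, encoding), by = snp_id] should give the same two columns; I have not run that on your data.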
