r - Create a square preference matrix from a data.table



I am trying to create a square matrix of preferences or counts (which of the two does not really matter).

Suppose I have the following data.table to work with:

library(data.table)
segment=c("track","track","track","round","round","sprint","sprint","sprint","sprint")
athlete=c("gunnar","brandon","raphael","gunnar","ben","brandon","raphael","ben","gunnar")
time=c(54,56,57,23,25,15,16,16,17)
df <- data.table(athlete,segment,time)
df[,time_diff:=min(time)-time,by=segment]
df[,winner:=athlete[1],by=segment]
    athlete segment time time_diff  winner
 1:  gunnar   track   54         0  gunnar
 2: brandon   track   56        -2  gunnar
 3: raphael   track   57        -3  gunnar
 4: raphael   round   23         0 raphael
 5:     ben   round   25        -2 raphael
 6: brandon   round   28        -5 raphael
 7: brandon  sprint   15         0 brandon
 8: raphael  sprint   16        -1 brandon
 9:     ben  sprint   19        -4 brandon
10:  gunnar  sprint   26       -11 brandon
names <- unique(df$athlete)
[1] "gunnar"  "brandon" "raphael" "ben" 

Now I would like a square matrix over the athletes that shows their time difference to the winner of each segment, something like this:

        gunnar  brandon  raphael  ben
gunnar     0     -11        0      0       
brandon   -2       0       -5      0
raphael   -3      -1        0      0
ben       -2      -4        0      0

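I have a few ideas for solving this, but none of my attempts has worked. I come from a Matlab background, where I would simply iterate, but I feel that is not the data.table way.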

I feel I should be able to do this with a foreach iteration over the athletes, something along the lines of:
foreach(n=1:length(names)) %do% df[athlete==names[n],.(time_diff, winner),by=segment][,.(pref=sum(time_diff)),by=winner]
[[1]]
    winner pref
1:  gunnar    0
2: brandon  -11
[[2]]
    winner pref
1:  gunnar   -2
2: raphael   -5
3: brandon    0
[[3]]
    winner pref
1:  gunnar   -3
2: raphael    0
3: brandon   -1
[[4]]
    winner pref
1: raphael   -2
2: brandon   -4

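At this point, however, I am not sure how to proceed. My initial idea was to create a vector of the appropriate length, vec <- vector(mode="double", length=length(names)), and then index into it with which(names %in% df[,winner,by=IREALLYDONTKNOW]), but as you can see, I am not yet clear on how to do that properly.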

If anyone could give me some hints on the proper data.table approach, I would be very grateful.

Running your code does not produce the printed table. That aside, I think what you are looking for is dcast.data.table:

dt_compare <- dcast.data.table(df, athlete ~ winner, value.var = "time_diff")
# add zero columns for athletes that did not win
dt_compare[, setdiff(unique(athlete), names(dt_compare)) := 0]
# you can also reorder columns
setcolorder(dt_compare, c("athlete", dt_compare[["athlete"]]))
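
One caveat with the call above: if an athlete meets the same winner in more than one segment, dcast falls back to counting rows, and athlete/winner pairs that never occur come out as NA. A minimal sketch that sums the differences and fills the gaps with zeros follows. It assumes the rows shown in the printed table of the question (not the rows its construction code builds), and the helper name missing_cols is made up for illustration.

```r
library(data.table)

# Assumption: data copied from the printed table in the question.
df <- data.table(
  athlete = c("gunnar", "brandon", "raphael", "raphael", "ben", "brandon",
              "brandon", "raphael", "ben", "gunnar"),
  segment = c(rep("track", 3), rep("round", 3), rep("sprint", 4)),
  time    = c(54, 56, 57, 23, 25, 28, 15, 16, 19, 26)
)
df[, time_diff := min(time) - time, by = segment]
# Pick the fastest athlete per segment instead of relying on row order.
df[, winner := athlete[which.min(time)], by = segment]

# Sum time differences per (athlete, winner) pair; absent pairs become 0.
dt_compare <- dcast(df, athlete ~ winner, value.var = "time_diff",
                    fun.aggregate = sum, fill = 0)

# Add a zero column for every athlete who never won a segment.
missing_cols <- setdiff(dt_compare[["athlete"]], names(dt_compare))
if (length(missing_cols) > 0L) dt_compare[, (missing_cols) := 0]

# Put the columns in the same order as the rows to get a square layout.
setcolorder(dt_compare, c("athlete", dt_compare[["athlete"]]))
print(dt_compare)
```

The result is still a data.table with an athlete column; as.matrix(dt_compare, rownames = "athlete") should turn it into a plain numeric matrix if that is what is needed.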

The way I solved it turned out to be quite easy once I realized how:

names <- unique(df$athlete)
vec <- matrix(data = 0,nrow=length(names),ncol=length(names),dimnames=list(names,names))
pref <- foreach(n=1:length(names)) %do% df[athlete==names[n],.(time_diff, winner),by=segment][,.(pref=sum(time_diff)),by=winner]
foreach(n=1:length(names)) %do% (vec[names[n],pref[[n]]$winner] <- pref[[n]]$pref)
> vec
        gunnar brandon raphael ben
gunnar       0     -11       0   0
brandon     -2       0      -5   0
raphael     -3      -1       0   0
ben          0      -4      -2   0
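
The same matrix can probably be filled without foreach: aggregate once by athlete and winner, then assign through a two-column character matrix, which base R matches against the dimnames. This is only a sketch, again assuming the rows from the printed table in the question; the names agg and athletes are introduced here for illustration.

```r
library(data.table)

# Assumption: data copied from the printed table in the question.
df <- data.table(
  athlete = c("gunnar", "brandon", "raphael", "raphael", "ben", "brandon",
              "brandon", "raphael", "ben", "gunnar"),
  segment = c(rep("track", 3), rep("round", 3), rep("sprint", 4)),
  time    = c(54, 56, 57, 23, 25, 28, 15, 16, 19, 26)
)
df[, time_diff := min(time) - time, by = segment]
df[, winner := athlete[which.min(time)], by = segment]

# One row per (athlete, winner) pair with the summed time difference.
agg <- df[, .(pref = sum(time_diff)), by = .(athlete, winner)]

# Square zero matrix with athletes on both dimensions.
athletes <- unique(df$athlete)
vec <- matrix(0, nrow = length(athletes), ncol = length(athletes),
              dimnames = list(athletes, athletes))

# Each row of the index matrix is a (row name, column name) pair.
vec[cbind(agg$athlete, agg$winner)] <- agg$pref
print(vec)
```

Grouping in a single data.table call replaces the per-athlete loop, and the matrix assignment handles all pairs at once.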
