How to apply expand.grid iteratively



I am writing a piece of R code and I am stuck.

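Background (not needed to solve the problem): I am computing joint probabilities by multiplying independent marginal distributions. The marginal probability vectors are generated iteratively by ProbGenerationProcess(). Each iteration outputs one named vector, for example: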

Iteration 1:
Color =
   Blue  Green
   0.2    0.8   
Iteration 2:
Material =
   Cotton  Silk
    0.7     0.3
Iteration 3:
Country =
   China     USA
    0.6      0.4
......

Desired result: I want each joint probability to be the product of one element from every marginal vector. The format should look like this:

Color   Material  Country   Prob
Blue    Cotton     China    0.084  (= 0.2*0.7*0.6)
Blue    Cotton     USA      0.056  (= 0.2*0.7*0.4)
Blue    Silk       China    0.036  (= 0.2*0.3*0.6)
Blue    Silk       USA      ..
Green   Cotton     China    ..
Green   Cotton     USA      ..
...     ...        ...      ...

My implementation: here is my code:

joint.names = NULL  # data.frame to store the marginal value names
joint.probs = NULL  # store probabilities.
for (i in iterations) {
    marginal = ProbGenerationProcess(VarUniqueToIteration) # output is numeric with names
    if ( is.null(joint.names) ) {
        # initialize the dataframes
        joint.names = names(marginal)
        joint.probs = marginal
    } else {
        # (my hope:) iteratively populate the joint.names and joint.probs
        joint.names = expand.grid(joint.names, names(marginal))
        expanded.prob = expand.grid(joint.probs, marginal)
        joint.probs = expanded.prob$Var1 * expanded.prob$Var2 # Row-by-row multiplication.
    }
}

Output: joint.probs always comes out correct, but joint.names does not behave the way I want. After the first two iterations everything is fine, and I get:

joint.names = 
    Var1  Var2
1   Blue  Cotton
2   Green Cotton
3   Blue  Silk
4   Green Silk 
    ...   ...

From the third iteration on, it goes wrong:

joint.names =
    Var1.Var1  Var1.Var2  Var1.Var1.1  Var1.Var2.1  Var2
1   Blue       Cotton     Blue         Cotton       China 
2   Green      Cotton     Green        Cotton       China
3   Blue       Silk       Blue         Silk         USA
4   Green      Silk       Green        Silk         USA

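I guess my first question is: is this the most efficient way to get the result I want? If so, is expand.grid() the function I should be using, and how should I initialize it properly?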

Any help is appreciated!

merge() is your friend.

color <- data.frame(color=c("blue","green"),prob=c(0.2,0.8))
material <- data.frame(material=c("cotton","silk"),prob=c(0.7,0.3))
country <- data.frame(country=c("china","usa"),prob=c(0.6,0.4))
dat <- merge(merge(color[1],material[1]),country[1]) # get names first
# same as: expand.grid(c("china","usa"),c("cotton","silk"),c("blue","green"))
dat <- merge(dat, color, by="color")
dat <- merge(dat, material, by="material")
dat <- merge(dat, country, by="country")
dat$joint <- dat$prob.x * dat$prob.y * dat$prob # joint calc
dat <- dat[-grep("^prob",colnames(dat))] # cleanup extra probs

Result:

  country material color joint
1   china   cotton  blue 0.084
2   china   cotton green 0.336
3   china     silk  blue 0.036
4   china     silk green 0.144
5     usa   cotton  blue 0.056
6     usa   cotton green 0.224
7     usa     silk  blue 0.024
8     usa     silk green 0.096
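
When the number of marginals is not known in advance, as in the loop from the question, the pairwise merges can be folded over a list. The following is a minimal sketch rather than part of the answer above: the helper name `cross_join_step` and the list `marginals` are made up for illustration, and it assumes each marginal is a two-column data frame whose probability column is named `prob`.

```r
# Hypothetical inputs: one data frame per marginal, each with a `prob` column.
marginals <- list(
  data.frame(color    = c("blue", "green"),  prob = c(0.2, 0.8)),
  data.frame(material = c("cotton", "silk"), prob = c(0.7, 0.3)),
  data.frame(country  = c("china", "usa"),   prob = c(0.6, 0.4))
)

# Cross join two tables (by = NULL gives the Cartesian product) and
# collapse their two probability columns into a single `prob` column.
cross_join_step <- function(acc, nxt) {
  out <- merge(acc, nxt, by = NULL, suffixes = c(".acc", ".nxt"))
  out$prob <- out$prob.acc * out$prob.nxt
  out[setdiff(names(out), c("prob.acc", "prob.nxt"))]
}

# Fold the step over all marginals, left to right.
joint <- Reduce(cross_join_step, marginals)
joint
```

Because the running product is stored back into `prob` at every step, the intermediate table never carries more than one probability column, so no cleanup with grep() is needed at the end.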

Similarly, you could do:

PROBS<-data.frame(Item=rep(c("Color","Material","Country"),each=2),
           Value=c("Blue","Green","Cotton","Silk","China","USA"),
           Prob=c(0.2,0.8,0.7,0.3,0.6,0.4))
rownames(PROBS)<-PROBS$Value
# split() yields one plain vector of values per item, which expand.grid() accepts
GRID<-expand.grid(split(PROBS$Value,PROBS$Item),stringsAsFactors=FALSE)
GRID$probs<-apply(GRID,1,function(x)prod(PROBS[c(x),"Prob"]))
GRID
#  Color Country Material probs
#1  Blue   China   Cotton 0.084
#2 Green   China   Cotton 0.336
#3  Blue     USA   Cotton 0.056
#4 Green     USA   Cotton 0.224
#5  Blue   China     Silk 0.036
#6 Green   China     Silk 0.144
#7  Blue     USA     Silk 0.024
#8 Green     USA     Silk 0.096
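
To tie this back to the loop in the question: one option is to collect the named marginal vectors in a list while iterating and call expand.grid() once at the end, instead of re-expanding a data frame on every pass. A minimal sketch under that assumption follows. The hard-coded `marginals` list stands in for the outputs of ProbGenerationProcess(), which the question does not show, and the names `prob.grid` and `joint` are illustrative only.

```r
# Stand-in for the named numeric vectors produced inside the loop.
marginals <- list(
  Color    = c(Blue = 0.2, Green = 0.8),
  Material = c(Cotton = 0.7, Silk = 0.3),
  Country  = c(China = 0.6, USA = 0.4)
)

# One grid of labels and one grid of probabilities, built in the same row order.
joint <- expand.grid(lapply(marginals, names), stringsAsFactors = FALSE)
prob.grid <- expand.grid(lapply(marginals, unname))

# Multiply the columns of prob.grid element-wise to get the joint probability.
joint$Prob <- Reduce(`*`, prob.grid)
joint
```

Passing a list of plain vectors keeps one column per variable, so the label columns come out named Color, Material and Country however many marginals the loop produces.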
