This is an example of the operation I am trying to perform on a different dataset, but it still does not work:
PORT STATUS VESSEL DWT IMP/EXP QTY (Mts)
1 KANDLA SAILED CAPTAIN HAMADA 7938 EXP 4500
2 KAKINADA EXPECTED CELON BREEZE IMP 30000
3 KAKINADA BERTH CELON BREEZE IMP 3000
4 KAKINADA SAILED CELON BREEZE IMP 30000
5 KANDLA ANCHORAGE CAPTAIN HAMADA EXP 4500
6 KAKINADA BERTH CELON BREEZE IMP 30000
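I want to compare each row with the other rows on PORT, VESSEL and IMP/EXP and, where they match, delete rows according to a priority order on STATUS. For example, if IMP/EXP is "IMP", the priority order is: sailed > berth > anchorage > expected. STATUS = SAILED gets the highest priority, so row 2 is deleted because its QTY, PORT and VESSEL match row 4. The same applies to the other combinations: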
1) If one row has STATUS = SAILED and another has BERTH, delete the BERTH row.
2) If one row has STATUS = SAILED and another has EXPECTED, delete the EXPECTED row.
3) If one row has STATUS = BERTH and another has ANCHORAGE, delete the ANCHORAGE row.
4) If one row has STATUS = EXPECTED and another has SAILED, delete the EXPECTED row.
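And so on. Rows are deleted under these conditions only when QTY, PORT and VESSEL match.
For EXP in IMP/EXP, rows must match on the same columns (QTY, PORT, VESSEL).
The priority order on STATUS in that case is:
priority: sailed > anchorage > expected > berth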
The output should be:
PORT STATUS VESSEL DWT IMP/EXP QTY (Mts)
1 KANDLA SAILED CAPTAIN HAMADA 7938 EXP 4500
3 KAKINADA BERTH CELON BREEZE IMP 3000
4 KAKINADA SAILED CELON BREEZE IMP 30000
That is, deleting rows 2, 5 and 6 gives the desired output.
First, you need to read the data into a data.frame in R. The data frame test should look like this:
>test
# PORT STATUS VESSEL DWT IMPEXP QTY
#1 KANDLA SAILED CAPTAIN HAMADA 7938 EXP 4500
#2 KAKINADA EXPECTED CELON BREEZE NA IMP 30000
#3 KAKINADA BERTH CELON BREEZE NA IMP 3000
#4 KAKINADA SAILED CELON BREEZE NA IMP 30000
#5 KANDLA ANCHORAGE CAPTAIN HAMADA NA EXP 4500
#6 KAKINADA BERTH CELON BREEZE NA IMP 30000
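In case it helps, here is a minimal sketch of building that data frame by hand instead of reading it from a file. The values are copied from the table above; the assumption that a blank DWT means NA, and the use of stringsAsFactors = FALSE, are mine.

```r
# Rebuild the example data by hand.
# Assumption: DWT is unknown (NA) for every row except the first.
test <- data.frame(
  PORT   = c("KANDLA", "KAKINADA", "KAKINADA", "KAKINADA", "KANDLA", "KAKINADA"),
  STATUS = c("SAILED", "EXPECTED", "BERTH", "SAILED", "ANCHORAGE", "BERTH"),
  VESSEL = c("CAPTAIN HAMADA", "CELON BREEZE", "CELON BREEZE",
             "CELON BREEZE", "CAPTAIN HAMADA", "CELON BREEZE"),
  DWT    = c(7938, NA, NA, NA, NA, NA),
  IMPEXP = c("EXP", "IMP", "IMP", "IMP", "EXP", "IMP"),
  QTY    = c(4500, 30000, 3000, 30000, 4500, 30000),
  stringsAsFactors = FALSE  # keep text columns as plain character vectors
)

print(test)
```

The column is named IMPEXP rather than IMP/EXP because a slash is not valid in a syntactic R column name.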
Using the ddply function from the plyr package, you should be able to get the desired output with the following function:
library(plyr)

# For each PORT/VESSEL/IMPEXP/QTY group, keep only the row with the highest-priority STATUS.
ddply(test, .variables = c("PORT", "VESSEL", "IMPEXP", "QTY"),
      function(t) {
        # Levels run from lowest to highest priority; the order differs for IMP and EXP.
        lv <- if (t$IMPEXP[1] == "IMP") {
          c("EXPECTED", "ANCHORAGE", "BERTH", "SAILED")
        } else {
          c("BERTH", "EXPECTED", "ANCHORAGE", "SAILED")
        }
        prio <- as.integer(factor(t$STATUS, levels = lv, ordered = TRUE))
        t[which.max(prio), ]
      })
#PORT STATUS VESSEL DWT IMPEXP QTY
#1 KAKINADA BERTH CELON BREEZE NA IMP 3000
#2 KAKINADA SAILED CELON BREEZE NA IMP 30000
#3 KANDLA SAILED CAPTAIN HAMADA 7938 EXP 4500
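If you would rather not depend on plyr, the same idea can be sketched in base R. This is only an illustration under the same assumptions as above: the names priority, prio and kept are mine, and the data frame is rebuilt inline so the snippet runs on its own. It ranks every row by the order that applies to its IMPEXP value, sorts so the best row of each group comes first, and drops the later duplicates.

```r
# Example data, copied from the question (blank DWT assumed to be NA).
test <- data.frame(
  PORT   = c("KANDLA", "KAKINADA", "KAKINADA", "KAKINADA", "KANDLA", "KAKINADA"),
  STATUS = c("SAILED", "EXPECTED", "BERTH", "SAILED", "ANCHORAGE", "BERTH"),
  VESSEL = c("CAPTAIN HAMADA", "CELON BREEZE", "CELON BREEZE",
             "CELON BREEZE", "CAPTAIN HAMADA", "CELON BREEZE"),
  DWT    = c(7938, NA, NA, NA, NA, NA),
  IMPEXP = c("EXP", "IMP", "IMP", "IMP", "EXP", "IMP"),
  QTY    = c(4500, 30000, 3000, 30000, 4500, 30000),
  stringsAsFactors = FALSE
)

# Status orders from lowest to highest priority, one per IMPEXP value.
priority <- list(
  IMP = c("EXPECTED", "ANCHORAGE", "BERTH", "SAILED"),
  EXP = c("BERTH", "EXPECTED", "ANCHORAGE", "SAILED")
)

# Rank of each row's STATUS within the order for that row's IMPEXP value.
prio <- mapply(function(s, ie) match(s, priority[[ie]]),
               test$STATUS, test$IMPEXP, USE.NAMES = FALSE)

# Sort so the highest-ranked row of each group comes first,
# then keep only the first row per PORT/VESSEL/IMPEXP/QTY group.
key    <- c("PORT", "VESSEL", "IMPEXP", "QTY")
sorted <- test[order(test$PORT, test$VESSEL, test$IMPEXP, test$QTY, -prio), ]
kept   <- sorted[!duplicated(sorted[key]), ]

# Restore the original row order so the result lines up with the question.
kept <- kept[order(as.integer(rownames(kept))), ]
print(kept)
```

Unlike the ddply result, this keeps the original row names, so the surviving rows are labelled 1, 3 and 4 as in the expected output of the question.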