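I'm new to R. I'm trying to use the "openNLP" package (note that "udpipe" does not work in my environment). I have a sentence in which the words and their POS tags are mixed together, like this:
"Doing/VBG work/NN as/IN always/RB ./. playing/VBG soccer/NN is/VBZ good/JJ ./. I/PRP do/VBP that/IN"
How can I extract the verbs, without their POS tags? For this example, the answer I'm trying to get is
"Doing" "playing" "is" "do"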
Here is the example you asked for:
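# Tagged input: every token has the form word/TAG
x <- "Doing/VBG work/NN as/IN always/RB ./. playing/VBG soccer/NN is/VBZ good/JJ ./. I/PRP do/VBP that/IN"
# Split the string into word/TAG tokens on spaces
x <- strsplit(x, split = " ")
x <- unlist(x)
# Split each token into word and tag; fixed = TRUE matches "/" literally
# (the original "\/" is not a valid escape in an R string and raises an error)
x <- lapply(x, FUN = function(data){
  x <- strsplit(data, split = "/", fixed = TRUE)
  x <- unlist(x)
  data.frame(token = x[1], xpos = x[2], stringsAsFactors = FALSE)
})
# Stack the one-row data frames into a single token/xpos table
x <- do.call(rbind, x)
# Keep only the rows carrying a Penn Treebank verb tag
subset(x, xpos %in% c("VB","VBD","VBG","VBN","VBP","VBZ"))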
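If only the verbs themselves are needed as a character vector, which is the form asked for in the question, a shorter base R variant may be enough. This is a minimal sketch, assuming every token has the form word/TAG and tokens are separated by single spaces; the names `tagged`, `tokens`, `is_verb` and `verbs` are illustrative and not part of the original answer.

```r
# Same tagged sentence as in the question
tagged <- "Doing/VBG work/NN as/IN always/RB ./. playing/VBG soccer/NN is/VBZ good/JJ ./. I/PRP do/VBP that/IN"

# Split into word/TAG tokens on literal spaces
tokens <- strsplit(tagged, " ", fixed = TRUE)[[1]]

# A token is a verb if its tag is VB, VBD, VBG, VBN, VBP or VBZ
is_verb <- grepl("/VB[DGNPZ]?$", tokens)

# Drop the trailing "/TAG" part from the verb tokens
verbs <- sub("/[^/]*$", "", tokens[is_verb])
verbs
# [1] "Doing"   "playing" "is"      "do"
```

The pattern is anchored at the end of each token, so a slash inside a word would not be mistaken for the tag separator.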
Using udpipe
library(udpipe)
txt <- c(doc1 = "Doing work as always. playing soccer is good. I do that")
x <- udpipe(txt, object = "english", udpipe_model_repo = "bnosac/udpipe.models.ud", trace = 100)
subset(x, xpos %in% c("VB","VBD","VBG","VBN","VBP","VBZ"))
> subset(x, xpos %in% c("VB","VBD","VBG","VBN","VBP","VBZ"))
doc_id paragraph_id sentence_id sentence start end term_id token_id token lemma upos xpos
1 doc1 1 1 Doing work as always. 1 5 1 1 Doing do VERB VBG
6 doc1 1 2 playing soccer is good. 23 29 6 1 playing play VERB VBG
8 doc1 1 2 playing soccer is good. 38 39 8 3 is be AUX VBZ
12 doc1 1 3 I do that 49 50 12 2 do do VERB VBP
feats head_token_id dep_rel deps misc
1 VerbForm=Ger 0 root <NA> <NA>
6 VerbForm=Ger 4 csubj <NA> <NA>
8 Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 4 cop <NA> <NA>
12 Mood=Ind|Tense=Pres|VerbForm=Fin 0 root <NA> <NA>