I am trying to run this small script in R:
library(tm) # provides DocumentTermMatrix() and weightBin
minimumFrequency <- 10
datadtm <- DocumentTermMatrix(datacorpusclean,
control = list(bounds = list(global = c(1, Inf)), weighting = weightBin))
# convert dtm into sparse matrix
datasdtm <- Matrix::sparseMatrix(i = datadtm$i, j = datadtm$j,
x = datadtm$v,
dims = c(datadtm$nrow, datadtm$ncol),
dimnames = dimnames(datadtm))
# calculate co-occurrence counts
coocurrences <- t(datasdtm) %*% datasdtm
# convert into matrix
collocates <- as.matrix(coocurrences)
source("https://slcladal.github.io/rscripts/calculateCoocStatistics.R")
coocTerm <- "selection"
# calculate co-occurrence statistics
coocs <- calculateCoocStatistics(coocTerm, datasdtm, measure="LOGLIK")
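But on the last line I get this error:
Error in intI(j, n = x@Dim[2], dn[[2]], give.dn = FALSE) : invalid character indexing
I am not an expert in R. Can someone explain why this happens, and what exactly it means?
It means that you are somehow trying to extract a column, by name, that does not exist in the sparse matrix. Here is one way to reproduce the problem: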
library(Matrix)
dd <- data.frame(a = gl(3,4), b = gl(4,1,12)) # balanced 2-way
options("contrasts") # the default: "contr.treatment"
x <- sparse.model.matrix(~ a + b, dd)
x[,"a2"] # works
# 1 2 3 4 5 6 7 8 9 10 11 12
# 0 0 0 0 1 1 1 1 0 0 0 0
x[,"fails"] # fails
#Error in intI(j, n = x@Dim[2], dn[[2]], give.dn = FALSE) :
# invalid character indexing
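
To find out whether this is what happens in your case, you can check that the term is actually a column name before indexing with it. Below is a minimal sketch on a toy matrix. The data, the helper name has_term and the "^select" pattern are made up for illustration, and the idea that your cleaning step (for example stemming) changed "selection" into something else is only a guess, not something your question confirms.

```r
library(Matrix)

# Toy binary document-term matrix: 3 documents, 3 terms (hypothetical data).
m <- sparseMatrix(i = c(1, 2, 2, 3), j = c(1, 1, 2, 3), x = rep(1, 4),
                  dims = c(3, 3),
                  dimnames = list(paste0("doc", 1:3),
                                  c("select", "natur", "speci")))

# Hypothetical helper: TRUE if `term` is one of the column names of `mat`.
has_term <- function(mat, term) term %in% colnames(mat)

coocTerm <- "selection"

# Column names starting with "select", useful for spotting a stemmed
# or otherwise altered form of the term.
candidates <- grep("^select", colnames(m), value = TRUE)

if (has_term(m, coocTerm)) {
  termColumn <- m[, coocTerm]
} else {
  # m[, coocTerm] would stop here with "invalid character indexing".
  message("'", coocTerm, "' is not a column; similar terms: ",
          paste(candidates, collapse = ", "))
}
```

On your own objects the equivalent check would be coocTerm %in% colnames(datasdtm). If that returns FALSE, look through colnames(datasdtm) for the form of the word that survived preprocessing and use that as coocTerm.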