R: processing multiple text files with scan

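I have code that works for me (it comes from Jockers's Text Analysis with R for Students of Literature). What I need now is to automate it: I have to run the "ProcessingSection" on up to 30 separate text files. How can I do that? Is it possible to build a table or data frame holding 30 "text.v" entries, one for each scan("*.txt")?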

Thank you very much for your help!

# Chapter 5 Start up code
setwd("D:/work/cpd/R/Projects/5/")
text.v <- scan("pupil-14.txt", what="character", sep="\n")
length(text.v)

#ProcessingSection
text.lower.v <- tolower(text.v)
mars.words.l <- strsplit(text.lower.v, "\\W")
mars.word.v <- unlist(mars.words.l)
#remove blanks
not.blanks.v <- which(mars.word.v!="")
not.blanks.v
#create a new vector to store the individual words
mars.word.v <- mars.word.v[not.blanks.v]
mars.word.v

It is hard to help because your example is not reproducible.

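Assuming you are happy with the result of your processing section, you can turn that part of the code into a function that takes a single argument: the result of scan.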

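# x: character vector of lines, as returned by scan()
processing_section <- function(x){
  # lowercase, split on non-word characters, flatten to one vector
  unlist(strsplit(tolower(x), "\\W"))
}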
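
Note that the function above keeps the empty strings that strsplit produces between adjacent non-word characters, whereas the question's processing section removes them. A minimal sketch of a variant that drops them as well; the name `processing_section_clean` is hypothetical and not part of the original answer:

```r
# Hypothetical variant: lowercase, split on non-word characters,
# then drop the empty strings left by consecutive delimiters.
processing_section_clean <- function(x) {
  words <- unlist(strsplit(tolower(x), "\\W"))
  words[words != ""]
}

# Usage on an in-memory character vector instead of a scanned file
processing_section_clean(c("Hello, world!", "A second line."))
```

Logical indexing with `words != ""` does the same job as the `which()` step in the question, just in one line.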

Then, if all the .txt files are in the current working directory, you should be able to list them and apply this function to each one:

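# match only names ending in ".txt" (an unescaped "." matches any character)
lf <- list.files(pattern="\\.txt$")
# scan each file line by line and process it; returns a list with one element per file
lapply(lf, function(path) processing_section(scan(path, what="character", sep="\n")))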

Is this what you want?
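
Since the question also asks about a table or data frame, here is a hedged sketch of one way to get there: name each list element after its file, then stack the words into a long data frame. The sample files, the temporary directory, and the names `process_file`, `words.l` and `words.df` are all assumptions made so the example runs on its own; in practice you would point `list.files` at your own folder.

```r
# Self-contained sketch: write two small sample files to a temporary
# directory so the example runs without the original corpus.
demo_dir <- file.path(tempdir(), "scan_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("The pupil was young.", "He read a lot."), file.path(demo_dir, "a.txt"))
writeLines("Mars is red; Mars is far.", file.path(demo_dir, "b.txt"))

# Same steps as the question's processing section, wrapped in one function
process_file <- function(path) {
  text.v <- scan(path, what = "character", sep = "\n", quiet = TRUE)
  words <- unlist(strsplit(tolower(text.v), "\\W"))
  words[words != ""]
}

# List the files, process each one, and name the results by file
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
words.l <- lapply(files, process_file)
names(words.l) <- basename(files)

# Long-format data frame: one row per word, tagged with its source file
words.df <- data.frame(
  file = rep(names(words.l), lengths(words.l)),
  word = unlist(words.l, use.names = FALSE),
  stringsAsFactors = FALSE
)
```

A named list keeps each text separate (for example `words.l[["a.txt"]]`), which suits texts of different lengths; the long data frame is convenient when you want per-file counts, for instance with `table(words.df$file, words.df$word)`.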
