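I would like to go through a package and find out who the authors mentioned in each function's help file are.
I looked for a function that extracts elements from R's help files but could not find one. The closest thing I found was this post by Noam Ross.
Does such a function exist? (If not, I suppose I will hack Noam's code so that it parses the Rd file and finds the specific element I am interested in.)
Thanks, Tal.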
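Potential code example:
get_field_from_r_help(topic="lm", field = "Description")
# output:
"lm" is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although "aov" may provide a more convenient interface for these).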
Duncan Murdoch's document on parsing Rd files will be helpful, as will this SO post.
Based on these, you could try something like the following:
getauthors <- function(package){
    # parsed Rd objects for every help page in the package
    db <- tools::Rd_db(package)
    authors <- lapply(db, function(x) {
        # top-level tags of this help page, e.g. "\\title", "\\author"
        tags <- tools:::RdTags(x)
        if("\\author" %in% tags){
            # return a crazy list of results
            #out <- x[which(tags=="\\author")]
            # return something a little cleaner
            out <- paste(unlist(x[which(tags=="\\author")]),collapse="")
        }
        else
            out <- NULL
        invisible(out)
    })
    gsub("\n","",unlist(authors)) # further cleanup: strip newlines
}
We can then run this on a package or two:
> getauthors("knitr")
d:/RCompile/CRANpkg/local/3.0/knitr/man/eclipse_theme.Rd
" Ramnath Vaidyanathan"
d:/RCompile/CRANpkg/local/3.0/knitr/man/image_uri.Rd
" Wush Wu and Yihui Xie"
d:/RCompile/CRANpkg/local/3.0/knitr/man/imgur_upload.Rd
" Yihui Xie, adapted from the imguR package by Aaron Statham"
d:/RCompile/CRANpkg/local/3.0/knitr/man/knit2pdf.Rd
" Ramnath Vaidyanathan, Alex Zvoleff and Yihui Xie"
d:/RCompile/CRANpkg/local/3.0/knitr/man/knit2wp.Rd
" William K. Morris and Yihui Xie"
d:/RCompile/CRANpkg/local/3.0/knitr/man/knit_theme.Rd
" Ramnath Vaidyanathan and Yihui Xie"
d:/RCompile/CRANpkg/local/3.0/knitr/man/knitr-package.Rd
" Yihui Xie <http://yihui.name>"
d:/RCompile/CRANpkg/local/3.0/knitr/man/read_chunk.Rd
" Yihui Xie; the idea of the second approach came from Peter Ruckdeschel (author of the SweaveListingUtils package)"
d:/RCompile/CRANpkg/local/3.0/knitr/man/read_rforge.Rd
" Yihui Xie and Peter Ruckdeschel"
d:/RCompile/CRANpkg/local/3.0/knitr/man/rst2pdf.Rd
" Alex Zvoleff and Yihui Xie"
d:/RCompile/CRANpkg/local/3.0/knitr/man/spin.Rd
" Yihui Xie, with the original idea from Richard FitzJohn (who named it as sowsear() which meant to make a silk purse out of a sow's ear)"
And perhaps tools:
> getauthors("tools")
D:/murdoch/recent/R64-3.0/src/library/tools/man/bibstyle.Rd
" Duncan Murdoch"
D:/murdoch/recent/R64-3.0/src/library/tools/man/checkPoFiles.Rd
" Duncan Murdoch"
D:/murdoch/recent/R64-3.0/src/library/tools/man/checkRd.Rd
" Duncan Murdoch, Brian Ripley"
D:/murdoch/recent/R64-3.0/src/library/tools/man/getDepList.Rd
" Jeff Gentry "
D:/murdoch/recent/R64-3.0/src/library/tools/man/HTMLlinks.Rd
"Duncan Murdoch, Brian Ripley"
D:/murdoch/recent/R64-3.0/src/library/tools/man/installFoundDepends.Rd
"Jeff Gentry"
D:/murdoch/recent/R64-3.0/src/library/tools/man/makeLazyLoading.Rd
"Luke Tierney and Brian Ripley"
D:/murdoch/recent/R64-3.0/src/library/tools/man/parse_Rd.Rd
" Duncan Murdoch "
D:/murdoch/recent/R64-3.0/src/library/tools/man/parseLatex.Rd
"Duncan Murdoch"
D:/murdoch/recent/R64-3.0/src/library/tools/man/Rd2HTML.Rd
" Duncan Murdoch, Brian Ripley"
D:/murdoch/recent/R64-3.0/src/library/tools/man/Rd2txt_options.Rd
"Duncan Murdoch"
D:/murdoch/recent/R64-3.0/src/library/tools/man/RdTextFilter.Rd
" Duncan Murdoch"
D:/murdoch/recent/R64-3.0/src/library/tools/man/SweaveTeXFilter.Rd
"Duncan Murdoch"
D:/murdoch/recent/R64-3.0/src/library/tools/man/texi2dvi.Rd
" Originally Achim Zeileis but largely rewritten by R-core."
D:/murdoch/recent/R64-3.0/src/library/tools/man/tools-package.Rd
" Kurt Hornik and Friedrich Leisch Maintainer: R Core Team R-core@r-project.org"
D:/murdoch/recent/R64-3.0/src/library/tools/man/vignetteDepends.Rd
" Jeff Gentry "
D:/murdoch/recent/R64-3.0/src/library/tools/man/vignetteEngine.Rd
"Duncan Murdoch and Henrik Bengtsson."
D:/murdoch/recent/R64-3.0/src/library/tools/man/writePACKAGES.Rd
" Uwe Ligges and R-core."
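Some functions have no author field, so those entries are simply dropped when unlist is called at the end of getauthors, but the code could be modified slightly to return NULL values for them.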
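One way to keep the pages that have no author field is sketched below. It returns NA rather than NULL for them, because unlist drops NULL entries. The name get_authors_keep_missing is made up for illustration, and the tag lookup is done by hand on the "Rd_tag" attribute instead of calling the unexported tools:::RdTags.

```r
# Hypothetical variant of getauthors() that keeps pages without \author.
get_authors_keep_missing <- function(package) {
  db <- tools::Rd_db(package)
  vapply(db, function(rd) {
    # top-level tag of each element; "" if the attribute is missing
    tags <- vapply(rd, function(el) {
      tag <- attr(el, "Rd_tag")
      if (is.null(tag)) "" else tag
    }, character(1))
    if (!"\\author" %in% tags) return(NA_character_)
    txt <- paste(unlist(unclass(rd)[tags == "\\author"]), collapse = "")
    # collapse newlines and runs of spaces, then trim both ends
    trimws(gsub("\\s+", " ", txt))
  }, character(1))
}
```

Calling get_authors_keep_missing("tools") should then give one entry per Rd file, with NA wherever the page has no author field.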
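Further parsing also becomes a bit tricky, because package authors seem to use this field in very different ways. devtools has only a single author field. car has a bunch, each of which contains an email address. And so on. But this gets you the available information, which you should be able to process further.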
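As a rough illustration of that further processing, the sketch below pulls email addresses out of author strings and splits what is left into individual names. The two sample strings are author fields of the kind these functions return; the email pattern is a deliberately loose assumption, not a full address validator, and free-form fields such as "adapted from ..." will not split cleanly.

```r
# Sample author strings of the kind returned from the help pages
authors <- c(
  bracketX = "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>.",
  beg2char = "Josh O'Brien, Justin Haynes and Tyler Rinker"
)

# Loose email pattern (an assumption, good enough for typical help pages)
email_pattern <- "[[:alnum:]._%+-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}"
emails <- regmatches(authors, gregexpr(email_pattern, authors))

# Remove "<...>" parts and a trailing period, then split on commas and "and"
names_only <- gsub("\\s*<[^>]*>", "", authors)
names_only <- trimws(sub("\\.\\s*$", "", names_only))
people <- strsplit(names_only, ",\\s*|\\s+and\\s+")
```

Here emails and people are lists with one element per help page, so a page with no email address simply yields an empty character vector.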
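Note: A previous version of this answer provided a solution for the case where you have the full path to an Rd file, but it did not work if you were trying to do this for an installed package. Following Tyler's suggestion, I have come up with a more complete solution.
Here is my approach, which uses some of the suggestions made by others: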
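package <- "qdap"
funs <- unclass(lsf.str(envir = asNamespace(package)))
out <- sapply(funs, function(x) {
    # render the help page of each function as plain text
    x <- try(capture.output(tools:::Rd2txt(utils:::.getHelpFile(as.character(help(x, help_type="text"))))))
    # section titles are rendered as underscore-backspace overstrike, hence the "\b"
    Auth_lines <- grep("_\bA_\bu_\bt_\bh_\bo_\br(_\bs):", x, fixed = TRUE)
    if (identical(Auth_lines, integer(0))) {
        return(NA)
    }
    # the author text sits two lines below the title; trim surrounding whitespace
    gsub("^\\s+|\\s+$", "", x[Auth_lines + 2])
})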
## To look at just the ones with author fields:
out[!sapply(out, is.na)]
## > out[!sapply(out, is.na)]
## beg2char
## "Josh O'Brien, Justin Haynes and Tyler Rinker"
## bracketX
## "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>."
## bracketXtract
## "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>."
## char2end
## "Josh O'Brien, Justin Haynes and Tyler Rinker"
## cm_df.transcript
## "DWin, Gavin Simpson and Tyler Rinker <tyler.rinker@gmail.com>."
## gantt
## "DigEmAll (<URL: stackoverflow.com>) and Tyler Rinker"
## gantt_wrap
## "Andrie de Vries and Tyler Rinker <tyler.rinker@gmail.com>."
## genX
## "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>."
## genXtract
## "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>."
## hash
## "Bryan Goodrich and Tyler Rinker <tyler.rinker@gmail.com>."
## name2sex
## "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>."
## read.transcript
## "Bryan Goodrich and Tyler Rinker <tyler.rinker@gmail.com>."
## sentCombine
## "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>."
## sentSplit
## "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>."
## TOT
## "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>."
## v.outer
## "Vincent Zoonekynd and Tyler Rinker <tyler.rinker@gmail.com>."
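The get_field_from_r_help call imagined in the question could be sketched along similar lines, by fetching the parsed Rd object for a topic and reading the tag directly instead of matching rendered text. This is only a sketch under some assumptions: it relies on the unexported utils:::.getHelpFile (which may change between R versions), it maps the field name to a tag by lower-casing it, and it looks only at the first help file found for the topic.

```r
# Hypothetical helper, named after the call imagined in the question.
get_field_from_r_help <- function(topic, field) {
  paths <- as.character(help(topic, help_type = "text"))
  if (length(paths) == 0L) return(NA_character_)  # no help page found
  # unexported helper: returns the parsed Rd object for an installed page
  rd <- utils:::.getHelpFile(paths[1L])
  tags <- vapply(rd, function(el) {
    tag <- attr(el, "Rd_tag")
    if (is.null(tag)) "" else tag
  }, character(1))
  wanted <- paste0("\\", tolower(field))  # "Description" -> "\\description"
  if (!wanted %in% tags) return(NA_character_)
  txt <- paste(unlist(unclass(rd)[tags == wanted]), collapse = "")
  trimws(gsub("\\s+", " ", txt))  # squeeze whitespace into single spaces
}
```

For example, get_field_from_r_help("lm", "Description") should return the description text of the lm help page as a single string, and get_field_from_r_help("lm", "Author") should return NA if that page has no author field.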