Accessing elements from R's Rd files?



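I would like to go through a package and find out who the authors mentioned in each function's help file are.

I looked for a function that extracts elements from R's help files and could not find one. The closest I could find is this post by Noam Ross.

Does such a function exist? (If not, I suppose I will hack Noam's code so that it parses the Rd file and locates the specific element I am interested in.)

Thanks, Tal.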

Potential code example:

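get_field_from_r_help(topic = "lm", field = "Description")
# output:

'lm' is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although 'aov' may provide a more convenient interface for these).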

Duncan Murdoch's document on parsing Rd files will be helpful, as will this SO post.
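As a minimal sketch of what a parsed Rd file looks like, the following writes a tiny hand-made Rd file to a temporary location and reads it back with `tools::parse_Rd`. The file contents and the name "Jane Doe" are made up for illustration. Each top-level element of the parsed object carries its section name in the `Rd_tag` attribute, which is what makes it possible to pick out a single field.

```r
# A tiny hand-written Rd file (hypothetical content, for illustration only)
rd_text <- c(
  "\\name{demo}",
  "\\alias{demo}",
  "\\title{A demo topic}",
  "\\description{Shows how Rd tags are stored.}",
  "\\author{Jane Doe}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_text, rd_file)

# Parse the file into a nested list of class "Rd"
rd <- tools::parse_Rd(rd_file)

# Every top-level element stores its tag in the "Rd_tag" attribute
tags <- vapply(rd, function(el) attr(el, "Rd_tag"), character(1))

# Keep only the \author section and flatten it to a single string
author <- trimws(paste(unlist(rd[tags == "\\author"]), collapse = ""))
```

Printing `tags` shows the section tags interleaved with `"TEXT"` entries for the newlines between sections.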

Based on those, you might try something like the following:

getauthors <- function(package){
    db <- tools::Rd_db(package)
    authors <- lapply(db,function(x) {
        # internal helper: the Rd_tag of each top-level element
        tags <- tools:::RdTags(x)
        if("\\author" %in% tags){
            # return a crazy list of results
            #out <- x[which(tags=="\\author")]
            # return something a little cleaner
            out <- paste(unlist(x[which(tags=="\\author")]),collapse="")
        }
        else
            out <- NULL
        invisible(out)
        })
    gsub("\n","",unlist(authors)) # further cleanup: strip newlines
}

Then we can run this on a package or two:

> getauthors("knitr")
                                                                                     d:/RCompile/CRANpkg/local/3.0/knitr/man/eclipse_theme.Rd 
                                                                                                                     "  Ramnath Vaidyanathan" 
                                                                                         d:/RCompile/CRANpkg/local/3.0/knitr/man/image_uri.Rd 
                                                                                                                    "  Wush Wu and Yihui Xie" 
                                                                                      d:/RCompile/CRANpkg/local/3.0/knitr/man/imgur_upload.Rd 
                                                                              "  Yihui Xie, adapted from the imguR package by Aaron  Statham" 
                                                                                          d:/RCompile/CRANpkg/local/3.0/knitr/man/knit2pdf.Rd 
                                                                                         "  Ramnath Vaidyanathan, Alex Zvoleff and Yihui Xie" 
                                                                                           d:/RCompile/CRANpkg/local/3.0/knitr/man/knit2wp.Rd 
                                                                                                          "  William K. Morris and Yihui Xie" 
                                                                                        d:/RCompile/CRANpkg/local/3.0/knitr/man/knit_theme.Rd 
                                                                                                       "  Ramnath Vaidyanathan and Yihui Xie" 
                                                                                     d:/RCompile/CRANpkg/local/3.0/knitr/man/knitr-package.Rd 
                                                                                                            "  Yihui Xie <http://yihui.name>" 
                                                                                        d:/RCompile/CRANpkg/local/3.0/knitr/man/read_chunk.Rd 
                      "  Yihui Xie; the idea of the second approach came from  Peter Ruckdeschel (author of the SweaveListingUtils  package)" 
                                                                                       d:/RCompile/CRANpkg/local/3.0/knitr/man/read_rforge.Rd 
                                                                                                          "  Yihui Xie and Peter Ruckdeschel" 
                                                                                           d:/RCompile/CRANpkg/local/3.0/knitr/man/rst2pdf.Rd 
                                                                                                               "  Alex Zvoleff and Yihui Xie" 
                                                                                              d:/RCompile/CRANpkg/local/3.0/knitr/man/spin.Rd 
"  Yihui Xie, with the original idea from Richard FitzJohn  (who named it as sowsear() which meant to make a  silk purse out of a sow's ear)" 

And maybe tools:

> getauthors("tools")
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/bibstyle.Rd 
                                                                "  Duncan Murdoch" 
                   D:/murdoch/recent/R64-3.0/src/library/tools/man/checkPoFiles.Rd 
                                                                "  Duncan Murdoch" 
                        D:/murdoch/recent/R64-3.0/src/library/tools/man/checkRd.Rd 
                                                  "  Duncan Murdoch, Brian Ripley" 
                     D:/murdoch/recent/R64-3.0/src/library/tools/man/getDepList.Rd 
                                                                   " Jeff Gentry " 
                      D:/murdoch/recent/R64-3.0/src/library/tools/man/HTMLlinks.Rd 
                                                    "Duncan Murdoch, Brian Ripley" 
            D:/murdoch/recent/R64-3.0/src/library/tools/man/installFoundDepends.Rd 
                                                                     "Jeff Gentry" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/makeLazyLoading.Rd 
                                                   "Luke Tierney and Brian Ripley" 
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/parse_Rd.Rd 
                                                                " Duncan Murdoch " 
                     D:/murdoch/recent/R64-3.0/src/library/tools/man/parseLatex.Rd 
                                                                  "Duncan Murdoch" 
                        D:/murdoch/recent/R64-3.0/src/library/tools/man/Rd2HTML.Rd 
                                                  "  Duncan Murdoch, Brian Ripley" 
                 D:/murdoch/recent/R64-3.0/src/library/tools/man/Rd2txt_options.Rd 
                                                                  "Duncan Murdoch" 
                   D:/murdoch/recent/R64-3.0/src/library/tools/man/RdTextFilter.Rd 
                                                                "  Duncan Murdoch" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/SweaveTeXFilter.Rd 
                                                                  "Duncan Murdoch" 
                       D:/murdoch/recent/R64-3.0/src/library/tools/man/texi2dvi.Rd 
                     "  Originally Achim Zeileis but largely rewritten by R-core." 
                  D:/murdoch/recent/R64-3.0/src/library/tools/man/tools-package.Rd 
"  Kurt Hornik and Friedrich Leisch  Maintainer: R Core Team R-core@r-project.org" 
                D:/murdoch/recent/R64-3.0/src/library/tools/man/vignetteDepends.Rd 
                                                                   " Jeff Gentry " 
                 D:/murdoch/recent/R64-3.0/src/library/tools/man/vignetteEngine.Rd 
                                            "Duncan Murdoch and Henrik Bengtsson." 
                  D:/murdoch/recent/R64-3.0/src/library/tools/man/writePACKAGES.Rd 
                                                        "  Uwe Ligges and R-core."

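Some functions have no author field, so they are simply dropped when `unlist` is called at the end of `getauthors`. The code could be modified slightly to return a placeholder such as NA for them instead (a NULL would again be dropped by `unlist`).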
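One way to keep the topics without an author field is sketched below. The function name `get_authors_keep_missing` and the two demo Rd files are hypothetical, and the function takes a list of parsed Rd objects (such as the one returned by `tools::Rd_db`) rather than a package name, so that it can be tried without any particular package installed.

```r
# Tag of each top-level element of a parsed Rd object
rd_tags <- function(rd) vapply(rd, function(el) attr(el, "Rd_tag"), character(1))

# Hypothetical variant of getauthors(): returns NA instead of dropping topics
get_authors_keep_missing <- function(db) {
  vapply(db, function(rd) {
    tags <- rd_tags(rd)
    if (!"\\author" %in% tags) return(NA_character_)
    txt <- paste(unlist(rd[tags == "\\author"]), collapse = "")
    trimws(gsub("\\s+", " ", txt))  # collapse newlines and repeated spaces
  }, character(1))
}

# Demo on two made-up Rd files, one with and one without an author
make_rd <- function(lines) {
  f <- tempfile(fileext = ".Rd")
  writeLines(lines, f)
  tools::parse_Rd(f)
}
db <- list(
  with_author.Rd = make_rd(c("\\name{a}", "\\title{A}", "\\author{Jane Doe}")),
  no_author.Rd   = make_rd(c("\\name{b}", "\\title{B}"))
)
authors <- get_authors_keep_missing(db)
```

For an installed package, something like `get_authors_keep_missing(tools::Rd_db("knitr"))` should give one entry per Rd file, with NA where no author is listed.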

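Also, further parsing gets a bit difficult, because package authors seem to use this field in very different ways. devtools has only one author field. Some other packages have a bunch, each containing an email address. And so on. But this gets you the information that is available, and you should be able to process it further from there.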
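As an illustration of such further processing, here is a rough sketch that pulls email addresses out of author strings. The three input strings are copied from the outputs shown in this thread. The regular expression is a simple assumption about what an address looks like, not a full validator.

```r
# Author strings taken from the outputs shown in this thread
authors <- c(
  "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>.",
  "Yihui Xie <http://yihui.name>",
  "Duncan Murdoch, Brian Ripley"
)

# Deliberately simple pattern: local part, "@", domain, dot, top-level domain
email_pattern <- "[[:alnum:]._%+-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}"

# One character vector of matches per author string (possibly empty)
emails <- regmatches(authors, gregexpr(email_pattern, authors))
```

Strings without an address, such as a bare list of names or a URL, yield an empty character vector, so the result keeps the same length as the input.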

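Note: the previous version of this answer provided a solution for the case where you have the full path to the Rd file, but it did not work for installed packages. Following Tyler's suggestion, I worked out a more complete solution.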

Here is my approach, using some of the suggestions put forward by others:

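package <- "qdap"
# names of all functions in the package namespace
funs <- unclass(lsf.str(envir = asNamespace(package)))
out <- sapply(funs, function(x) {
    # render the help page for this function as plain text
    x <- try(capture.output(tools:::Rd2txt(utils:::.getHelpFile(as.character(help(x, help_type="text"))))))
    # section titles are underlined with "_\b" (underscore + backspace) overstrikes
    Auth_lines <- grep("_\bA_\bu_\bt_\bh_\bo_\br(_\bs):", x, fixed = TRUE) 
    if (identical(Auth_lines, integer(0))) {
        return(NA)
    }
    # the author text sits two lines below the title; trim surrounding whitespace
    gsub("^\\s+|\\s+$", "", x[Auth_lines +2])
})
## To look at just the ones with author fields:
out[!sapply(out, is.na)]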
## > out[!sapply(out, is.na)]
##                                                         beg2char 
##                   "Josh O'Brien, Justin Haynes and Tyler Rinker" 
##                                                         bracketX 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                    bracketXtract 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                         char2end 
##                   "Josh O'Brien, Justin Haynes and Tyler Rinker" 
##                                                 cm_df.transcript 
## "DWin, Gavin Simpson and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                            gantt 
##           "DigEmAll (<URL: stackoverflow.com>) and Tyler Rinker" 
##                                                       gantt_wrap 
##     "Andrie de Vries and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                             genX 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                        genXtract 
##       "Martin Morgan and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                             hash 
##      "Bryan Goodrich and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                         name2sex 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                  read.transcript 
##      "Bryan Goodrich and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                      sentCombine 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                        sentSplit 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                              TOT 
##    "Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>." 
##                                                          v.outer 
##   "Vincent Zoonekynd and Tyler Rinker <tyler.rinker@gmail.com>." 
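The same idea can be generalized to any field, along the lines of the `get_field_from_r_help` call sketched in the question. The following is only a sketch under assumptions: the name `get_field_from_rd_db` is hypothetical, the function works on a list of parsed Rd objects (as returned by `tools::Rd_db`), and it locates the topic through its `\alias` entries. The demo Rd file is made up so that the example runs without any extra package.

```r
# Tag of each top-level element of a parsed Rd object
rd_tags <- function(rd) vapply(rd, function(el) attr(el, "Rd_tag"), character(1))

# Hypothetical helper: text of one field for the Rd file documenting `topic`
get_field_from_rd_db <- function(db, topic, field) {
  for (rd in db) {
    tags <- rd_tags(rd)
    aliases <- trimws(unlist(rd[tags == "\\alias"]))
    if (topic %in% aliases) {
      hit <- rd[tags == paste0("\\", field)]
      if (length(hit) == 0) return(NA_character_)  # topic found, field absent
      txt <- paste(unlist(hit), collapse = "")
      return(trimws(gsub("\\s+", " ", txt)))
    }
  }
  NA_character_  # no Rd file documents this topic
}

# Demo on a made-up Rd file
f <- tempfile(fileext = ".Rd")
writeLines(c(
  "\\name{lm}",
  "\\alias{lm}",
  "\\title{Fitting Linear Models}",
  "\\description{\\code{lm} is used to fit linear models.}"
), f)
db <- list(lm.Rd = tools::parse_Rd(f))

desc <- get_field_from_rd_db(db, "lm", "description")
auth <- get_field_from_rd_db(db, "lm", "author")
```

For an installed package the call would look like `get_field_from_rd_db(tools::Rd_db("stats"), "lm", "description")`. Note that field names are the lower-case Rd tags (`description`, `author`), and that markup inside a field is flattened to plain text.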
