Can you print more than 11 covariates for summary.estimateEffect?

I built an stm topic model and I'm having a problem with summary.estimateEffect. My data cover about 150 days, but it only prints regression estimates for 10 days.

parlPrevFit<- stm(document = out$documents, vocab = out$vocab, K = 0, prevalence =~s(day),
max.em.its = 150, data = out$meta, init.type = "Spectral")
prep<- estimateEffect(c(14, 40, 5, 41)~s(day), parlPrevFit, meta = meta, uncertainty = "Global")
summary(prep, topics = c(14, 40, 5, 41))

Topic 14 coefficients: https://prnt.sc/105pg1a

Can anyone suggest how to print estimates for more than 10 days?

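Instead of relying on summary(), whose printed output you can't control, load the tidytext package and use tidy().

Let's walk through an example where we train a topic model on Jane Austen's novels, with each chapter as a document: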

library(tidyverse)
library(tidytext)
library(stm)
#> stm v1.3.6 successfully loaded. See ?stm for help. 
#>  Papers, resources, and other materials at structuraltopicmodel.com
library(janeaustenr)
books <- austen_books() %>%
  group_by(book) %>%
  mutate(chapter = cumsum(str_detect(text, regex("^chapter ", ignore_case = TRUE)))) %>%
  ungroup() %>%
  filter(chapter > 0) %>%
  unite(document, book, chapter, remove = FALSE)
austen_sparse <- books %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words) %>%
  count(document, word) %>%
  cast_sparse(document, word, n)
#> Joining, by = "word"

Let's train a topic model with 6 topics (there are 6 books):

topic_model <- stm(
  austen_sparse,
  K = 6,
  init.type = "Spectral",
  verbose = FALSE
)

Now let's create the dataset that estimateEffect() will use:

chapters <- books %>%
  group_by(document) %>%
  summarize(text = str_c(text, collapse = " ")) %>%
  ungroup() %>%
  inner_join(books %>%
    distinct(document, book))
#> Joining, by = "document"
chapters
#> # A tibble: 269 x 3
#>    document text                                                           book 
#>    <chr>    <chr>                                                          <fct>
#>  1 Emma_1   "CHAPTER I   Emma Woodhouse, handsome, clever, and rich, with… Emma 
#>  2 Emma_10  "CHAPTER X   Though now the middle of December, there had yet… Emma 
#>  3 Emma_11  "CHAPTER XI   Mr. Elton must now be left to himself. It was n… Emma 
#>  4 Emma_12  "CHAPTER XII   Mr. Knightley was to dine with them--rather ag… Emma 
#>  5 Emma_13  "CHAPTER XIII   There could hardly be a happier creature in t… Emma 
#>  6 Emma_14  "CHAPTER XIV   Some change of countenance was necessary for e… Emma 
#>  7 Emma_15  "CHAPTER XV   Mr. Woodhouse was soon ready for his tea; and w… Emma 
#>  8 Emma_16  "CHAPTER XVI   The hair was curled, and the maid sent away, a… Emma 
#>  9 Emma_17  "CHAPTER XVII   Mr. and Mrs. John Knightley were not detained… Emma 
#> 10 Emma_18  "CHAPTER XVIII   Mr. Frank Churchill did not come. When the t… Emma 
#> # … with 259 more rows

Now let's estimate regressions from our topic model, for the first three topics and our chapters dataset of documents:

effects <- estimateEffect(1:3 ~ book, topic_model, chapters)
summary(effects)
#> 
#> Call:
#> estimateEffect(formula = 1:3 ~ book, stmobj = topic_model, metadata = chapters)
#> 
#> 
#> Topic 1:
#> 
#> Coefficients:
#>                        Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)            0.018033   0.023726   0.760    0.448    
#> bookPride & Prejudice  0.799555   0.037140  21.528   <2e-16 ***
#> bookMansfield Park    -0.006387   0.032662  -0.196    0.845    
#> bookEmma               0.003188   0.033393   0.095    0.924    
#> bookNorthanger Abbey   0.002535   0.039017   0.065    0.948    
#> bookPersuasion         0.025725   0.044281   0.581    0.562    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> 
#> Topic 2:
#> 
#> Coefficients:
#>                        Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)            0.015289   0.016478   0.928    0.354    
#> bookPride & Prejudice  0.001785   0.023489   0.076    0.939    
#> bookMansfield Park     0.001616   0.024664   0.066    0.948    
#> bookEmma               0.892516   0.037833  23.591   <2e-16 ***
#> bookNorthanger Abbey   0.006032   0.031530   0.191    0.848    
#> bookPersuasion        -0.001142   0.030052  -0.038    0.970    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> 
#> Topic 3:
#> 
#> Coefficients:
#>                         Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)            0.0196151  0.0225115   0.871   0.3844    
#> bookPride & Prejudice -0.0004909  0.0286302  -0.017   0.9863    
#> bookMansfield Park     0.0148960  0.0341272   0.436   0.6628    
#> bookEmma              -0.0004006  0.0301741  -0.013   0.9894    
#> bookNorthanger Abbey   0.8730570  0.0457994  19.063   <2e-16 ***
#> bookPersuasion         0.1030537  0.0495148   2.081   0.0384 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

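This example doesn't have the printing limitation you mention, but you can avoid any such problem by using tidy(), which gives you the actual contents of the regressions: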

tidy(effects)
#> # A tibble: 18 x 6
#>    topic term                   estimate std.error statistic  p.value
#>    <int> <chr>                     <dbl>     <dbl>     <dbl>    <dbl>
#>  1     1 (Intercept)            0.0179      0.0238    0.753  4.52e- 1
#>  2     1 bookPride & Prejudice  0.799       0.0373   21.4    1.09e-59
#>  3     1 bookMansfield Park    -0.00614     0.0325   -0.189  8.50e- 1
#>  4     1 bookEmma               0.00350     0.0336    0.104  9.17e- 1
#>  5     1 bookNorthanger Abbey   0.00323     0.0394    0.0820 9.35e- 1
#>  6     1 bookPersuasion         0.0253      0.0443    0.571  5.68e- 1
#>  7     2 (Intercept)            0.0153      0.0165    0.925  3.56e- 1
#>  8     2 bookPride & Prejudice  0.00165     0.0234    0.0707 9.44e- 1
#>  9     2 bookMansfield Park     0.00167     0.0246    0.0680 9.46e- 1
#> 10     2 bookEmma               0.892       0.0381   23.4    2.84e-66
#> 11     2 bookNorthanger Abbey   0.00606     0.0317    0.191  8.49e- 1
#> 12     2 bookPersuasion        -0.00107     0.0298   -0.0359 9.71e- 1
#> 13     3 (Intercept)            0.0197      0.0228    0.864  3.89e- 1
#> 14     3 bookPride & Prejudice -0.000835    0.0288   -0.0290 9.77e- 1
#> 15     3 bookMansfield Park     0.0147      0.0342    0.428  6.69e- 1
#> 16     3 bookEmma              -0.000707    0.0305   -0.0232 9.82e- 1
#> 17     3 bookNorthanger Abbey   0.873       0.0461   18.9    4.93e-51
#> 18     3 bookPersuasion         0.103       0.0496    2.08   3.85e- 2

Created on 2021-02-26 by the reprex package (v1.0.0)
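
One caveat worth adding: tidy() returns a tibble, and a tibble with more than 20 rows prints only its first 10 by default, so a long coefficient table can still look truncated. Below is a minimal sketch of ways around that. The `coefs` tibble is a hypothetical stand-in with made-up values and term names, used only so the snippet runs on its own. In real use you would replace it with tidy(effects), or tidy(prep) for the model in the question.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for tidy(effects): made-up estimates and term names,
# with the same topic/term/estimate columns as the tibble that tidy() returns.
coefs <- tibble(
  topic = rep(1:3, each = 12),
  term = rep(c("(Intercept)", paste0("term_", 1:11)), times = 3),
  estimate = seq(-0.05, 0.05, length.out = 36)
)

# Ask the tibble print method for every row instead of the default preview
print(coefs, n = Inf)

# Keep the coefficients for a single topic
topic_1 <- filter(coefs, topic == 1)
print(topic_1, n = Inf)

# A plain data frame prints all of its rows
# (up to the limit in getOption("max.print"))
as.data.frame(topic_1)
```

Because the result is an ordinary data frame, you can also write it to disk with write.csv() and inspect all the coefficients outside of R.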
