Object not found when running cor()



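I am trying to find the correlation between two columns (Sunshine_in_hours and AgeGroup_30_to_34) of a combined dataset in R. However, every time I run the cor() function, I end up with this error: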

Error in pmatch(use, c("all.obs", "complete.obs", "pairwise.complete.obs",  : 
  object 'AgeGroup_30_to_34' not found

Here is a snippet of the data (the dput() output of its head):

structure(list(Date = structure(c(18659, 18660, 18661, 18663, 
18665, 18666, 18667, 18668, 18669, 18670, 18671, 18673, 18674, 
18675, 18676, 18677, 18678, 18679, 18680, 18681, 18682, 18683, 
18684, 18685, 18686, 18687, 18688, 18689, 18690, 18691), class = "Date"), 
    Year = c(2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 
    2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 
    2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 
    2021, 2021), Month = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3), AgeGroup_30_to_34 = c(0, 
    0, 0, 2, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 
    2, 0, 0, 1, 2, 0, 3, 0, 0, 0), Sunshine_in_hours = c(1.6, 
    3.4, 13.1, 8.9, 2, 1.7, 12.7, 11.6, 5.5, 5.6, 4.9, 9.2, 8.3, 
    11.9, 12.4, 12.4, 5.9, 0, 6.3, 8.5, 9.9, 8.7, 6.3, 1, 9.2, 
    6.3, 1.4, 2.1, 2.6, 3.6), City = c("Melbourne", "Melbourne", 
    "Melbourne", "Melbourne", "Melbourne", "Melbourne", "Melbourne", 
    "Melbourne", "Melbourne", "Melbourne", "Melbourne", "Melbourne", 
    "Melbourne", "Melbourne", "Melbourne", "Melbourne", "Melbourne", 
    "Melbourne", "Melbourne", "Melbourne", "Melbourne", "Melbourne", 
    "Melbourne", "Melbourne", "Melbourne", "Melbourne", "Melbourne", 
    "Melbourne", "Melbourne", "Melbourne")), row.names = c(NA, 
-30L), class = c("tbl_df", "tbl", "data.frame"))

This is the code I tried to run:

Combined <- inner_join(covidS, weatherS, by = 'Date')%>%
  mutate(Date = mdy(Date),
         Year = year(Date),
         Month = month(Date),
         Day = day(Date))%>%
  select(Date, Year, Month, AgeGroup_30_to_34, Sunshine_in_hours, City)%>%
  filter(City == 'Melbourne')%>%
  cor(Sunshine_in_hours, AgeGroup_30_to_34 )

I have looked for tutorials on how to do this, but I keep hitting a wall. Any help would be appreciated.

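cor() takes two data inputs (x and y), but through the pipe you are giving it three arguments, two of which it cannot resolve. Try this: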

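library(dplyr)      # inner_join(), mutate(), select(), filter(), %>%
library(lubridate)  # mdy(), year(), month(), day()

# Build the filtered data frame first and stop the pipeline there
Combined <- inner_join(covidS, weatherS, by = 'Date') %>%
  mutate(Date = mdy(Date),
         Year = year(Date),
         Month = month(Date),
         Day = day(Date)) %>%
  select(Date, Year, Month, AgeGroup_30_to_34, Sunshine_in_hours, City) %>%
  filter(City == 'Melbourne')

# Then pass the two columns to cor() explicitly
corr <- cor(Combined$Sunshine_in_hours, Combined$AgeGroup_30_to_34)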

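Remember that when you use the pipe, the object on its left-hand side is passed as the first argument of the function you call. In this case, your code is equivalent to: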

cor(inner_join(covidS, weatherS, by = 'Date')%>%
  mutate(Date = mdy(Date),
         Year = year(Date),
         Month = month(Date),
         Day = day(Date))%>%
  select(Date, Year, Month, AgeGroup_30_to_34, Sunshine_in_hours, City)%>%
  filter(City == 'Melbourne'),
Sunshine_in_hours, AgeGroup_30_to_34 )

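So neither Sunshine_in_hours nor AgeGroup_30_to_34 means anything to cor(), because the function has no way of knowing that they are columns of the data frame. Positionally, the data frame becomes x, Sunshine_in_hours becomes y, and AgeGroup_30_to_34 becomes use, which is why the error is raised from pmatch(use, ...). The underlying issue is that cor() is a base R function, while the rest of the pipeline is dplyr, and the two follow different paradigms: dplyr verbs look column names up inside the data, whereas cor() looks them up in the calling environment. When in doubt, always check the documentation.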
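The argument matching described above can be reproduced on a tiny example. This is only a sketch: the data frame toy and its columns sun and cases are made up for illustration and are not part of the original data.

```r
# Toy data frame; the names and values are hypothetical, for illustration only.
toy <- data.frame(sun = c(1.5, 3.0, 7.2, 9.1), cases = c(0, 1, 0, 2))

# toy %>% cor(sun, cases) is the call cor(toy, sun, cases):
# x = toy, y = sun, use = cases.
# cor() looks for these names in the calling environment, where they do not exist.
piped_equivalent <- tryCatch(
  cor(toy, sun, cases),
  error = function(e) conditionMessage(e)
)
print(piped_equivalent)  # an error message rather than a correlation

# with() evaluates the expression using the data frame's columns as variables.
r_with <- with(toy, cor(sun, cases))

# Extracting the columns explicitly gives the same value.
r_dollar <- cor(toy$sun, toy$cases)
print(c(r_with, r_dollar))
```

Both with() and the $ form are base R, so they work without loading any package.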

Using magrittr's exposition pipe %$% instead of %>%, you can do this:

library(magrittr)
dat %$%
  cor(Sunshine_in_hours, AgeGroup_30_to_34)
#> [1] -0.0006941058
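If you would rather stay within dplyr, one possible alternative is to compute the correlation inside summarise(), where column names are resolved against the data. The sketch below assumes dplyr is installed and uses only the first five rows of the sample data from the question, re-typed by hand, so its result will differ from the value shown for the full data.

```r
library(dplyr)

# First five rows of the question's sample data (a hand-typed subset, not the full set).
dat <- data.frame(
  AgeGroup_30_to_34 = c(0, 0, 0, 2, 1),
  Sunshine_in_hours = c(1.6, 3.4, 13.1, 8.9, 2),
  City = "Melbourne"
)

# summarise() evaluates cor() with the columns of the piped data frame in scope.
result <- dat %>%
  filter(City == "Melbourne") %>%
  summarise(r = cor(Sunshine_in_hours, AgeGroup_30_to_34))

# pull() extracts the single correlation value from the one-row result.
r_value <- pull(result, r)
print(r_value)
```

The result is a one-row data frame rather than a bare number, which is convenient if you later add group_by() to get one correlation per city.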
