r - Subsetting specific columns and rows from a data.frame - error message "unexpected symbol in..."

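I am a beginner learning how to subset specific rows and columns from a dataset in R. When I try to select specific columns, I get the following error message: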

library(dplyr)
library(tibble)
select(state.x77, Income, HS Grad)
Error: unexpected symbol in "select(state.x77, Income, HS Grad"

I don't understand which symbol in that line of code is incorrect.

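Also, if in addition to selecting certain columns (variables) I want to filter for a particular state, how do I use the filter function when the state names are stored as row names? When I try: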

rownames_to_column(state.x77, var = "State")

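it creates a column called State for the state names, but when I go to view state.x77, the change does not seem to be permanent (so I cannot use the filter function).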

I apologize, I am a beginner. Any help would be greatly appreciated.

Thanks.

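There are two problems. First, state.x77 is a matrix, so you need to convert it to a data frame, because the select function from the dplyr package works on data frames, not matrices. Second, if a column name contains a space, you need to wrap the name in backticks (`) or double quotes ("). The "unexpected symbol" error comes from that space: R reads HS and Grad as two separate names with nothing between them.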

# Load package
library(dplyr)
# Show the class of state.x77
class(state.x77)
# [1] "matrix" "array"   (R >= 4.0.0; older versions print only "matrix")
# Convert state.x77 to a data frame
state.x77_df <- as.data.frame(state.x77)
# Show the class of state.x77_df
class(state.x77_df)
# [1] "data.frame"
# Select Income and `HS Grad` columns
# All the following will work
select(state.x77_df, Income, `HS Grad`)
select(state.x77_df, "Income", "HS Grad")
select(state.x77_df, c("Income", "HS Grad"))
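
To see why the original call fails before dplyr is even involved, the two spellings can be handed to R's parser directly. This is a minimal sketch using only base R; the variable names `bad`, `good`, `bad_result` and `good_result` are made up for illustration.

```r
# Base R only: ask the parser about both spellings without evaluating them.
bad  <- 'select(state.x77_df, Income, HS Grad)'
good <- 'select(state.x77_df, Income, `HS Grad`)'

# parse() fails on the unquoted name because HS and Grad are read as two symbols.
bad_result <- tryCatch(parse(text = bad), error = function(e) e)
inherits(bad_result, "error")   # TRUE: this is a syntax error, not a dplyr error

# With backticks the same call parses as a single expression.
good_result <- parse(text = good)
length(good_result)
```

Because the failure happens at parse time, it would occur with any function, not just select.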

For the second problem, you have to save the output back to an object.

library(tibble)
state.x77_df <- rownames_to_column(state.x77_df,  var = "State")
head(state.x77_df) 
       State Population Income Illiteracy Life Exp Murder HS Grad Frost   Area
1    Alabama       3615   3624        2.1    69.05   15.1    41.3    20  50708
2     Alaska        365   6315        1.5    69.31   11.3    66.7   152 566432
3    Arizona       2212   4530        1.8    70.55    7.8    58.1    15 113417
4   Arkansas       2110   3378        1.9    70.66   10.1    39.9    65  51945
5 California      21198   5114        1.1    71.71   10.3    62.6    20 156361
6   Colorado       2541   4884        0.7    72.06    6.8    63.9   166 103766
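
Once the State column has been saved, the filter function the question asks about can be combined with select. The following is a sketch rather than the only way to do it; it assumes dplyr and tibble are installed, and the object names `states` and `arizona` are chosen here for illustration.

```r
library(dplyr)
library(tibble)

# state.x77 ships with base R (datasets package) as a numeric matrix.
states <- state.x77 %>%
  as.data.frame() %>%                 # keeps names such as "HS Grad" intact
  rownames_to_column(var = "State")   # move the row names into a real column

# Keep one state, then only the columns of interest.
arizona <- states %>%
  filter(State == "Arizona") %>%
  select(State, Income, `HS Grad`)

arizona
```

Assigning the result to `states` is what makes the new column stick; calling rownames_to_column without an assignment only prints a modified copy.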

# Convert state.x77 to a data frame and move the row names into a State column.
# Note: data.frame() changes "HS Grad" to "HS.Grad" (spaces become dots).
df <- tibble::rownames_to_column(data.frame(state.x77), var = "State")
## You can select columns by name or by index
# by column name
col_names <- c("Income", "HS.Grad")
df[, col_names]
# by column index
col_index <- c(3, 7)
df[, col_index]
# Filter (subset) the data by state
subset(df, State == "Arizona")
    State Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
3 Arizona       2212   4530        1.8    70.55    7.8    58.1    15 113417
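
The two answers end up with different column names (`HS Grad` versus HS.Grad) because data.frame() checks names by default, while as.data.frame() on a matrix leaves them alone. A short base R sketch of the difference, with illustrative variable names:

```r
# data.frame() runs make.names() on column names by default (check.names = TRUE).
mangled <- names(data.frame(state.x77))
mangled

# as.data.frame() on a matrix keeps the original names, spaces included.
kept <- names(as.data.frame(state.x77))
kept

# data.frame() can be told to keep them as well.
kept_too <- names(data.frame(state.x77, check.names = FALSE))
identical(kept, kept_too)
```

Whichever conversion is used, later code has to refer to the columns by the names that conversion produced.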
