R - Can't subset columns that don't exist



I am trying to rename some columns and drop others that are not relevant to this use case.

Data source:

data <- read.csv("data/building_permits.csv")

Inspecting the data:

colnames(data)

Column names of the dataset:

[1] "Permit.Number"                                                 
[2] "Permit.Type"                                                   
[3] "Permit.Type.Definition"                                        
[4] "Permit.Creation.Date"                                          
[5] "Block"                                                         
[6] "Lot"                                                           
[7] "Street.Number"                                                 
[8] "Street.Number.Suffix"                                          
[9] "Street.Name"                                                   
[10] "Street.Suffix"                                                 
[11] "Unit"                                                          
[12] "Unit.Suffix"                                                   
[13] "Description"                                                   
[14] "Current.Status"                                                
[15] "Current.Status.Date"                                           
[16] "Filed.Date"                                                    
[17] "Issued.Date"                                                   
[18] "Completed.Date"                                                
[19] "First.Construction.Document.Date"                              
[20] "Structural.Notification"                                       
[21] "Number.of.Existing.Stories"                                    
[22] "Number.of.Proposed.Stories"                                    
[23] "Voluntary.Soft.Story.Retrofit"                                 
[24] "Fire.Only.Permit"                                              
[25] "Permit.Expiration.Date"                                        
[26] "Estimated.Cost"                                                
[27] "Revised.Cost"                                                  
[28] "Existing.Use"                                                  
[29] "Existing.Units"                                                
[30] "Proposed.Use"                                                  
[31] "Proposed.Units"                                                
[32] "Plansets"                                                      
[33] "TIDF.Compliance"                                               
[34] "Existing.Construction.Type"                                    
[35] "Existing.Construction.Type.Description"                        
[36] "Proposed.Construction.Type"                                    
[37] "Proposed.Construction.Type.Description"                        
[38] "Site.Permit"                                                   
[39] "Supervisor.District"                                           
[40] "Neighborhoods...Analysis.Boundaries"                           
[41] "Zipcode"                                                       
[42] "Location"                                                      
[43] "Record.ID"                                                     
[44] "SF.Find.Neighborhoods"                                         
[45] "Current.Police.Districts"                                      
[46] "Current.Supervisor.Districts"                                  
[47] "Analysis.Neighborhoods"                                        
[48] "DELETE...Zip.Codes"                                            
[49] "DELETE...Fire.Prevention.Districts"                            
[50] "DELETE...Supervisor.Districts"                                 
[51] "DELETE...Current.Police.Districts"                             
[52] "DELETE...Supervisorial_Districts_Waterline_data_from_7pkg_wer3"

Number of column names:

length(colnames(data))

[1] 52

Dropping columns:

colremove = c("First Construction Document Date",
"Structural Notification",
"Number of Existing Stories",
"Number of Proposed Stories",
"Voluntary Soft Story Retrofit",
"Fire Only Permit","Existing Units",
"Proposed Units","Plansets",
"TIDF Compliance","Existing Construction Type",
"Proposed Construction Type","Site Permit",
"Supervisor District","Current Police Districts",
"Current Supervisor Districts",
"Current Status Date", "Permit Creation Date",
"Analysis Neighborhoods","Lot","Location",
"SF Find Neighborhoods","Unit","Block", "Permit Type",
"Unit Suffix","Street Number Suffix",
"Existing Construction Type Description")
data <- data[colnames(data)[1:47]] %>% select(-all_of(colremove))

This is where the error appears:

Error: Can't subset columns that don't exist.
x Columns `First Construction Document Date`, `Structural Notification`, `Number of Existing Stories`, `Number of Proposed Stories`, `Voluntary Soft Story Retrofit`, etc. don't exist.

If you want to keep using dplyr, the selection helper you are looking for is any_of(), not all_of().
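
One caveat worth checking: the names in colremove are written with spaces, while read.csv() has converted the real column names to use dots (its default check.names = TRUE behaviour). With any_of() alone the error goes away, but no names match, so nothing is dropped. A minimal sketch of one way to handle both points follows; the toy data frame `permits`, the entry "Not A Column", and the variable `colremove_fixed` are illustrative assumptions, not part of the original post.

```r
library(dplyr)

# Toy stand-in for the permits data, using the dotted names read.csv() produces.
permits <- data.frame(
  Permit.Number = c("A1", "B2"),
  Lot = c("001", "002"),
  Fire.Only.Permit = c("Y", NA)
)

# Names written with spaces, as in the question, plus one that does not exist at all.
colremove <- c("Fire Only Permit", "Lot", "Not A Column")

# make.names() applies the same syntactic-name conversion that read.csv() uses by default.
colremove_fixed <- make.names(colremove)

# any_of() skips names that are absent; all_of() would stop with an error on "Not.A.Column".
trimmed <- permits %>% select(-any_of(colremove_fixed))

names(trimmed)
```

Whether to prefer any_of() or all_of() depends on intent: all_of() is stricter and surfaces typos in the name list, while any_of() tolerates missing columns.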

I have solved the problem I was facing.

data <- data[1:47,!(names(data) %in% colremove)]

This drops the columns and assigns the result back to the original data set.
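
Note that in data[i, j] the first index selects rows, so 1:47 in that position keeps the first 47 rows rather than the first 47 columns, and the %in% test only matches if the names in colremove use the same dotted form as names(data). A base R sketch that limits the columns first and then drops by name might look like the following; the toy data frame `permits`, its column `Extra.Column`, and the helper variable `first_cols` are assumptions made for illustration.

```r
# Toy data frame; three leading columns stand in for columns 1:47 of the real data.
permits <- data.frame(
  Permit.Number = c("A1", "B2", "C3"),
  Lot = c("001", "002", "003"),
  Fire.Only.Permit = c("Y", NA, "Y"),
  Extra.Column = c(1, 2, 3)
)

# Convert the space-separated names to the dotted form used by read.csv().
colremove <- make.names(c("Fire Only Permit", "Lot"))

# Column index goes in the second position; all rows are kept.
first_cols <- permits[, 1:3, drop = FALSE]

# Drop unwanted columns by name; drop = FALSE keeps a data frame even if one column remains.
trimmed <- first_cols[, !(names(first_cols) %in% colremove), drop = FALSE]

dim(trimmed)
```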
