split() by 2 variables instead of 1 in R?



I would like to know whether the split function can organize things by 2 variables instead of just 1.

Here is the current code.

holders <- split(z_combined_cost_dtrmnt, z_combined_cost_dtrmnt$val_lvl2 )
holders <- lapply(holders, function(x) x[!x$episode_count <= 3 | is.na(x$episode_count),])
holders <- lapply(holders, function(x){
x$prd_num_of_days_num <- remove_outliers(x$prd_num_of_days_num)
return(x) })
z_combined_cost_dtrmnt <- do.call(rbind, holders)
z_combined_cost_dtrmnt <-subset(z_combined_cost_dtrmnt, !is.na(z_combined_cost_dtrmnt$prd_num_of_days_num))

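This runs fine right now, but I just learned that I actually need to group by both val_lvl2 and val_lvl3 to get the unique values of the data before I can go any further. So what I am trying to do is essentially this: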

holders <- split(z_combined_cost_dtrmnt, z_combined_cost_dtrmnt$val_lvl2 & z_combined_cost_dtrmnt$val_lvl3 )

This does not run in my compiler, but I am wondering whether it can be done some other way.

Current output:

Upper GI Endoscopy with Biopsy                                            :'data.frame':     292 obs. of  22 variables:
..$ mcp_cat_name                 : chr [1:292] "Digestive Conditions" "Digestive Conditions" "Digestive Conditions" "Digestive Conditions" ...
..$ pln_name                     : chr [1:292] "AR" "AR" "AR" "AR" ...
..$ hosp_refl_rgn_name           : chr [1:292] "Fort Smith, AR" "Fort Smith, AR" "Jonesboro, AR" "Jonesboro, AR" ...
..$ val_lvl1                     : chr [1:292] "Endoscopic Procedures" "Endoscopic Procedures" "Endoscopic Procedures" "Endoscopic Procedures" ...
..$ val_lvl2                     : chr [1:292] "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" ...
..$ val_lvl3                     : chr [1:292] "Outpatient Hospital" "Surgical Center" "Outpatient Hospital" "Surgical Center" ...

Expected output:

Upper GI Endoscopy with Biopsy                                            :'data.frame':     146 obs. of  22 variables:
..$ mcp_cat_name                 : chr [1:146] "Digestive Conditions" "Digestive Conditions" "Digestive Conditions" "Digestive Conditions" ...
..$ pln_name                     : chr [1:146] "AR" "AR" "AR" "AR" ...
..$ hosp_refl_rgn_name           : chr [1:146] "Fort Smith, AR" "Fort Smith, AR" "Jonesboro, AR" "Jonesboro, AR" ...
..$ val_lvl1                     : chr [1:146] "Endoscopic Procedures" "Endoscopic Procedures" "Endoscopic Procedures" "Endoscopic Procedures" ...
..$ val_lvl2                     : chr [1:146] "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" ...
..$ val_lvl3                     : chr [1:146] "Outpatient Hospital" "Outpatient Hospital" "Outpatient Hospital" ...

Upper GI Endoscopy with Biopsy                                            :'data.frame':     146 obs. of  22 variables:
..$ mcp_cat_name                 : chr [1:146] "Digestive Conditions" "Digestive Conditions" "Digestive Conditions" "Digestive Conditions" ...
..$ pln_name                     : chr [1:146] "AR" "AR" "AR" "AR" ...
..$ hosp_refl_rgn_name           : chr [1:146] "Fort Smith, AR" "Fort Smith, AR" "Jonesboro, AR" "Jonesboro, AR" ...
..$ val_lvl1                     : chr [1:146] "Endoscopic Procedures" "Endoscopic Procedures" "Endoscopic Procedures" "Endoscopic Procedures" ...
..$ val_lvl2                     : chr [1:146] "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" "Upper GI Endoscopy with Biopsy" ...
..$ val_lvl3                     : chr [1:146] "Surgical Center" "Surgical Center" "Surgical Center" "Surgical Center" ...

Sample data, created with the following code:

dput(head (z_combined_cost_dtrmnt, 50))
structure(list(mcp_cat_name = c("Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions",
"Back and Neck Conditions", "Back and Neck Conditions", "Back and Neck Conditions"
), pln_name = c("AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR",
"AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR",
"AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR", "AR",
"CA", "CA", "CA", "CA", "CA", "CA", "CA", "CA", "CA", "CA", "CA",
"CA", "CA", "CA", "CA", "CA", "CA", "CA", "CA", "CA"), hosp_refl_rgn_name = c("Fort Smith, AR",
"Fort Smith, AR", "Fort Smith, AR", "Fort Smith, AR", "Fort Smith, AR",
"Fort Smith, AR", "Jonesboro, AR", "Jonesboro, AR", "Jonesboro, AR",
"Jonesboro, AR", "Jonesboro, AR", "Jonesboro, AR", "Little Rock, AR",
"Little Rock, AR", "Little Rock, AR", "Little Rock, AR", "Little Rock, AR",
"Little Rock, AR", "Springdale, AR", "Springdale, AR", "Springdale, AR",
"Springdale, AR", "Springdale, AR", "Springdale, AR", "Texarkana, AR",
"Texarkana, AR", "Texarkana, AR", "Texarkana, AR", "Texarkana, AR",
"Texarkana, AR", "Alameda County, CA", "Alameda County, CA",
"Alameda County, CA", "Alameda County, CA", "Bakersfield, CA",
"Bakersfield, CA", "Bakersfield, CA", "Bakersfield, CA", "Chico, CA",
"Chico, CA", "Chico, CA", "Contra Costa County, CA", "Contra Costa County, CA",
"Contra Costa County, CA", "Contra Costa County, CA", "Fresno, CA",
"Fresno, CA", "Fresno, CA", "Fresno, CA", "Los Angeles, CA"),
val_lvl1 = c("Cervical (Neck) Pain", "Cervical (Neck) Pain",
"Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain",
"Neuritis", "Cervical (Neck) Pain", "Cervical (Neck) Pain",
"Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain",
"Neuritis", "Cervical (Neck) Pain", "Cervical (Neck) Pain",
"Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain",
"Neuritis", "Cervical (Neck) Pain", "Cervical (Neck) Pain",
"Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain",
"Neuritis", "Cervical (Neck) Pain", "Cervical (Neck) Pain",
"Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain",
"Neuritis", "Cervical (Neck) Pain", "Lumbar (Low Back) Pain",
"Lumbar (Low Back) Pain", "Neuritis", "Cervical (Neck) Pain",
"Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain", "Neuritis",
"Cervical (Neck) Pain", "Lumbar (Low Back) Pain", "Neuritis",
"Cervical (Neck) Pain", "Lumbar (Low Back) Pain", "Lumbar (Low Back) Pain",
"Neuritis", "Cervical (Neck) Pain", "Lumbar (Low Back) Pain",
"Lumbar (Low Back) Pain", "Neuritis", "Cervical (Neck) Pain"
), val_lvl2 = c("Cervical Fusion (Spinal Fusion)", "Non-Surgical Treatment",
"Lumbar Fusion (Spinal Fusion)", "Lumbar Laminectomy", "Non-Surgical Treatment",
"Non-Surgical Treatment", "Cervical Fusion (Spinal Fusion)",
"Non-Surgical Treatment", "Lumbar Fusion (Spinal Fusion)",
"Lumbar Laminectomy", "Non-Surgical Treatment", "Non-Surgical Treatment",
"Cervical Fusion (Spinal Fusion)", "Non-Surgical Treatment",
"Lumbar Fusion (Spinal Fusion)", "Lumbar Laminectomy", "Non-Surgical Treatment",
"Non-Surgical Treatment", "Cervical Fusion (Spinal Fusion)",
"Non-Surgical Treatment", "Lumbar Fusion (Spinal Fusion)",
"Lumbar Laminectomy", "Non-Surgical Treatment", "Non-Surgical Treatment",
"Cervical Fusion (Spinal Fusion)", "Non-Surgical Treatment",
"Lumbar Fusion (Spinal Fusion)", "Lumbar Laminectomy", "Non-Surgical Treatment",
"Non-Surgical Treatment", "Non-Surgical Treatment", "Lumbar Fusion (Spinal Fusion)",
"Non-Surgical Treatment", "Non-Surgical Treatment", "Non-Surgical Treatment",
"Lumbar Fusion (Spinal Fusion)", "Non-Surgical Treatment",
"Non-Surgical Treatment", "Non-Surgical Treatment", "Non-Surgical Treatment",
"Non-Surgical Treatment", "Non-Surgical Treatment", "Lumbar Fusion (Spinal Fusion)",
"Non-Surgical Treatment", "Non-Surgical Treatment", "Non-Surgical Treatment",
"Lumbar Fusion (Spinal Fusion)", "Non-Surgical Treatment",
"Non-Surgical Treatment", "Non-Surgical Treatment"), val_lvl3 = c("Inpatient Hospital",
"Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Outpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Inpatient Hospital",
"Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Outpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Inpatient Hospital",
"Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Outpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Inpatient Hospital",
"Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Outpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Inpatient Hospital",
"Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Outpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Alternative to Surgical Treatment of Cervical (Neck) Pain",
"Inpatient Hospital", "Alternative to Surgical Treatment of Lumbar (Low Back) Pain",
"Alternative to Surgical Treatment of Neuritis", "Alternative to Surgical Treatment of Cervical (Neck) Pain"
), val_lvl4 = c("", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", ""), ntwk_avg_low_range_billed_amt = c(80359,
156, 107300, 51324, 156, 156, 80273, 139, 107333, 51287,
139, 139, 80351, 151, 107334, 51343, 151, 151, 80270, 148,
107192, 51146, 148, 148, 80388, 165, 107375, 51381, 165,
165, 215, 140194, 215, 215, 171, 140051, 171, 171, 158, 158,
158, 205, 140267, 205, 205, 171, 140318, 171, 171, 205),
ntwk_avg_low_range_alwd_amt = c(36707, 116, 53412, 19115,
116, 116, 36700, 126, 53476, 19120, 126, 126, 36681, 121,
53412, 19060, 121, 121, 36677, 125, 53375, 19018, 125, 125,
36741, 135, 53475, 19143, 135, 135, 164, 58285, 164, 164,
111, 58046, 111, 111, 111, 111, 111, 147, 58277, 147, 147,
117, 58131, 117, 117, 130), ntwk_avg_avg_billed_amt = c(99032,
554, 139522, 51324, 554, 554, 98926, 495, 139566, 51287,
495, 495, 99021, 538, 139568, 51343, 538, 538, 98922, 526,
139383, 51146, 526, 526, 99067, 585, 139621, 51381, 585,
585, 693, 140194, 693, 693, 551, 140051, 551, 551, 512, 512,
512, 662, 140267, 662, 662, 553, 140318, 553, 553, 661),
ntwk_avg_avg_alwd_amt = c(41040, 313, 57902, 19115, 313,
313, 41033, 340, 57972, 19120, 340, 340, 41011, 326, 57902,
19060, 326, 326, 41007, 338, 57862, 19018, 338, 338, 41079,
365, 57970, 19143, 365, 365, 451, 58285, 451, 451, 306, 58046,
306, 306, 305, 305, 305, 403, 58277, 403, 403, 320, 58131,
320, 320, 356), ntwk_avg_hi_range_billed_amt = c(104618,
559, 171745, 51324, 559, 559, 104506, 500, 171800, 51287,
500, 500, 104607, 543, 171801, 51343, 543, 543, 104502, 532,
171574, 51146, 532, 532, 104655, 591, 171867, 51381, 591,
591, 799, 140194, 799, 799, 635, 140051, 635, 635, 590, 590,
590, 764, 140267, 764, 764, 638, 140318, 638, 638, 762),
ntwk_avg_hi_range_alwd_amt = c(46388, 318, 62393, 19115,
318, 318, 46380, 345, 62467, 19120, 345, 345, 46355, 331,
62393, 19060, 331, 331, 46351, 343, 62349, 19018, 343, 343,
46432, 371, 62466, 19143, 371, 371, 537, 58285, 537, 537,
365, 58046, 365, 365, 364, 364, 364, 481, 58277, 481, 481,
382, 58131, 382, 382, 424), episode_count = c(5L, 284L, 2L,
1L, 284L, 284L, 5L, 284L, 2L, 1L, 284L, 284L, 5L, 284L, 2L,
1L, 284L, 284L, 5L, 284L, 2L, 1L, 284L, 284L, 5L, 284L, 2L,
1L, 284L, 284L, 148L, 1L, 148L, 148L, 148L, 1L, 148L, 148L,
148L, 148L, 148L, 148L, 1L, 148L, 148L, 148L, 1L, 148L, 148L,
148L), sample_size = c(12.7788970978329, 326.969758402962,
3.25471779465034, NA, 326.969758402962, 326.969758402962,
12.7788970978329, 326.969758402962, 3.25471779465034, NA,
326.969758402962, 326.969758402962, 12.7788970978329, 326.969758402962,
3.25471779465034, NA, 326.969758402962, 326.969758402962,
12.7788970978329, 326.969758402962, 3.25471779465034, NA,
326.969758402962, 326.969758402962, 12.7788970978329, 326.969758402962,
3.25471779465034, NA, 326.969758402962, 326.969758402962,
282.202307833077, NA, 282.202307833077, 282.202307833077,
282.202307833077, NA, 282.202307833077, 282.202307833077,
282.202307833077, 282.202307833077, 282.202307833077, 282.202307833077,
NA, 282.202307833077, 282.202307833077, 282.202307833077,
NA, 282.202307833077, 282.202307833077, 282.202307833077),
in_map = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA), in_map.x = c(NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA), in_trmnt = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), in_map.y = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA), in_complete = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
in_miss = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA), prd_num_of_days_num = c(167,
46, 117, 209, 46, 46, 167, 46, 117, 209, 46, 46, 167, 46,
117, 209, 46, 46, 167, 46, 117, 209, 46, 46, 167, 46, 117,
209, 46, 46, 38, 339, 38, 38, 38, 339, 38, 38, 38, 38, 38,
38, 339, 38, 38, 38, 339, 38, 38, 38)), .Names = c("mcp_cat_name",
"pln_name", "hosp_refl_rgn_name", "val_lvl1", "val_lvl2", "val_lvl3",
"val_lvl4", "ntwk_avg_low_range_billed_amt", "ntwk_avg_low_range_alwd_amt",
"ntwk_avg_avg_billed_amt", "ntwk_avg_avg_alwd_amt", "ntwk_avg_hi_range_billed_amt",
"ntwk_avg_hi_range_alwd_amt", "episode_count", "sample_size",
"in_map", "in_map.x", "in_trmnt", "in_map.y", "in_complete",
"in_miss", "prd_num_of_days_num"), row.names = c(NA, 50L), class = "data.frame")

It is hard to answer without sample data, but you could try

split(z_combined_cost_dtrmnt, 
interaction(
z_combined_cost_dtrmnt$val_lvl2, 
z_combined_cost_dtrmnt$val_lvl3
)
)

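interaction creates a new factor that is the combination of the val_lvl2 and val_lvl3 factors, so it should split the data by each unique combination of levels. I expect this to be equivalent to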

split(z_combined_cost_dtrmnt, 
f = list(
z_combined_cost_dtrmnt$val_lvl2, 
z_combined_cost_dtrmnt$val_lvl3
)
)
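
As a small self-contained check of the two calls above, here is a sketch on a toy data frame. The data frame df and its values are made up for illustration; only the column names val_lvl2 and val_lvl3 come from the question. It also adds drop = TRUE, which is not in the calls above: without it, split() keeps an empty data frame for every combination of levels that never occurs in the data, which is probably not wanted here. As far as I can tell, the attempt with & fails because & is a logical operator and the two columns are character vectors.

```r
# Toy data; the column names mirror the question, the values are invented.
df <- data.frame(
  val_lvl2 = c("A", "A", "A", "B", "B"),
  val_lvl3 = c("x", "y", "x", "x", "x"),
  value    = 1:5,
  stringsAsFactors = FALSE
)

# Split on both columns at once. drop = TRUE discards combinations
# that never occur in the data (here "B" with "y").
by_list <- split(df, list(df$val_lvl2, df$val_lvl3), drop = TRUE)

# Same grouping built explicitly with interaction(); drop = TRUE removes
# the unused level from the combined factor.
by_inter <- split(df, interaction(df$val_lvl2, df$val_lvl3, drop = TRUE))

# Group names are "<val_lvl2>.<val_lvl3>", e.g. "A.x".
print(names(by_list))
print(sapply(by_list, nrow))
```

The rest of the pipeline from the question (lapply over holders, then do.call(rbind, holders)) should work unchanged on the resulting list, since each element is still a data frame.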
