R: run a logical test or calculation against the first value in each group of a separate index column



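I have a large data frame with an index column whose values repeat, each value being assigned to a particular run of activity rows. I want to run a calculation that references that index column and counts, in a separate column, the number of days since the first date recorded for that index value. I also want a logical test on another column to check that each value matches the first value in that column for the same index value. I have been using dplyr and have the following script: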

test <- InsiderList3 %>%
  group_by(`Insider CIK`) %>%
  mutate(rf.diff = first(`Transaction Date`) - `Transaction Date`) %>%
  mutate(IssuerCheck = first(`Issuer`) == Issuer)

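The column labeled "Insider CIK" is the index. The information in all other columns is tied to it until the next index value appears, and then the pattern repeats. There is a separate date column and a column identifying the company.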

dput of a sample of the first 75 rows:

dput(head(InsiderList3[c('Insider CIK', 'Transaction Date', 'Issuer')], 75))
structure(list(`Insider CIK` = c("0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134", 
"0001008134", "0001008134", "0001008134", "0001009891", "0001012859", 
"0001012859", "0001012859", "0001012859"), `Transaction Date` = structure(c(18358, 
18358, 18101, 18065, 18065, 18039, 17729, 17700, 17674, 17674, 
17345, 17345, 17326, 17014, 17014, 17014, 17014, 17014, 17014, 
17001, 16964, 16964, 16598, 16590, 16582, 16582, 16409, 16288, 
16288, 16245, 16245, 16217, 16161, 16072, 16052, 15967, 15880, 
15869, 15771, 15710, 15710, 15687, 15603, 15523, 15354, 15354, 
15030, 14979, 14840, 14049, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 18358, 18358, 
18358, 18261), class = "Date"), Issuer = c("TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"SANDRIDGE ENERGY INC", "SANDRIDGE ENERGY INC", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "Seventy Seven Energy Inc.", 
"Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.", 
"Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "Seventy Seven Energy Inc.", 
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.", 
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", 
"CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", "TRANSATLANTIC PETROLEUM LTD.", 
"CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", 
"CHESAPEAKE ENERGY CORP", "TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.", 
"TRANSATLANTIC PETROLEUM LTD.", "QUEST RESOURCE CORP", "QUEST RESOURCE CORP", 
"CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", 
"CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", "TRANSATLANTIC PETROLEUM LTD.", 
"CHESAPEAKE ENERGY CORP", "Seventy Seven Energy Inc.", "CHESAPEAKE OILFIELD OPERATING LLC", 
"TRANSATLANTIC PETROLEUM LTD.", "QUEST RESOURCE CORP", "CHESAPEAKE ENERGY CORP", 
"CHESAPEAKE ENERGY CORP", "CVR ENERGY INC", "CHESAPEAKE ENERGY CORP", 
"SANDRIDGE ENERGY INC", "TRANSATLANTIC PETROLEUM LTD.", "Seventy Seven Energy Inc.", 
"CHESAPEAKE ENERGY CORP", NA, "NATIONAL HEALTHCARE CORP", "NATIONAL HEALTHCARE CORP", 
"NATIONAL HEALTHCARE CORP", "NATIONAL HEALTHCARE CORP")), row.names = c(NA, 
75L), class = "data.frame")

Thanks for your help.

Maybe I'm missing something, but isn't this just a matter of sorting by `Transaction Date`?

InsiderList3 %>%
  group_by(`Insider CIK`) %>%
  arrange(`Transaction Date`) %>%
  mutate(rf.diff = first(`Transaction Date`) - `Transaction Date`,
         IssuerCheck = first(`Issuer`) == Issuer)
