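I have trade flows between cities. For example, London sells cars to Liverpool for £1,000, while Liverpool buys cars from London for £1,100. Similarly, we know the price of bikes sold from Manchester to Bristol (1700), but not the purchase price on the Bristol side: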
df <- data.frame(Product = c("Cars", "Cars", "Bike"),
                 Flow = c("Sell", "Purchase", "Sell"),
                 city = c("London", "Liverpool", "Manchester"),
                 Destination = c("Liverpool", "London", "Bristol"),
                 Price = c(1000, 1100, 1700))
  Product     Flow       city Destination Price
1    Cars     Sell     London   Liverpool  1000
2    Cars Purchase  Liverpool      London  1100
3    Bike     Sell Manchester     Bristol  1700
Now I want to reshape the data so that both sides of a trade sit on the same row. Expected outcome:
  Product       city Destination Sell_P Purch_P
1    Cars     London   Liverpool   1000    1100
2    Bike Manchester     Bristol   1700      NA
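Sell_P is the price at which London sells to Liverpool, while Purch_P is the price Liverpool pays when buying from London. The price difference for the same product comes from insurance and transport costs.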
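We could do it this way:
After pivoting wider and gluing the new column names together with names_glue, we can exploit the way city and Destination alternate between the Sell row and the Purchase row. Grouping by Product, filling the purchase price upwards, and slicing the first row of each group gives: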
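library(dplyr)
library(tidyr)

df %>%
  # One column per Flow value, named Sell_P and Purchase_P
  pivot_wider(
    names_from = Flow,
    values_from = Price,
    names_glue = "{Flow}_P"
  ) %>%
  mutate(id = row_number()) %>%          # remember the original row order
  group_by(Product) %>%
  # Copy the purchase price up onto the Sell row of the same product
  fill(Purchase_P, .direction = "up") %>%
  slice(1) %>%                           # keep the first row of each product
  ungroup() %>%
  arrange(id) %>%                        # restore the original order
  select(-id)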
  Product city       Destination Sell_P Purchase_P
  <chr>   <chr>      <chr>        <dbl>      <dbl>
1 Cars    London     Liverpool     1000       1100
2 Bike    Manchester Bristol       1700         NA
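Note that the fill-and-slice approach relies on the Sell row coming first within each Product, and on each Product having a single city pair. If that may not hold for your data, one possible alternative is to orient every row from the seller's side before pivoting: for Purchase rows, swap city and Destination so both rows of a trade share the same key. This is only a sketch under that assumption; the helper columns is_purchase and origin are names I made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  Product     = c("Cars", "Cars", "Bike"),
  Flow        = c("Sell", "Purchase", "Sell"),
  city        = c("London", "Liverpool", "Manchester"),
  Destination = c("Liverpool", "London", "Bristol"),
  Price       = c(1000, 1100, 1700)
)

result <- df %>%
  mutate(
    is_purchase = Flow == "Purchase",                        # hypothetical helper column
    origin      = if_else(is_purchase, Destination, city),   # seller's city
    Destination = if_else(is_purchase, city, Destination),   # buyer's city
    city        = origin
  ) %>%
  select(-is_purchase, -origin) %>%
  # Both rows of a trade now share Product, city and Destination,
  # so pivot_wider collapses them into a single row
  pivot_wider(
    names_from = Flow,
    values_from = Price,
    names_glue = "{Flow}_P"
  )

print(result)
```

With the sample data this should give the same two rows as above (Cars with both prices, Bike with Purchase_P as NA), without depending on row order inside each product.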