R - Combine multiple columns with X repeated attributes into X columns



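I have a data frame arranged as shown below, with columns grouped by month (enero, febrero, marzo, etc.). Each row corresponds to a value I need to extract from a time series. The length of each Month/Caudal pair varies with the number of days in the month.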

In addition, because of how the original dataset is laid out, each Month/Caudal pair is separated by an empty column of NA values.

enero Caudal  X        febrero Caudal.1 X.1          marzo Caudal.2 X.2
1 1/1/2003 00:15   -    NA 1/2/2003 00:15     -     NA 1/3/2003 00:15    1.68   NA
2 1/1/2003 00:30   -    NA 1/2/2003 00:30     -     NA 1/3/2003 00:30    1.69   NA
3 1/1/2003 00:45   -    NA 1/2/2003 00:45     -     NA 1/3/2003 00:45    1.68   NA
4 1/1/2003 01:00   -    NA 1/2/2003 01:00     -     NA 1/3/2003 01:00    1.68   NA
5 1/1/2003 01:15   -    NA 1/2/2003 01:15     -     NA 1/3/2003 01:15    1.68   NA
6 1/1/2003 01:30   -    NA 1/2/2003 01:30     -     NA 1/3/2003 01:30    1.68   NA

The result I want is a time series with only two columns: Date and Caudal.

Date         Caudal
1 1/1/2003  00:15     -   
2 1/1/2003  00:30     -   
3 1/1/2003  00:45     -   
4 1/1/2003  01:00     -   
5 1/1/2003  01:15     -   
6 1/1/2003  01:30     - 
7 1/2/2003  00:15     -   
8 1/2/2003  00:30     -   
9 1/2/2003  00:45     -   
10 1/2/2003 01:00     -   
11 1/2/2003 01:15     -   
12 1/2/2003 01:30     -   
13 1/3/2003 00:15    1.68 
14 1/3/2003 00:30    1.69 
15 1/3/2003 00:45    1.68 
16 1/3/2003 01:00    1.68 
17 1/3/2003 01:15    1.68 
18 1/3/2003 01:30    1.68 

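I need to do this for 40 .txt files that all have exactly the same format. How can I set this up so that all my files are concatenated into one continuous df?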

Sample data:

df <- structure(list(enero = c("1/1/2003 00:15", "1/1/2003 00:30", 
"1/1/2003 00:45", "1/1/2003 01:00", "1/1/2003 01:15", "1/1/2003 01:30"
), Caudal = c(" -   ", " -   ", " -   ", " -   ", " -   ", " -   "
), X = c(NA, NA, NA, NA, NA, NA), febrero = c("1/2/2003 00:15", 
"1/2/2003 00:30", "1/2/2003 00:45", "1/2/2003 01:00", "1/2/2003 01:15", 
"1/2/2003 01:30"), Caudal.1 = c(" -   ", " -   ", " -   ", " -   ", 
" -   ", " -   "), X.1 = c(NA, NA, NA, NA, NA, NA), marzo = c("1/3/2003 00:15", 
"1/3/2003 00:30", "1/3/2003 00:45", "1/3/2003 01:00", "1/3/2003 01:15", 
"1/3/2003 01:30"), Caudal.2 = c(" 1.68 ", " 1.69 ", " 1.68 ", 
" 1.68 ", " 1.68 ", " 1.68 "), X.2 = c(NA, NA, NA, NA, NA, NA
)), row.names = c(NA, 6L), class = "data.frame")

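We can first drop the empty columns. The easiest next step is to rename the two sets of columns (i.e., Date and Caudal) with a numbered suffix. Then we can pivot to long format using _ as the name separator.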

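library(tidyverse)

df %>%
  # drop the empty NA separator columns (X, X.1, X.2, ...)
  select(-starts_with("X")) %>%
  # month columns (enero, febrero, ...) become Date_1, Date_2, ...
  rename_with(~ paste0("Date_", seq_along(.)),
              -starts_with("Caudal")) %>%
  # Caudal, Caudal.1, ... become Caudal_1, Caudal_2, ...
  rename_with(~ paste0("Caudal_", seq_along(.)),
              starts_with("Caudal")) %>%
  # the part before "_" becomes the column name, the part after it the month index
  pivot_longer(everything(),
               names_to = c(".value", "time"),
               names_sep = "_",
               values_drop_na = TRUE) %>%
  select(-time) %>%
  # Date is still character at this point, so this is a string sort
  arrange(Date)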

Output

Date           Caudal  
<chr>          <chr>   
1 1/1/2003 00:15 " -   " 
2 1/1/2003 00:30 " -   " 
3 1/1/2003 00:45 " -   " 
4 1/1/2003 01:00 " -   " 
5 1/1/2003 01:15 " -   " 
6 1/1/2003 01:30 " -   " 
7 1/2/2003 00:15 " -   " 
8 1/2/2003 00:30 " -   " 
9 1/2/2003 00:45 " -   " 
10 1/2/2003 01:00 " -   " 
11 1/2/2003 01:15 " -   " 
12 1/2/2003 01:30 " -   " 
13 1/3/2003 00:15 " 1.68 "
14 1/3/2003 00:30 " 1.69 "
15 1/3/2003 00:45 " 1.68 "
16 1/3/2003 01:00 " 1.68 "
17 1/3/2003 01:15 " 1.68 "
18 1/3/2003 01:30 " 1.68 "
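
For the 40 files, one option is to wrap the steps above in a function and apply it to each file's data frame before stacking the results. The following is only a sketch under some assumptions that are not confirmed by the question: reshape_caudal is a hypothetical helper name, the dates are taken to be day/month/year (which the month column names suggest), and "-" is taken to mark a missing reading. It also converts every column to character before pivoting, so that a month read in as numeric can still be combined with a month containing "-", and it parses Date so that sorting is chronological rather than alphabetical.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical helper: reshape one wide, month-by-month data frame
# into a two-column Date/Caudal table.
reshape_caudal <- function(df) {
  df %>%
    select(-starts_with("X")) %>%
    rename_with(~ paste0("Date_", seq_along(.)), -starts_with("Caudal")) %>%
    rename_with(~ paste0("Caudal_", seq_along(.)), starts_with("Caudal")) %>%
    # make all columns character so they can be combined when pivoting
    mutate(across(everything(), as.character)) %>%
    pivot_longer(everything(),
                 names_to = c(".value", "time"),
                 names_sep = "_",
                 values_drop_na = TRUE) %>%
    select(-time) %>%
    mutate(
      # Assumption: dates are day/month/year
      Date = as.POSIXct(Date, format = "%d/%m/%Y %H:%M", tz = "UTC"),
      # Assumption: "-" marks a missing reading
      Caudal = as.numeric(na_if(trimws(Caudal), "-"))
    ) %>%
    # drop padding rows whose date could not be parsed
    filter(!is.na(Date)) %>%
    arrange(Date)
}

# Small stand-in for the contents of one file (two months, two readings each)
df <- data.frame(
  enero    = c("1/1/2003 00:15", "1/1/2003 00:30"),
  Caudal   = c(" -   ", " -   "),
  X        = c(NA, NA),
  febrero  = c("1/2/2003 00:15", "1/2/2003 00:30"),
  Caudal.1 = c(" 1.68 ", " 1.69 "),
  X.1      = c(NA, NA)
)

one_file <- reshape_caudal(df)

# A list of data frames stands in for the files here. With real files the
# list could be built by reading each path returned by list.files(); the
# reader and its arguments depend on the actual file layout.
all_files <- map(list(df, df), reshape_caudal) %>% bind_rows()
```

If the files cover different years, a final arrange(Date) on the combined result puts everything in chronological order.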
