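I have a CSV file with two fields: the date ($1) and the daily temperature for the whole year ($2). I want to extract the temperatures for April through September, with each month in its own column, like this: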
April | May |
---|---|
17 C | 20 C |
15 C | 22 C |
15 C | 21 C |
Take a look at the following script (temperature.awk):
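BEGIN {
SUBSEP="@"    # separator used internally for the two-part array key
}
{
# Dates look like 2022-01-01: characters 6-7 are the month, 9-10 the day.
# Adding 0 converts "01" to the number 1.
month=0+substr($1,6,2);
day=0+substr($1,9,2);
a[month,day]=$2;
}
END{
# Header row: a blank corner cell, then the month numbers 1..12.
printf("%5s ","")
for (month=1; month<=12; month++) {
printf("%5s ", month);
}
printf("\n");
# One row per day of the month; days a month does not have stay blank.
for (day=1; day<=31; day++) {
printf("%4s: ", day)
for (month=1;month<=12; month++) {
printf("%5s ", a[month,day])
}
printf("\n")
}
}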
Run it with: gawk -F, -f temperature.awk year-temperatures.csv >> temp.csv
Your temp.csv should then look like this (based on my test data):
          1     2     3     4     5     6     7     8     9    10    11    12
   1:  17.5  19.9  21.5  19.6  18.7  14.2  18.5  18.9  15.9  14.3  21.4  21.4
   2:  18.6  20.7  17.6    14  12.7  13.4  17.1  12.3  21.6  17.3  18.8  12.8
   3:  18.3  21.8  21.8  19.1  15.6  12.5    18  12.8  18.5  21.7  17.6  17.8
   4:    14  14.7  13.9  21.6    18  20.3  16.8    15  15.7  14.4  19.5  18.7
   5:  12.7  16.3  12.3  18.7  20.9  12.1  18.1  14.5  21.1    15  12.6  18.1
   6:  19.7  15.2  17.7  16.5  18.6  17.4  17.9  15.4  16.4  19.9  12.7  12.2
   7:  18.3  15.1  19.7  14.6  18.2  18.7  13.2  21.8  16.5  12.4  13.8    15
   8:  20.2  18.2  13.5  21.3  13.4  19.4  20.2  20.6  21.5  20.3  18.7  16.2
   9:  14.4  13.4  16.4  20.8  20.3  18.8  19.5  15.7  15.7  12.4  20.3  14.1
  10:  19.4  20.7  19.3  18.2  19.4    14  14.9  14.7  12.2  19.1  13.2    20
  11:  21.8  21.2  15.2  16.7    14  21.4  14.1  14.5  12.1  16.3  13.4  15.8
  12:  18.8  21.9  16.2  16.7    20  13.3  13.8  16.2  21.6  12.2  15.1  16.8
  13:  16.5    14  13.4  21.5    16    20  14.7  15.5  19.7    20  13.4  14.7
  14:  14.3  12.2  16.2  15.5    18  18.1    20    17  21.9  21.3  19.9  21.2
  15:    20  16.9  19.1  21.1  19.7  18.4  14.1  16.3  18.5  14.6  17.2  19.7
  16:  15.1  16.1  14.8  16.9  12.8  15.8  18.2  18.5  14.7  16.9  14.1  13.1
  17:  13.3  17.7  14.7  19.2  12.9  21.6  16.8  21.6  16.2    19  17.1  14.1
  18:  19.5  18.3  17.3  13.3  14.2  18.9  17.4  20.4  14.6  12.4  21.3  19.5
  19:  15.4  16.3  20.1  16.8  20.2  17.6  14.4  15.4  12.6  12.8    13    13
  20:  16.8  14.7  16.6  12.2  16.2  19.3    18  13.8    17  14.9    19  14.5
  21:  15.4  12.4  20.6  18.6  18.7  21.8  14.7  20.6  15.1  13.9  14.1  21.8
  22:  14.9  16.1  21.4  14.4  12.8  19.2  17.5  19.5  12.8  12.7  21.5  13.1
  23:  16.3  21.1  12.9  14.3  16.1  18.6  21.3  13.9  16.6  20.2  13.2  18.5
  24:  14.9  15.3  18.7  16.3  19.8  13.5  12.1    19  12.7  20.5  19.5  20.9
  25:  13.3    21  12.5  16.5  18.9  19.4  14.8  21.3  21.5  20.2  15.9    17
  26:    20  17.4  14.4  21.7  12.8  14.6  15.5  17.4  17.5  17.5  18.9  20.2
  27:    18    12  12.5  17.1  15.7  12.9    21  21.2  20.8    15  14.8  18.3
  28:  17.9  15.9  17.6  18.2  17.7  18.5  16.7  21.8  19.6  20.2  15.6  18.7
  29:  13.8        18.2  17.9  19.7  21.7  18.6  13.4  13.7  14.1  21.2  16.7
  30:  13.1        16.1  12.9  13.3  21.1  20.9  19.5  17.5    18  17.4  15.3
  31:  12.3          14        15.2        16.7  15.3        15.5        14.4
The first few lines of my test data look like this:
2022-01-01, 17.5
2022-01-02, 18.6
2022-01-03, 18.3
2022-01-04, 14
2022-01-05, 12.7
2022-01-06, 19.7
2022-01-07, 18.3
2022-01-08, 20.2
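
The script above prints all twelve months, while the question only asks for April through September. One possible way to narrow it down is sketched below. It is not part of the original script: the shell wrapper name summer_temps and the awk variables first, last and name are made up for illustration, and it assumes the same input layout as the test data (an ISO date, a comma, then the temperature).

```shell
# Hypothetical helper (name is an assumption): one column per month, April..September.
# Assumes lines like "2022-04-01, 17.5"; reads the files given as arguments, or stdin.
summer_temps() {
    awk -F, '
    BEGIN {
        first = 4; last = 9    # month range to keep (assumed from the question)
        split("January February March April May June July August September October November December", name, " ")
    }
    {
        month = 0 + substr($1, 6, 2)
        day   = 0 + substr($1, 9, 2)
        if (month < first || month > last) next    # skip months outside the range
        temp = $2
        gsub(/[ \t\r]/, "", temp)                  # drop the blank after the comma
        a[month, day] = temp
    }
    END {
        # Header row with month names instead of month numbers.
        printf("%4s ", "Day")
        for (month = first; month <= last; month++) printf("%10s", name[month])
        printf("\n")
        # One row per day; a day that a month does not have is left blank.
        for (day = 1; day <= 31; day++) {
            printf("%4d ", day)
            for (month = first; month <= last; month++) {
                key = month SUBSEP day
                printf("%10s", (key in a) ? a[key] : "")
            }
            printf("\n")
        }
    }' "$@"
}
```

It could then be called as summer_temps year-temperatures.csv > temp.csv. Using > instead of >> overwrites temp.csv on each run rather than appending a second copy of the table to it.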