How to compute monthly averages across years from a daily DataFrame and plot them with abbreviated month names



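I have several years of daily precipitation and temperature values. I want to compute the average precipitation and temperature for each month of the year (January to December). For precipitation, I first need to sum the daily values within each month and then average the same month across all years. For temperature, I need to average the monthly means (so averaging all the data for a given month gives exactly the same result). Once that is done, I need to plot both series (precipitation and temperature) using abbreviated month names.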

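I can't find a way to do the precipitation calculation, that is, to get the sum for each month and then average those sums over all years. I'm also having trouble getting the axis to display abbreviated month names.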

Here is what I have tried so far (without success):

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
example = [['01.10.1965 00:00', 13.88099957,    5.375],
['02.10.1965 00:00',    5.802999973,    3.154999971],
['03.10.1965 00:00',    9.605699539,    0.564999998],
['14.10.1965 00:00',    0.410299987,    1.11500001],
['31.10.1965 00:00',    6.184500217,    -0.935000002],
['01.11.1965 00:00',    0.347299993,    -5.235000134],
['02.11.1965 00:00',    0.158299997,    -8.244999886],
['03.11.1965 00:00',    1.626199961,    -3.980000019],
['24.10.1966 00:00',    0,              3.88499999],
['25.10.1966 00:00',    0.055100001,    1.279999971],
['30.10.1966 00:00',    0.25940001,     -5.554999828]]
names = ["date","Pobs","Tobs"]
data = pd.DataFrame(example, columns=names)
data['date'] = pd.to_datetime(data['date'], format='%d.%m.%Y %H:%M')
#I think the average of temperature is well computed but the precipitation would give the complete summation for all years!
tempT = data.groupby([data['date'].dt.month_name()], sort=False).mean().eval('Tobs')
tempP = data.groupby([data['date'].dt.month_name()], sort=False).sum().eval('Pobs') 
fig = plt.figure(); ax1 = fig.add_subplot(1,1,1); ax2 = ax1.twinx();
ax1.bar(tempP.index.tolist(), tempP.values, color='blue')
ax2.plot(tempT.index.tolist(), tempT.values, color='red')
ax1.set_ylabel('Precipitation [mm]', fontsize=10)
ax2.set_ylabel('Temperature [°C]', fontsize=10) 
#ax1.xaxis.set_major_formatter(DateFormatter("%b")) #this line does not work properly!
plt.show()

Here is working code for your problem:

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
import matplotlib.dates as mdates
example = [['01.10.1965 00:00',13.88099957,5.375], ...]
names = ["date","Pobs","Tobs"]
data = pd.DataFrame(example, columns=names)
data['date'] = pd.to_datetime(data['date'], format='%d.%m.%Y %H:%M')
# Temperature:
tempT = data.groupby(data['date'].dt.month_name(), sort=False)['Tobs'].mean()  # mean of all daily values per calendar month
# Precipitation:
# Select 'Pobs' before summing: recent pandas versions raise a TypeError when asked to sum the datetime column
df_sum = data.groupby([data['date'].dt.month_name(), data['date'].dt.year], sort=False)[['Pobs']].sum()  # total for each individual month of each year
df_sum.index.rename(['month','year'], inplace=True)  # name the two index levels
df_sum.reset_index(level=0, inplace=True)  # turn the month index level into a column
tempP = df_sum.groupby('month', sort=False)['Pobs'].mean()  # mean of the monthly totals over all years
fig = plt.figure();
ax1 = fig.add_subplot(1,1,1);
ax2 = ax1.twinx();
xticks = pd.to_datetime(tempP.index.tolist(), format='%B')  # month names -> dates (the year defaults to 1900)
order = xticks.argsort()  # calendar order; the values are reordered with it so each stays paired with its month
ax1.bar(xticks[order], tempP.values[order], color='blue')
ax2.plot(xticks[order], tempT.values[order], color='red')
plt.xticks(xticks)  # show a tick for every month present in the data
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%b"))  # must be called after plotting on both axes
ax1.set_ylabel('Precipitation [mm]', fontsize=10)
ax2.set_ylabel('Temperature [°C]', fontsize=10)
plt.show()

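Explanation: following this StackOverflow answer, DateFormatter is used via mdates. DateFormatter only formats date values, so you first need to build a DatetimeIndex from the month names; DateFormatter can then reformat those dates as abbreviated month names.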
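
The month-name conversion can be checked on its own, without a plot. The sketch below uses a hypothetical list of month names (not the data from the question) and assumes an English locale; it parses the names with `%B`, sorts them into calendar order, and formats them with `%b`, the same directive passed to DateFormatter above.

```python
import pandas as pd

# Hypothetical month names, in the order they might first appear in the data
months = ["October", "November", "January"]

# Parse the full month names; the missing year defaults to 1900
as_dates = pd.to_datetime(months, format="%B")

# Sort into calendar order and format as abbreviated month names
abbrev = as_dates.sort_values().strftime("%b").tolist()
print(abbrev)
```

Because the parsed dates all fall in the same placeholder year, sorting them puts the months in calendar order regardless of where the record starts.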

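As for the calculation, the way I understand your question is that we take the sum within each individual month and then average those sums over all years. That leaves you with the average monthly total precipitation over the years.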
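
To make the two-step calculation concrete, here is a minimal sketch on a small made-up DataFrame (the values are illustrative, not the data from the question). It groups by numeric year and month rather than by month name, which is a variation on the code above that keeps the result in calendar order.

```python
import pandas as pd

# Hypothetical daily data covering two Octobers and one November
df = pd.DataFrame({
    "date": pd.to_datetime(["1965-10-01", "1965-10-02", "1965-11-01",
                            "1966-10-24", "1966-10-25"]),
    "Pobs": [10.0, 6.0, 2.0, 1.0, 3.0],
    "Tobs": [5.0, 3.0, -5.0, 4.0, 2.0],
})
month = df["date"].dt.month.rename("month")
year = df["date"].dt.year.rename("year")

# Step 1: total precipitation for each individual (year, month)
monthly_totals = df.groupby([year, month])["Pobs"].sum()

# Step 2: average those totals over the years, per calendar month
mean_monthly_precip = monthly_totals.groupby(level="month").mean()

# Temperature: plain mean of all daily values per calendar month
mean_monthly_temp = df.groupby(month)["Tobs"].mean()

print(mean_monthly_precip)
print(mean_monthly_temp)
```

In this example October totals 16.0 in 1965 and 4.0 in 1966, so its mean monthly total is 10.0, whereas summing over all years at once would give 20.0.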
