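I have several years of daily precipitation and temperature values. I want to compute the average precipitation and temperature for each month of the year (January to December). For precipitation, I first need to compute the sum of the daily precipitation for each month, and then average those sums for the same month over all years of data. For temperature, I need to average the monthly means of the values (so a mean of all the data for a given month, across all years, gives exactly the same result). Once that is done, I need to plot both sets of data (precipitation and temperature) using abbreviated month names.
I can't find a way to compute the precipitation, that is, to get the sum for each month and then average it over all years. In addition, I am having trouble with the format for displaying the abbreviated months.
This is what I have tried so far (without success):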
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
example = [['01.10.1965 00:00', 13.88099957, 5.375],
['02.10.1965 00:00', 5.802999973, 3.154999971],
['03.10.1965 00:00', 9.605699539, 0.564999998],
['14.10.1965 00:00', 0.410299987, 1.11500001],
['31.10.1965 00:00', 6.184500217, -0.935000002],
['01.11.1965 00:00', 0.347299993, -5.235000134],
['02.11.1965 00:00', 0.158299997, -8.244999886],
['03.11.1965 00:00', 1.626199961, -3.980000019],
['24.10.1966 00:00', 0, 3.88499999],
['25.10.1966 00:00', 0.055100001, 1.279999971],
['30.10.1966 00:00', 0.25940001, -5.554999828]]
names = ["date","Pobs","Tobs"]
data = pd.DataFrame(example, columns=names)
data['date'] = pd.to_datetime(data['date'], format='%d.%m.%Y %H:%M')
#I think the average of temperature is well computed but the precipitation would give the complete summation for all years!
tempT = data.groupby([data['date'].dt.month_name()], sort=False).mean().eval('Tobs')
tempP = data.groupby([data['date'].dt.month_name()], sort=False).sum().eval('Pobs')
fig = plt.figure(); ax1 = fig.add_subplot(1,1,1); ax2 = ax1.twinx();
ax1.bar(tempP.index.tolist(), tempP.values, color='blue')
ax2.plot(tempT.index.tolist(), tempT.values, color='red')
ax1.set_ylabel('Precipitation [mm]', fontsize=10)
ax2.set_ylabel('Temperature [°C]', fontsize=10)
#ax1.xaxis.set_major_formatter(DateFormatter("%b")) #this line does not work properly!
plt.show()
Here is working code for your problem:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
example = [['01.10.1965 00:00',13.88099957,5.375], ...]
names = ["date","Pobs","Tobs"]
data = pd.DataFrame(example, columns=names)
data['date'] = pd.to_datetime(data['date'], format='%d.%m.%Y %H:%M')
# Temperature:
# select 'Tobs' before aggregating so the datetime column is not averaged along with it
tempT = data.groupby(data['date'].dt.month_name(), sort=False)['Tobs'].mean()
# Precipitation:
# total precipitation for each individual (month, year) pair;
# selecting 'Pobs' first avoids trying to sum the datetime column
df_sum = data.groupby([data['date'].dt.month_name().rename('month'), data['date'].dt.year.rename('year')], sort=False)['Pobs'].sum()
# mean of those monthly totals over all years
tempP = df_sum.groupby(level='month', sort=False).mean()
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
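# turn the month names into dates (the year defaults to 1900) so DateFormatter can relabel them
xticks = pd.to_datetime(tempP.index.tolist(), format='%B')
order = xticks.argsort() # calendar order, applied to the ticks and to both value arrays
xticks = xticks[order]
ax1.bar(xticks, tempP.values[order], width=20, color='blue') # width is in days on a date axis
ax2.plot(xticks, tempT.reindex(tempP.index).values[order], color='red')
ax1.set_xticks(xticks) # to show all ticks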
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%b")) # must be called after plotting both axes
ax1.set_ylabel('Precipitation [mm]', fontsize=10)
ax2.set_ylabel('Temperature [°C]', fontsize=10)
plt.show()
Explanation: following this StackOverflow answer, DateFormatter is taken from mdates. For this to work, you need to build a DatetimeIndex from the month names, which DateFormatter can then re-format.
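As a minimal sketch of that idea in isolation: the month names and values below are made up for illustration, and the `Agg` backend line is only an assumption so the snippet runs without a display. Parsing with `%B` yields dates in the year 1900, which is harmless here because only the month is shown.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

month_names = ["October", "November", "December"]  # illustrative labels
values = [6.0, 0.7, 2.5]                            # made-up numbers

# '%B' parses full month names; the missing year defaults to 1900
x = pd.to_datetime(month_names, format="%B")

fig, ax = plt.subplots()
ax.bar(x, values, width=20)   # on a date axis the bar width is measured in days
ax.set_xticks(x)              # one tick per month
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))  # abbreviated month names

fig.canvas.draw()             # forces the tick labels to be generated
labels = [t.get_text() for t in ax.get_xticklabels()]
```

After drawing, `labels` holds the abbreviated month names that appear on the x axis. Note that `%B` parsing depends on the locale, so non-English month names would need a matching locale.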
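As for the calculation, I understand the solution to your problem to be this: we take the sum within each individual month, and then take the average of those sums over all years. That leaves you with the average total monthly precipitation over all years.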
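To make the difference between the two aggregations concrete, here is a small sketch on made-up data (the `toy` frame and its values are hypothetical, not taken from the question). It groups by month number rather than month name, which is a variation that keeps the result in calendar order.

```python
import pandas as pd

# Made-up daily precipitation: two Octobers and one November
toy = pd.DataFrame({
    "date": pd.to_datetime(["1965-10-01", "1965-10-02", "1965-11-01", "1966-10-01"]),
    "Pobs": [10.0, 20.0, 5.0, 4.0],
})

year = toy["date"].dt.year.rename("year")
month = toy["date"].dt.month.rename("month")

# Step 1: total precipitation for each individual (year, month)
monthly_totals = toy.groupby([year, month])["Pobs"].sum()

# Step 2: average those totals over the years, per calendar month
monthly_climatology = monthly_totals.groupby(level="month").mean()

# For comparison: summing by month alone adds all years together
naive_sum = toy.groupby(month)["Pobs"].sum()
```

For October, the two individual monthly totals are 30 and 4, so the two-step result is 17, whereas the single groupby-and-sum gives 34, which is the "complete summation for all years" the question was worried about.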