Why can't Python recognize the month column in my dataset?



The DataFrame looks like this:

          Date  Time (HHMM)         Site  Plot  Replicate  Temperature
0   2002-05-01          600  Barre Woods    16          5          4.5
1   2002-05-01          600  Barre Woods    21          7          4.5
2   2002-05-01          600  Barre Woods    31          9          6.5
3   2002-05-01          600  Barre Woods    10          2          5.3
4   2002-05-01          600  Barre Woods     2          1          4.0
5   2002-05-01          600  Barre Woods    13          4          5.5
6   2002-05-01          600  Barre Woods    11          3          5.0
7   2002-05-01          600  Barre Woods    28          8          5.0
8   2002-05-01          600  Barre Woods    18          6          4.5
9   2002-05-01         1400  Barre Woods     2          1         10.3
10  2002-05-01         1400  Barre Woods    31          9          9.0
11  2002-05-01         1400  Barre Woods    13          4         11.0

import pandas as pd
import datetime as dt
from datetime import datetime
df = pd.read_csv('F:/data32.csv', parse_dates=['Date'])
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y')

This is where I get the error:

df2=df.groupby(pd.TimeGrouper(freq='M'))

The error reads:

Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'

You can group by df['Date'].dt.month. For example, to compute the mean temperature, you can do the following:

import io
import pandas as pd
data = io.StringIO('''
Date,Time (HHMM),Site,Plot,Replicate,Temperature
0,2002-05-01,600,Barre Woods,16,5,4.5
1,2002-05-01,600,Barre Woods,21,7,4.5
2,2002-05-01,600,Barre Woods,31,9,6.5
3,2002-05-01,600,Barre Woods,10,2,5.3
4,2002-05-01,600,Barre Woods,2,1,4.0
5,2002-05-01,600,Barre Woods,13,4,5.5
6,2002-05-01,600,Barre Woods,11,3,5.0
7,2002-05-01,600,Barre Woods,28,8,5.0
8,2002-05-01,600,Barre Woods,18,6,4.5
9,2002-05-01,1400,Barre Woods,2,1,10.3
10,2002-05-01,1400,Barre Woods,31,9,9.0
11,2002-05-01,1400,Barre Woods,13,4,11.0
''')
df = pd.read_csv(data)
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df.groupby(df['Date'].dt.month)['Temperature'].mean()
Output:

Date
5    6.258333
Name: Temperature, dtype: float64
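
One caveat with dt.month: it returns only the month number, so May 2002 and May 2003 would fall into the same group. If the data spans several years, grouping by a year-month period keeps them apart. Below is a minimal sketch of the difference; the small DataFrame is made up for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical sample: two Mays in different years, plus one June.
df = pd.DataFrame({
    'Date': pd.to_datetime(['2002-05-01', '2002-05-15', '2003-05-01', '2003-06-01']),
    'Temperature': [4.0, 6.0, 10.0, 12.0],
})

# Month number only: May 2002 and May 2003 are pooled into group 5.
by_month = df.groupby(df['Date'].dt.month)['Temperature'].mean()

# Year-month period: each calendar month of each year is its own group.
by_year_month = df.groupby(df['Date'].dt.to_period('M'))['Temperature'].mean()

print(by_month)
print(by_year_month)
```

Here by_month has two groups (5 and 6), while by_year_month has three (2002-05, 2003-05, 2003-06).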

You can first use set_index, so that the dates become a DatetimeIndex:

dfx = df.set_index('Date')

Then you can groupby:

dfx.groupby(lambda x: x.month)['Temperature'].mean()  # .mean() is just an example
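
As a side note, pd.TimeGrouper was deprecated and is no longer available in recent pandas releases; pd.Grouper is its replacement, and its key argument lets you group on a datetime column without setting it as the index first. A minimal sketch, assuming a made-up three-row DataFrame rather than the asker's file. The 'MS' (month start) alias is used here on the assumption that it is accepted by both older and newer pandas versions, since the month-end alias has been renamed in newer releases.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    'Date': pd.to_datetime(['2002-05-01', '2002-05-15', '2002-06-01']),
    'Temperature': [4.0, 6.0, 12.0],
})

# key='Date' tells Grouper to bin on the column, so a RangeIndex is fine.
# freq='MS' creates one bin per calendar month, labelled by the month's first day.
monthly = df.groupby(pd.Grouper(key='Date', freq='MS'))['Temperature'].mean()

print(monthly)
```

The result is indexed by the first day of each month (2002-05-01, 2002-06-01), so different years stay in separate groups.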
