How often does a problem occur per day and per location?



I have a DataFrame like this:

Date                  Location_ID   Problem_ID  
---------------------+------------+----------  
2013-01-02 10:00:00  | 1          |  43  
2012-08-09 23:03:01  | 5          |  2  
...

How can I count how often a problem occurs per day and per location?

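Use groupby with the Date column converted to dates (via .dt.date), or with a Grouper, and aggregate with size, which counts the rows in each group: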

import pandas as pd

print(df)
                  Date  Location_ID  Problem_ID
0  2013-01-02 10:00:00            1          43
1  2012-08-09 23:03:01            5           2

# if necessary, convert the column to datetimes
df['Date'] = pd.to_datetime(df['Date'])

# group by calendar date and location, then count the rows in each group
df1 = (df.groupby([df['Date'].dt.date, 'Location_ID'])
         .size()
         .reset_index(name='count'))
print(df1)
         Date  Location_ID  count
0  2012-08-09            5      1
1  2013-01-02            1      1

Or:

df1 = (df.groupby([pd.Grouper(key='Date', freq='D'), 'Location_ID'])
         .size()
         .reset_index(name='count'))
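
To see the two variants side by side, here is a minimal self-contained sketch. The sample rows are hypothetical (they are not the asker's data) and were chosen so that one day/location pair occurs twice. The names by_date and by_grouper are illustrative only.

```python
import pandas as pd

# Hypothetical sample data: location 1 has two problems on 2013-01-02
df = pd.DataFrame({
    'Date': ['2013-01-02 10:00:00', '2013-01-02 15:30:00',
             '2013-01-02 18:00:00', '2012-08-09 23:03:01'],
    'Location_ID': [1, 1, 5, 5],
    'Problem_ID': [43, 7, 2, 2],
})
df['Date'] = pd.to_datetime(df['Date'])

# Variant 1: group by the calendar date extracted from each timestamp
by_date = (df.groupby([df['Date'].dt.date, 'Location_ID'])
             .size()
             .reset_index(name='count'))

# Variant 2: let pd.Grouper bin the timestamps into daily buckets
by_grouper = (df.groupby([pd.Grouper(key='Date', freq='D'), 'Location_ID'])
                .size()
                .reset_index(name='count'))

print(by_date)
print(by_grouper)
```

One difference worth knowing: in the first variant the Date column holds Python date objects, while in the second it holds pandas Timestamps at midnight, which is usually more convenient for further time-series work.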

If the first column is the index:

print(df)
                     Location_ID  Problem_ID
Date                                        
2013-01-02 10:00:00            1          43
2012-08-09 23:03:01            5           2

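# if necessary, convert the index to a DatetimeIndex
df.index = pd.to_datetime(df.index)

# df.index.date is an unnamed array, so reset_index labels it 'level_0';
# rename that column back to 'Date'
df1 = (df.groupby([df.index.date, 'Location_ID'])
         .size()
         .reset_index(name='count')
         .rename(columns={'level_0':'Date'}))
print(df1)
         Date  Location_ID  count
0  2012-08-09            5      1
1  2013-01-02            1      1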

Or, with a Grouper on the index level:

df1 = (df.groupby([pd.Grouper(level='Date', freq='D'), 'Location_ID'])
         .size()
         .reset_index(name='count'))
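
The index-based case can be tried end to end with the sketch below. Again the rows are hypothetical sample data, and the names counts and by_level are only illustrative; the sketch assumes the index is named Date, as in the printed frame above.

```python
import pandas as pd

# Hypothetical sample data with the timestamps stored in the index
df = pd.DataFrame(
    {'Location_ID': [1, 1, 5], 'Problem_ID': [43, 7, 2]},
    index=pd.to_datetime(['2013-01-02 10:00:00',
                          '2013-01-02 15:30:00',
                          '2012-08-09 23:03:01']),
)
df.index.name = 'Date'

# Group by the calendar date of the index; the unnamed grouping key
# comes back from reset_index as 'level_0', so rename it
counts = (df.groupby([df.index.date, 'Location_ID'])
            .size()
            .reset_index(name='count')
            .rename(columns={'level_0': 'Date'}))

# Same counts using a Grouper on the named index level
by_level = (df.groupby([pd.Grouper(level='Date', freq='D'), 'Location_ID'])
              .size()
              .reset_index(name='count'))

print(counts)
print(by_level)
```

The Grouper variant keeps the level name, so no rename step is needed there.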
