某些小时数的平均值



基本上,我需要的是现有表格的每小时平均值(见下文)。

输入文件最初采用 15 分钟粒度。输出应在单独的列中显示每天的平均值和每天上午 8 点至晚上 8 点的平均值。

DateTime            Value               Date Output Average (entire day)    Average (8am - 8pm)
09/01/2017 00:00    5055.414058 ->  09/01/2017      
09/01/2017 00:15    5055.414058     10/01/2017      
09/01/2017 00:30    5055.414058     11/01/2017      
09/01/2017 00:45    5055.414058     12/01/2017      
09/01/2017 01:00    5986.204028     13/01/2017      
09/01/2017 01:15    5986.204028     14/01/2017      
09/01/2017 01:30    5986.204028     15/01/2017      
09/01/2017 01:45    5986.204028     16/01/2017      
09/01/2017 02:00    7199.824865     17/01/2017      
09/01/2017 02:15    7199.824865     18/01/2017      
09/01/2017 02:30    7199.824865     19/01/2017      
09/01/2017 02:45    7199.824865     20/01/2017      
09/01/2017 03:00    9185.008333     21/01/2017      
09/01/2017 03:15    9185.008333     22/01/2017      
09/01/2017 03:30    9185.008333     23/01/2017      
…                   
13/01/2017 22:00    94080.58174             
13/01/2017 22:15    94080.58174             
13/01/2017 22:30    94080.58174             
13/01/2017 22:45    94080.58174             
13/01/2017 23:00    93231.23486             
13/01/2017 23:15    93231.23486             
13/01/2017 23:30    93231.23486             
13/01/2017 23:45    93231.23486             
14/01/2017 00:00    91619.33743             
14/01/2017 00:15    91619.33743             
14/01/2017 00:30    91619.33743             
14/01/2017 00:45    91619.33743             
14/01/2017 01:00    89894.48751             
14/01/2017 01:15    89894.48751             
14/01/2017 01:30    89894.48751             
…                   
import pandas as pd
import datetime
import numpy as np
import glob
import csv
# Local path and name of the excel file.
path = 'W:/myfolder/'
sheetname = "Forecast_" + datetime.datetime.today().strftime('%d.%m.%Y-%H')
filename = path + sheetname + ".csv"
#Create data frame of data
df = pd.read_csv(filename ,delimiter=',',engine = 'python', encoding='latin-1', index_col = False)
print(df)
table = df.groupby([df["DateTime"].dt.day, df["DateTime"].dt.hour]).mean()
print(table)

只需创建包含日期(表示为日期)的第一列,然后进行分组 + 聚合。

我会尽快发布代码:)

编辑:正如承诺的代码,

from datetime import datetime
import pandas as pd
import os
folderPath = "data/"
#Put all dataframes together
def folderIterator(folderPath):
    for item in os.listdir(folderPath):
        yield pd.read_csv("{:s}{:s}".format(folderPath, item))
dfIterator =  folderIterator(folderPath)
fullDataFrame = dfIterator.next()
for df in dfIterator:
    fullDataFrame = fullDataFrame.append(df)
#Create date column
fullDataFrame["DayCol"] = fullDataFrame["DateTime"].map(lambda x: datetime.strptime(x, '%d/%m/%Y %H:%M')).map(lambda x: x.strftime("%d/%m/%Y"))
finalDF = fullDataFrame.groupby("DayCol").mean()
print finalDF

询问有关代码:)的任何问题

使用 pd.TimeGrouperquerypd.concat

tidx = pd.date_range('2016-03-30', '2016-04-01', freq='2H')
df = pd.DataFrame(dict(value=np.random.rand(len(tidx))), tidx)
from8to8 = df.assign(hour=df.index.hour).query('8 >= hour < 9') 
    .groupby(pd.TimeGrouper('D')).value.mean().rename('8to8')
daily = df.groupby(pd.TimeGrouper('D')).value.mean().rename('day')
pd.concat([df.value, daily, from8to8], axis=1).ffill()

                        value       day      8to8
2016-03-30 00:00:00  0.671287  0.565916  0.704173
2016-03-30 02:00:00  0.997307  0.565916  0.704173
2016-03-30 04:00:00  0.335283  0.565916  0.704173
2016-03-30 06:00:00  0.722650  0.565916  0.704173
2016-03-30 08:00:00  0.794335  0.565916  0.704173
2016-03-30 10:00:00  0.992366  0.565916  0.704173
2016-03-30 12:00:00  0.206157  0.565916  0.704173
2016-03-30 14:00:00  0.480467  0.565916  0.704173
2016-03-30 16:00:00  0.389169  0.565916  0.704173
2016-03-30 18:00:00  0.326746  0.565916  0.704173
2016-03-30 20:00:00  0.458807  0.565916  0.704173
2016-03-30 22:00:00  0.416415  0.565916  0.704173
2016-03-31 00:00:00  0.344517  0.487147  0.409475
2016-03-31 02:00:00  0.095404  0.487147  0.409475
2016-03-31 04:00:00  0.412321  0.487147  0.409475
2016-03-31 06:00:00  0.384827  0.487147  0.409475
2016-03-31 08:00:00  0.810305  0.487147  0.409475
2016-03-31 10:00:00  0.052873  0.487147  0.409475
2016-03-31 12:00:00  0.936284  0.487147  0.409475
2016-03-31 14:00:00  0.303524  0.487147  0.409475
2016-03-31 16:00:00  0.347630  0.487147  0.409475
2016-03-31 18:00:00  0.787372  0.487147  0.409475
2016-03-31 20:00:00  0.989716  0.487147  0.409475
2016-03-31 22:00:00  0.380994  0.487147  0.409475
2016-04-01 00:00:00  0.091029  0.091029  0.091029

最新更新