Pandas AttributeError: 'DataFrame' object has no attribute 'Timestamp'



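I want my script to compute the sum for each month, but I keep getting an AttributeError that I don't understand. The Timestamp column does exist in my combined_csv.
I am sure the last line (the groupby) is what causes the problem, because I tested all the other code beforehand.
AttributeError: 'DataFrame' object has no attribute 'Timestamp'
I would appreciate any help, thanks.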

import os
import glob
import pandas as pd
# set working directory
os.chdir("Path to CSVs")
# find all csv files in the folder
# use glob pattern matching -> extension = 'csv'
# save result in list -> all_filenames
extension = 'csv'
all_filenames = [i for i in glob.glob('*.{}'.format(extension))]
# print(all_filenames)
# combine all files in the list
combined_csv = pd.concat([pd.read_csv(f, sep=';') for f in all_filenames])
# Format CSV
# Transform Timestamp column into datetime
combined_csv['Timestamp'] = pd.to_datetime(combined_csv.Timestamp)
# Read out first entry of every day of every month
combined_csv = round(combined_csv.resample('D', on='Timestamp')['HtmDht_Energy'].agg(['first']))
# To get the yield of day i have to subtract day 2 HtmDht_Energy - day 1 HtmDht_Energy
combined_csv["dailyYield"] = combined_csv["first"] - combined_csv["first"].shift()
# combined_csv.reset_index()
# combined_csv.index.set_names(["year", "month"], inplace=True)
combined_csv["monthlySum"] = combined_csv.groupby([combined_csv.Timestamp.dt.year, combined_csv.Timestamp.dt.month]).sum()

Output of combined_csv.columns

Index(['Timestamp', 'teHst0101', 'teHst0102', 'teHst0103', 'teHst0104',
'teHst0105', 'teHst0106', 'teHst0107', 'teHst0201', 'teHst0202',
'teHst0203', 'teHst0204', 'teHst0301', 'teHst0302', 'teHst0303',
'teHst0304', 'teAmb', 'teSolFloHexHst', 'teSolRetHexHst',
'teSolCol0501', 'teSolCol1001', 'teSolCol1501', 'vfSol', 'prSolRetSuc',
'rdGlobalColAngle', 'gSolPump01_roActual', 'gSolPump02_roActual',
'gHstPump03_roActual', 'gHstPump04_roActual', 'gDhtPump06_roActual',
'gMB01_isOpened', 'gMB02_isOpened', 'gCV01_posActual',
'gCV02_posActual', 'HtmDht_Energy', 'HtmDht_Flow', 'HtmDht_Power',
'HtmDht_Volume', 'HtmDht_teFlow', 'HtmDht_teReturn', 'HtmHst_Energy',
'HtmHst_Flow', 'HtmHst_Power', 'HtmHst_Volume', 'HtmHst_teFlow',
'HtmHst_teReturn', 'teSolColDes', 'teHstFloDes'],
dtype='object')

Traceback when I use
combined_csv["monthlySum"] = combined_csv.groupby([combined_csv['Timestamp'].dt.year, combined_csv['Timestamp'].dt.month]).sum()

Traceback (most recent call last):
File "D:UserswinkPycharmProjectscsvToExcelmain.py", line 28, in <module>
combined_csv["monthlySum"] = combined_csv.groupby([combined_csv['Timestamp'].dt.year, combined_csv['Timestamp'].dt.month]).sum()
File "D:UserswinkPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 3024, in __getitem__
indexer = self.columns.get_loc(key)
File "D:UserswinkPycharmProjectscsvToExcelvenvlibsite-packagespandascoreindexesbase.py", line 3082, in get_loc
raise KeyError(key) from err
KeyError: 'Timestamp'

Traceback when using Mustafa's solution

Traceback (most recent call last):
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 3862, in reindexer
value = value.reindex(self.index)._values
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandasutil_decorators.py", line 312, in wrapper
return func(*args, **kwargs)
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 4176, in reindex
return super().reindex(**kwargs)
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoregeneric.py", line 4811, in reindex
return self._reindex_axes(
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 4022, in _reindex_axes
frame = frame._reindex_index(
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 4038, in _reindex_index
new_index, indexer = self.index.reindex(
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreindexesmulti.py", line 2492, in reindex
target = MultiIndex.from_tuples(target)
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreindexesmulti.py", line 175, in new_meth
return meth(self_or_cls, *args, **kwargs)
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreindexesmulti.py", line 531, in from_tuples
arrays = list(lib.tuples_to_object_array(tuples).T)
File "pandas_libslib.pyx", line 2527, in pandas._libs.lib.tuples_to_object_array
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:UserswinklermPycharmProjectscsvToExcelmain.py", line 28, in <module>
combined_csv["monthlySum"] = combined_csv.groupby([combined_csv.Timestamp.dt.year, combined_csv.Timestamp.dt.month]).sum()
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 3163, in __setitem__
self._set_item(key, value)
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 3242, in _set_item
value = self._sanitize_column(key, value)
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 3888, in _sanitize_column
value = reindexer(value).T
File "C:UserswinklermPycharmProjectscsvToExcelvenvlibsite-packagespandascoreframe.py", line 3870, in reindexer
raise TypeError(
TypeError: incompatible index of inserted column with frame index

This line makes the Timestamp column the index of combined_csv:

combined_csv = round(combined_csv.resample('D', on='Timestamp')['HtmDht_Energy'].agg(['first']))

so you get the error when you try to access combined_csv.Timestamp, which is no longer a column.
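To see the effect in isolation, here is a minimal sketch on a tiny made-up frame (the three sample rows are hypothetical and only stand in for the combined CSVs). It suggests that after `resample(..., on='Timestamp')` the timestamps live in the index rather than in the columns:

```python
import pandas as pd

# Hypothetical sample data standing in for the combined CSV files.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2021-01-01 00:00", "2021-01-01 12:00", "2021-01-02 00:00"]
    ),
    "HtmDht_Energy": [100.0, 105.0, 110.0],
})

# Same call shape as in the question: first reading of every day.
daily = df.resample("D", on="Timestamp")["HtmDht_Energy"].agg(["first"])

print(daily.columns.tolist())        # only the aggregated column is left
print(daily.index.name)              # the timestamps are now the index
print(hasattr(daily, "Timestamp"))   # attribute access no longer finds a column
```

Because `Timestamp` is now the index name, both `daily.Timestamp` and `daily['Timestamp']` fail, which matches the AttributeError and the KeyError reported in the question.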

The remedy is reset_index, so you can try this:

combined_csv = round(combined_csv.resample('D', on='Timestamp')['HtmDht_Energy'].agg(['first'])).reset_index()

This moves Timestamp out of the index and back into a regular column, so you can access it again.
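With `Timestamp` back as a column, assigning the result of `groupby(...).sum()` directly to a new column can still fail with "incompatible index of inserted column with frame index", because the grouped result has one row per (year, month) rather than one row per day. The following is only a sketch of two possible ways around that, using hypothetical sample readings; the names `daily`, `monthly`, `year` and `month` are illustrative and not from the original post:

```python
import pandas as pd

# Hypothetical daily meter readings (a cumulative energy counter).
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2021-01-30", "2021-01-31", "2021-02-01", "2021-02-02"]
    ),
    "HtmDht_Energy": [100.0, 110.0, 125.0, 130.0],
})

daily = (
    df.resample("D", on="Timestamp")["HtmDht_Energy"]
    .agg(["first"])
    .reset_index()  # Timestamp becomes a regular column again
)
daily["dailyYield"] = daily["first"] - daily["first"].shift()

# Grouping keys derived from the restored Timestamp column.
year = daily["Timestamp"].dt.year.rename("year")
month = daily["Timestamp"].dt.month.rename("month")

# Option 1: keep the monthly totals as a separate result,
# indexed by (year, month).
monthly = daily.groupby([year, month])["dailyYield"].sum()

# Option 2: broadcast each month's total back onto the daily rows;
# transform returns a result aligned with daily's own index.
daily["monthlySum"] = daily.groupby([year, month])["dailyYield"].transform("sum")

print(monthly)
print(daily)
```

Which option fits depends on whether you want one row per month or the monthly total repeated on every daily row.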


Side note:
combined_csv["dailyYield"] = combined_csv["first"] - combined_csv["first"].shift()

is equivalent to

combined_csv["dailyYield"] = combined_csv["first"].diff()
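As a quick check of that equivalence, here is a small sketch on a made-up Series (the values are hypothetical):

```python
import pandas as pd

# Hypothetical cumulative readings.
s = pd.Series([100.0, 105.0, 110.0, 118.0])

via_shift = s - s.shift()   # subtract the previous row explicitly
via_diff = s.diff()         # built-in first difference

# Series.equals treats NaNs in the same position as equal,
# so the leading NaN in both results does not break the comparison.
print(via_shift.equals(via_diff))
```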
