I want to slice df based on the values in Data to form df1.
df:
Id Timestamp Data
106272 106273 2013-09-10 16:40:40.467 86.0
106273 106274 2013-09-10 16:40:41.267 86.0
106274 106275 2013-09-10 16:40:42.053 59.0
106275 106276 2013-09-10 16:40:42.857 52.0
106278 106279 2013-09-10 16:41:00.173 61.5
Then I slice df with the following code, keeping the rows where Data falls within the range [20, 100]:
df_copy = df.copy()
df1 = df_copy[(df_copy["Data"]>=20) & (df_copy["Data"]<=100)]
This works fine.
Next, I want to group both df and df1 by the date part of Timestamp:
import datetime
df['Date'] = [datetime.datetime.date(d) for d in df['Timestamp']]
x = pd.DataFrame(df.groupby(['Date']).size())
x.columns = ['values']
# -----------------------------------
df1['Date'] = [datetime.datetime.date(d) for d in df1['Timestamp']]
x1 = pd.DataFrame(df1.groupby(['Date']).size())
x1.columns = ['values']
However, this works only for df; for df1 it raises an error:
TypeError Traceback (most recent call last)
<ipython-input-15-2ddde01b3d65> in <module>
---> 12 df1['Date'] = [datetime.datetime.date(d) for d in df1['Timestamp']]
13
14 x1 = pd.DataFrame(df1.groupby(['Date']).size())
TypeError: tuple indices must be integers or slices, not str
Why?
There is no need to build a new column with a list comprehension or to call the DataFrame constructor. Use Series.dt.date together with Series.to_frame:
x = (df.groupby(df['Timestamp'].dt.date)
       .size()
       .to_frame('values'))
df1 = df[(df["Data"]>=20) & (df["Data"]<=100)].copy()
x1 = (df1.groupby(df1['Timestamp'].dt.date)
         .size()
         .to_frame('values'))
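For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The sample rows are made up for illustration (loosely modeled on the data in the question, with a second date and one out-of-range value added so the two results differ), and Series.between is used as an equivalent shorthand for the two comparisons. It assumes Timestamp already has a datetime dtype; if it holds strings, convert it with pd.to_datetime first, otherwise the .dt accessor is not available.

```python
import pandas as pd

# Hypothetical sample data, not the asker's real DataFrame.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6],
    "Timestamp": pd.to_datetime([
        "2013-09-10 16:40:40.467",
        "2013-09-10 16:40:41.267",
        "2013-09-10 16:40:42.053",
        "2013-09-10 16:40:42.857",
        "2013-09-11 09:00:00.000",
        "2013-09-11 09:00:01.000",
    ]),
    "Data": [86.0, 86.0, 59.0, 52.0, 161.5, 61.5],
})

# Count rows per calendar date without adding a helper column.
x = df.groupby(df["Timestamp"].dt.date).size().to_frame("values")

# Filter to 20 <= Data <= 100 (between is inclusive on both ends by default).
# The explicit .copy() makes df1 an independent DataFrame.
df1 = df[df["Data"].between(20, 100)].copy()
x1 = df1.groupby(df1["Timestamp"].dt.date).size().to_frame("values")

print(x)
print(x1)
```

Grouping by the derived Series directly means neither df nor df1 needs a new Date column, so nothing is assigned to the filtered frame at all.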