Split a Pandas DataFrame with a DatetimeIndex into chunks at NaN or missing rows



I have a Pandas DataFrame with a DatetimeIndex and want to split it into chunks of consecutive rows, dropping the NaN rows that separate them.

                           Temperature  Humidity
2020-01-01 00:00:00+00:00  20           40
2020-01-01 00:01:00+00:00  21           40
2020-01-01 00:02:00+00:00  NaN          NaN
2020-01-01 00:03:00+00:00  22           41
2020-01-01 00:04:00+00:00  NaN          NaN
2020-01-01 00:05:00+00:00  NaN          NaN
2020-01-01 00:06:00+00:00  NaN          NaN
2020-01-01 00:07:00+00:00  21           41
2020-01-01 00:08:00+00:00  21           41
2020-01-01 00:09:00+00:00  21           42
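To make the example reproducible, the frame above can be rebuilt roughly as follows. This is only a sketch: the question does not show how the data was created, so the one-minute frequency and the UTC time zone are read off the printed timestamps, and the name `df` is chosen to match the answers.

```python
import numpy as np
import pandas as pd

# One-minute, UTC-aware index matching the timestamps printed in the question.
index = pd.date_range("2020-01-01 00:00", periods=10, freq="min", tz="UTC")

nan = np.nan
df = pd.DataFrame(
    {
        "Temperature": [20, 21, nan, 22, nan, nan, nan, 21, 21, 21],
        "Humidity": [40, 40, nan, 41, nan, nan, nan, 41, 41, 42],
    },
    index=index,
)
print(df)
```

Because the columns contain NaN, pandas stores them as floats, which is why the answers print 20.0 rather than 20.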

The result should be a list of the following three DataFrames:

                           Temperature  Humidity
2020-01-01 00:00:00+00:00  20           40
2020-01-01 00:01:00+00:00  21           40

                           Temperature  Humidity
2020-01-01 00:03:00+00:00  22           41

                           Temperature  Humidity
2020-01-01 00:07:00+00:00  21           41
2020-01-01 00:08:00+00:00  21           41
2020-01-01 00:09:00+00:00  21           42

Any help would be appreciated.

Let's try building the groupby key from isnull and cumsum:

d = {x: y for x, y in df.dropna().groupby(df.isnull().cumsum().sum(1))}
d[0]
                           Temperature  Humidity
2020-01-01 00:00:00+00:00         20.0      40.0
2020-01-01 00:01:00+00:00         21.0      40.0
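One thing worth noting about this approach: the dictionary keys are running counts of NaN cells, not positions 0, 1, 2. For the sample data they come out as 0, 2 and 8, so `d[1]` would raise a KeyError. The sketch below rebuilds the sample frame (an assumption, since the question does not show its construction) and turns the groups into the list the question asks for. The names `key` and `chunks` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (construction is assumed).
index = pd.date_range("2020-01-01 00:00", periods=10, freq="min", tz="UTC")
nan = np.nan
df = pd.DataFrame(
    {
        "Temperature": [20, 21, nan, 22, nan, nan, nan, 21, 21, 21],
        "Humidity": [40, 40, nan, 41, nan, nan, nan, 41, 41, 42],
    },
    index=index,
)

# Running count of NaN cells seen so far, summed over all columns.
# The count only changes on rows that contain NaN, so it stays constant
# across each run of complete rows.
key = df.isnull().cumsum().sum(axis=1)

# groupby aligns `key` to the rows that survive dropna() by index label.
d = {k: chunk for k, chunk in df.dropna().groupby(key)}

# The keys are NaN counts, so take the values to get a plain list.
chunks = list(d.values())
for chunk in chunks:
    print(chunk)
```

Since groupby sorts its keys by default and the count never decreases, the list comes back in chronological order.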

Let's try using cumsum to identify the blocks:

na = df.Temperature.isna().cumsum()
for i, d in df.loc[na.eq(0) | na.duplicated()].groupby(na):
    print(d)

Output:

                           Temperature  Humidity
2020-01-01 00:00:00+00:00         20.0      40.0
2020-01-01 00:01:00+00:00         21.0      40.0
                           Temperature  Humidity
2020-01-01 00:03:00+00:00         22.0      41.0
                           Temperature  Humidity
2020-01-01 00:07:00+00:00         21.0      41.0
2020-01-01 00:08:00+00:00         21.0      41.0
2020-01-01 00:09:00+00:00         21.0      42.0
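The loop above looks only at the Temperature column and prints the blocks instead of collecting them. A possible variation, not the answerer's code, is to treat a row as a separator when any column is NaN and to return the blocks as a list. The helper name `split_on_nan` is hypothetical, and the sample frame is rebuilt here under the same assumptions as in the question.

```python
import numpy as np
import pandas as pd


def split_on_nan(frame: pd.DataFrame) -> list:
    """Return the runs of consecutive complete rows as a list of DataFrames."""
    # True for every row that contains at least one NaN; these rows separate blocks.
    gap = frame.isna().any(axis=1)
    # The running count of separator rows is constant within each run of valid rows.
    block_id = gap.cumsum()
    # Drop the separator rows, then group what is left by block id.
    valid = frame[~gap]
    return [block for _, block in valid.groupby(block_id[~gap])]


# Sample data rebuilt from the question (construction is assumed).
index = pd.date_range("2020-01-01 00:00", periods=10, freq="min", tz="UTC")
nan = np.nan
df = pd.DataFrame(
    {
        "Temperature": [20, 21, nan, 22, nan, nan, nan, 21, 21, 21],
        "Humidity": [40, 40, nan, 41, nan, nan, nan, 41, 41, 42],
    },
    index=index,
)

chunks = split_on_nan(df)
for chunk in chunks:
    print(chunk)
```

For the sample data this yields the same three blocks as the loop above. The two approaches would differ only if a row had a value in Temperature but NaN in another column.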
