Finding the data between two consecutive NaNs

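I have time series data on which I ran some calculations. I'm stuck trying to find a way to get the date values between the NaN points in the series.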

For example, the series looks like this:

            start_date  counts
3  2021-10-14 20:12:13       0
4  2021-10-14 20:21:10       1
5  2021-10-14 20:22:15       2
6  2021-10-14 20:23:14       3
7  2021-10-14 20:23:51       4
8  2021-10-14 20:39:11       0
9  2021-10-14 20:41:21       1
10 2021-10-14 20:41:45       2
11 2021-10-14 20:42:10       3
12 2021-10-14 20:46:10       4
13 2021-10-14 20:52:53       5
14 2021-10-14 20:53:10       6
15 2021-10-14 20:56:10       7
16 2021-10-14 20:57:46       8
17 2021-10-14 20:59:25       9
18 2021-10-14 21:00:12      10
19 2021-10-14 21:02:24      11
20 2021-10-14 21:06:13      12
21 2021-10-14 21:09:12      13
22 2021-10-14 21:11:35      14
23 2021-10-14 21:16:30      15
24 2021-10-14 21:19:12      16
25 2021-10-14 21:32:14       0
29 2021-10-14 23:52:07       0
30 2021-10-14 23:57:41       1
31 2021-10-15 00:06:14       2
32 2021-10-15 00:23:25       0
33 2021-10-15 00:32:09       1
34 2021-10-15 00:54:11       0
35 2021-10-15 01:03:13       1

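I want to get the date of the first element (always = 1) next to the date of the last element (16 in this example, but it can be any number greater than 1).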

So the expected output should be:

2021-10-14 20:41:21 : 2021-10-14 21:19:12 
.
.
etc.

IIUC:

import pandas as pd

# Extract a subset of your dataframe with a clean index
df1 = df.reset_index()[['start_date', 'counts']]
# Detect 2 consecutive 0 (or NaN?) and get the previous row
idx2 = df1.loc[df1['counts'].eq(0)
               & df1['counts'].shift(-1).eq(0), 'counts'].index - 1
# Subtract that row's counts from idx2 to reach the row where counts == 1
idx1 = idx2 - df1.loc[idx2, 'counts'].values + 1
# Put the first and last dates side by side
out = pd.concat([df1.loc[idx1, 'start_date'].reset_index(drop=True),
                 df1.loc[idx2, 'start_date'].reset_index(drop=True)], axis=1)

Output:

>>> out
           start_date          start_date
0 2021-10-14 20:41:21 2021-10-14 21:19:12
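
To try the same idea without the original data, here is a minimal self-contained sketch. The sample frame, the function name `run_before_double_zero`, and the output column names `first_date` / `last_date` are all made up for illustration. It also assumes that `counts` increases by 1 on every row within a run and that the gap is marked by two consecutive 0 rows, as in the data above.

```python
import pandas as pd

# Hypothetical sample (not the asker's data): runs 0..2 and 0..3,
# followed by two consecutive zeros that mark the gap.
df = pd.DataFrame(
    {
        "start_date": pd.date_range("2021-10-14 20:00", periods=10, freq="min"),
        "counts": [0, 1, 2, 0, 1, 2, 3, 0, 0, 1],
    },
    index=[3, 4, 5, 8, 9, 10, 11, 25, 29, 30],
)


def run_before_double_zero(df):
    """Pair the date where counts == 1 with the date of the last row
    before each pair of consecutive zeros (illustrative helper)."""
    # Work on positional labels 0..n-1 so that "- 1" means "previous row"
    df1 = df.reset_index(drop=True)[["start_date", "counts"]]
    is_zero = df1["counts"].eq(0)
    # Rows that are 0 and are followed by another 0
    double_zero = (is_zero & is_zero.shift(-1, fill_value=False)).to_numpy()
    last_pos = df1.index[double_zero] - 1
    last_pos = last_pos[last_pos >= 0]
    # counts on the last row of the run tells how far back counts == 1 is
    last_counts = df1.loc[last_pos, "counts"].to_numpy()
    keep = last_counts > 0  # skip cases where the previous row is itself a 0
    last_pos = last_pos[keep]
    first_pos = last_pos - last_counts[keep].astype(int) + 1
    return pd.DataFrame(
        {
            "first_date": df1.loc[first_pos, "start_date"].to_numpy(),
            "last_date": df1.loc[last_pos, "start_date"].to_numpy(),
        }
    )


out = run_before_double_zero(df)
print(out)
```

On this sample the result has a single row pairing 20:04 (where `counts` is 1) with 20:06 (where `counts` is 3). Using distinct column names avoids the duplicated `start_date` header seen in the output above.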
