Dataframe skips 5 rows when the index goes from x999 to x000



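When generating a large table for 15 timeframes (usually more than 10000 rows), I found that the data is shifted because 5 rows are missing (5m skipped) at 999-1000, 1999-2000, 2999-3000, and so on.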

This also happens on the 1m timeframe (I guess it may happen on 1h too, but there are not enough candles going back in time to test it).

Here is part of the result I get (1s TF):

.
.
.
995  2020-06-05 21:46:35+07:00  9705.19  9706.02  9705.19  9706.02
996  2020-06-05 21:46:36+07:00  9706.02  9706.02  9706.02  9706.02
997  2020-06-05 21:46:37+07:00  9705.77  9706.02  9705.77  9706.02
998  2020-06-05 21:46:38+07:00  9706.02  9706.72  9706.02  9706.72
999  2020-06-05 21:46:39+07:00  9706.72  9706.72  9706.72  9706.72 **21:46:39** 
1000 2020-06-05 21:51:39+07:00  9698.76  9698.76  9698.76  9698.76 **21:51:39**(5m skipped)
1001 2020-06-05 21:51:40+07:00  9698.76  9698.76  9698.76  9698.76
1002 2020-06-05 21:51:41+07:00  9698.76  9698.76  9698.76  9698.76
1003 2020-06-05 21:51:42+07:00  9698.76  9698.76  9698.76  9698.76
1004 2020-06-05 21:51:43+07:00  9698.87  9698.88  9698.87  9698.88
1005 2020-06-05 21:51:44+07:00  9698.88  9698.88  9698.88  9698.88
.
.
.
1995 2020-06-05 22:08:14+07:00  9684.71  9684.71  9684.71  9684.71
1996 2020-06-05 22:08:15+07:00  9684.71  9684.71  9684.71  9684.71
1997 2020-06-05 22:08:16+07:00  9684.71  9684.71  9684.71  9684.71
1998 2020-06-05 22:08:17+07:00  9684.71  9684.71  9684.71  9684.71
1999 2020-06-05 22:08:18+07:00  9684.71  9684.71  9684.71  9684.71 **22:08:18**
2000 2020-06-05 22:13:18+07:00  9677.95  9677.95  9677.95  9677.95 **22:13:18**(5m skipped)
2001 2020-06-05 22:13:19+07:00  9677.95  9677.95  9677.95  9677.95
2002 2020-06-05 22:13:20+07:00  9677.66  9679.82  9677.66  9679.82
2003 2020-06-05 22:13:21+07:00  9679.82  9679.82  9679.82  9679.82
2004 2020-06-05 22:13:22+07:00  9679.82  9679.82  9679.82  9679.82
2005 2020-06-05 22:13:23+07:00  9679.82  9679.82  9679.82  9679.82
.
.
.

And for the 1m TF:

.
.
.
995  2020-06-06 14:05:00+07:00  9612.17  9617.92  9612.00  9617.41
996  2020-06-06 14:06:00+07:00  9617.75  9621.15  9615.25  9618.87
997  2020-06-06 14:07:00+07:00  9618.95  9618.96  9618.32  9618.50
998  2020-06-06 14:08:00+07:00  9618.36  9619.00  9617.04  9618.60
999  2020-06-06 14:09:00+07:00  9618.61  9624.30  9618.61  9624.30 **14:09:00**
1000 2020-06-06 14:14:00+07:00  9620.23  9620.48  9619.27  9620.05 **14:14:00**(5m skipped)
1001 2020-06-06 14:15:00+07:00  9619.72  9623.24  9615.46  9615.46
1002 2020-06-06 14:16:00+07:00  9615.41  9615.69  9613.98  9613.98
1003 2020-06-06 14:17:00+07:00  9613.50  9613.63  9609.43  9610.10
1004 2020-06-06 14:18:00+07:00  9610.10  9616.13  9610.10  9615.65
1005 2020-06-06 14:19:00+07:00  9615.91  9615.91  9612.09  9613.11
.
.
.

Has anyone run into this before? Is it because I did something wrong in my script?

def dataframe_details_func(df_ohlcv, TIMEFRAME, LIMIT):
    # Keep fetching until LIMIT candles have been collected
    while len(df_ohlcv) < LIMIT:
        from_ts = df_ohlcv[-1][0] + 300000
        new_ohlcv = exchange.fetch_ohlcv(PAIR, timeframe=TIMEFRAME, since=from_ts, limit=LIMIT)
        df_ohlcv.extend(new_ohlcv)
    df_ohlcv = pd.DataFrame(df_ohlcv, columns=['datetime', 'open', 'high', 'low', 'close', 'volume'])
    # Timestamps are in milliseconds (UTC); convert them to Bangkok time
    df_ohlcv['datetime'] = pd.to_datetime(df_ohlcv['datetime'], unit='ms')
    df_ohlcv.datetime = df_ohlcv.datetime.dt.tz_localize('UTC').dt.tz_convert('Asia/Bangkok')
    return df_ohlcv

df_ohlcv1S = dataframe_details_func(df_ohlcv1, TIMEFRAME1S, LIMIT1S)
pd.set_option('display.max_rows', None, 'display.max_columns', None)
print(df_ohlcv1S.loc[900:1200, ['datetime', 'open', 'high', 'low', 'close']])

The problem is

from_ts = df_ohlcv[-1][0] + 300000

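This line literally says "start this chunk 5 minutes after the end of the previous chunk": the timestamps are in milliseconds, so 300000 ms is 5 minutes. You don't want an increment of 300000. Try 1000 instead, so the next chunk starts at the next second. For the 1m timeframe the equivalent step would be 60000.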
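If you want the same loop to work for every timeframe, one option is to derive the step from the timeframe string instead of hard-coding it. Below is a rough sketch of that idea, not a drop-in replacement. The names `timeframe_to_ms`, `extend_ohlcv` and `fake_fetch` are made up for illustration, and `fetch_page` is assumed to be a small wrapper around your `exchange.fetch_ohlcv(PAIR, timeframe=TIMEFRAME, since=..., limit=LIMIT)` call. The fake fetcher only exists so the snippet runs without a network connection.

```python
# Milliseconds per unit for timeframe strings such as '1s', '1m', '5m', '1h'.
UNIT_MS = {'s': 1_000, 'm': 60_000, 'h': 3_600_000, 'd': 86_400_000}


def timeframe_to_ms(timeframe):
    """Hypothetical helper: length of one candle in milliseconds."""
    return int(timeframe[:-1]) * UNIT_MS[timeframe[-1]]


def extend_ohlcv(ohlcv, fetch_page, timeframe, limit):
    """Append pages of candles to `ohlcv` until it holds at least `limit` rows.

    `fetch_page(since)` is an assumed callable that returns a list of
    [timestamp_ms, open, high, low, close, volume] rows starting at `since`.
    """
    step = timeframe_to_ms(timeframe)
    while len(ohlcv) < limit:
        # Start exactly one candle after the last one already collected.
        since = ohlcv[-1][0] + step
        page = fetch_page(since)
        if not page:  # nothing more to fetch, avoid looping forever
            break
        ohlcv.extend(page)
    return ohlcv


def fake_fetch(since, page_size=1000, step=1_000):
    """Stand-in for the exchange: returns `page_size` 1s candles from `since`."""
    return [[since + i * step, 0, 0, 0, 0, 0] for i in range(page_size)]


rows = extend_ohlcv(fake_fetch(0), fake_fetch, '1s', 3000)
# Distinct gaps between consecutive timestamps; a skip at 999 -> 1000
# would show up here as an extra, larger value.
gaps = {b[0] - a[0] for a, b in zip(rows, rows[1:])}
```

Checking the set of gaps on your real data is also a quick way to confirm the fix: with a correct step there should be no 5-minute jump at the page boundaries (gaps caused by the exchange itself having no candles may still appear).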
