NaN when converting a DataFrame to a Series



I have a DataFrame with OHLC data. I need to get the closing prices into a pandas Series, using the timestamp column as the index.

I'm reading from a sqlite database into my df:

import sqlite3 as sql
import pandas as pd

conn = sql.connect('allStockData.db')
price = pd.read_sql_query("SELECT * from ohlc_minutes", conn)
# Parse the timestamp column into datetimes
price['timestamp'] = pd.to_datetime(price['timestamp'])
print(price)

This returns:

                      timestamp  open   high   low  close  volume  trade_count      vwap symbol  volume_10_day
0     2022-09-16 08:00:00+00:00  3.19  3.570  3.19  3.350   66475          458  3.404240   AAOI            NaN
1     2022-09-16 08:05:00+00:00  3.35  3.440  3.33  3.430   28925          298  3.381131   AAOI            NaN
2     2022-09-16 08:10:00+00:00  3.44  3.520  3.35  3.400   62901          643  3.445096   AAOI            NaN
3     2022-09-16 08:15:00+00:00  3.37  3.390  3.31  3.360   17943          184  3.339721   AAOI            NaN
4     2022-09-16 08:20:00+00:00  3.36  3.410  3.34  3.400   29123          204  3.383370   AAOI            NaN
...                         ...   ...    ...   ...    ...     ...          ...       ...    ...            ...
8759  2022-09-08 23:35:00+00:00  1.35  1.360  1.35  1.355    3835           10  1.350613   RUBY       515994.5
8760  2022-09-08 23:40:00+00:00  1.36  1.360  1.35  1.350    2780            7  1.353687   RUBY       515994.5
8761  2022-09-08 23:45:00+00:00  1.35  1.355  1.35  1.355    7080           11  1.350424   RUBY       515994.5
8762  2022-09-08 23:50:00+00:00  1.35  1.360  1.33  1.360   11664           30  1.351104   RUBY       515994.5
8763  2022-09-08 23:55:00+00:00  1.36  1.360  1.33  1.340   21394           32  1.348223   RUBY       515994.5
[8764 rows x 10 columns]

When I try to put close into a Series with timestamp as the index:

price = pd.Series(price['close'], index=price['timestamp'])

it returns a bunch of NaNs:

2022-09-16 08:00:00+00:00   NaN
2022-09-16 08:05:00+00:00   NaN
2022-09-16 08:10:00+00:00   NaN
2022-09-16 08:15:00+00:00   NaN
2022-09-16 08:20:00+00:00   NaN
..
2022-09-08 23:35:00+00:00   NaN
2022-09-08 23:40:00+00:00   NaN
2022-09-08 23:45:00+00:00   NaN
2022-09-08 23:50:00+00:00   NaN
2022-09-08 23:55:00+00:00   NaN
Name: close, Length: 8764, dtype: float64

If I drop the index argument:

price = pd.Series(price['close'])

close comes back as expected:

0       3.350
1       3.430
2       3.400
3       3.360
4       3.400
...  
8759    1.355
8760    1.350
8761    1.355
8762    1.360
8763    1.340
Name: close, Length: 8764, dtype: float64

How can I return the close column as a pandas Series, using the timestamp column as the index?

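This happens because price['close'] carries its own integer index (0 to 8763). When you pass a Series together with index=, pandas aligns the data to the new index by label rather than by position. None of the integer labels match a timestamp, so every value comes out as NaN. Pass the raw values with .values instead: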

price = pd.Series(price['close'].values, index=price['timestamp'])
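
To see the difference on something small, here is a minimal sketch. The three-row DataFrame is a made-up stand-in for the ohlc_minutes table (the name df and the values are just for illustration), and .to_numpy() is used as the equivalent of .values:

```python
import pandas as pd

# Hypothetical three-row stand-in for the real table
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2022-09-16 08:00:00+00:00",
        "2022-09-16 08:05:00+00:00",
        "2022-09-16 08:10:00+00:00",
    ]),
    "close": [3.35, 3.43, 3.40],
})

# Series data + index=: pandas looks up each timestamp among the
# existing labels 0, 1, 2, finds nothing, and fills with NaN.
aligned = pd.Series(df["close"], index=df["timestamp"])

# Plain array data + index=: no labels to align, so the values are
# attached to the timestamps by position.
positional = pd.Series(df["close"].to_numpy(), index=df["timestamp"], name="close")

print(aligned)
print(positional)
```

The first print should show only NaN and the second the actual closing prices.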

I had to set timestamp as the index first, and only then take close as a Series:

conn = sql.connect('allStockData.db') 
price = pd.read_sql_query("SELECT * from ohlc_minutes", conn)
price['timestamp'] = pd.to_datetime(price['timestamp'])
price = price.set_index('timestamp')
print(price)
price = pd.Series(price['close'])
print(price)

This gives:

2022-09-16 08:00:00+00:00    3.350
2022-09-16 08:05:00+00:00    3.430
2022-09-16 08:10:00+00:00    3.400
2022-09-16 08:15:00+00:00    3.360
2022-09-16 08:20:00+00:00    3.400
...  
2022-09-08 23:35:00+00:00    1.355
2022-09-08 23:40:00+00:00    1.350
2022-09-08 23:45:00+00:00    1.355
2022-09-08 23:50:00+00:00    1.360
2022-09-08 23:55:00+00:00    1.340
Name: close, Length: 8764, dtype: float64
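
Selecting a single column from a DataFrame already returns a Series, so the pd.Series(...) wrapper is optional and the same thing can probably be written in one line. A minimal sketch, using a made-up three-row frame in place of the real table (df and its values are only illustrative):

```python
import pandas as pd

# Hypothetical three-row stand-in for the real table
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2022-09-16 08:00:00+00:00",
        "2022-09-16 08:05:00+00:00",
        "2022-09-16 08:10:00+00:00",
    ]),
    "close": [3.35, 3.43, 3.40],
})

# set_index moves timestamp into the index; ["close"] then picks the
# column, which comes back as a Series that keeps that index.
close = df.set_index("timestamp")["close"]

print(close)
```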
