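I want to create 100 random minute-level timestamps between two fixed dates (2018-06-01 to 2018-06-11) and put them into a DataFrame. The final DataFrame should look something like this (just an illustration):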
+-----+---------------------+
| | timestamp |
+-----+---------------------+
| 0 | 2018-06-01 04:26:00 |
+-----+---------------------+
| 1 | 2018-06-01 05:55:00 |
+-----+---------------------+
| 2 | 2018-06-01 06:11:00 |
+-----+---------------------+
| 3 | 2018-06-01 07:56:00 |
+-----+---------------------+
| | ... |
+-----+---------------------+
| | ... |
+-----+---------------------+
| | ... |
+-----+---------------------+
| 97 | 2018-06-11 19:28:00 |
+-----+---------------------+
| 98 | 2018-06-11 20:47:00 |
+-----+---------------------+
| 99 | 2018-06-11 21:47:00 |
+-----+---------------------+
| 100 | 2018-06-11 22:54:00 |
+-----+---------------------+
| 101 | 2018-06-11 23:20:00 |
+-----+---------------------+
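I know how to generate every minute between the two dates with pd.date_range(start="2018-06-01 00:00:00",end="2018-06-12 00:00:00", freq="T"), but I'm not sure how to randomly pick exactly 100 of those timestamps (and then put them into the DataFrame).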
You can use numpy.random.choice:
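import numpy as np
import pandas as pd

# Every minute in the range; "min" replaces the "T" alias, which is deprecated since pandas 2.2
d = pd.date_range(start="2018-06-01 00:00:00", end="2018-06-12 00:00:00", freq="min")
# Pick 100 timestamps at random (with replacement by default, so duplicates are possible)
df = pd.DataFrame(np.random.choice(d, 100), columns=['timestamp'])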
Output:

              timestamp
0   2018-06-08 20:20:00
1   2018-06-11 22:15:00
2   2018-06-10 19:25:00
3   2018-06-05 04:49:00
4   2018-06-06 08:06:00
...                 ...
95  2018-06-06 06:51:00
96  2018-06-03 00:50:00
97  2018-06-03 10:06:00
98  2018-06-01 08:12:00
99  2018-06-05 23:39:00
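
The illustration in the question shows the timestamps in chronological order, while the output above is unsorted and may contain repeats. If distinct, sorted timestamps are what is wanted, a variation along the following lines might work. This is only a sketch under some assumptions: the names `rng`, `minutes`, `picked`, and `df_unique` are illustrative, the seed is arbitrary, and the range is assumed to end at 2018-06-11 23:59 so that no value falls on 2018-06-12.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(42)

# Every minute from 2018-06-01 00:00 through 2018-06-11 23:59 (assumed range).
minutes = pd.date_range(start="2018-06-01 00:00:00",
                        end="2018-06-11 23:59:00", freq="min")

# replace=False samples without replacement, so all 100 timestamps are distinct.
picked = rng.choice(minutes.to_numpy(), size=100, replace=False)

# Sort chronologically; building a new DataFrame gives a clean 0..99 index.
df_unique = pd.DataFrame({"timestamp": np.sort(picked)})
print(df_unique.head())
```

Dropping the seed (calling `np.random.default_rng()` with no argument) gives a different sample on each run.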