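I want to generate a DataFrame in which the "date" column, which holds timestamps, is randomly generated, and I want the values to follow a Gaussian (normal) distribution. I know about the function random.gauss(), and I have this code: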
from faker import Faker
import pandas as pd
import numpy as np
from datetime import timedelta
fake = Faker()  # create the generator instance used below
fake_parking = [
{'Licence Plate':fake.license_plate(),
'Start_date':fake.date_time_between_dates(datetime_start='-2y', datetime_end='-1d'),
'Duration':fake.time_delta(end_datetime='+30d')
} for x in range(10000)]
df = pd.DataFrame(fake_parking)
Here I generate random dates, and I would like these dates to be drawn from a Gaussian distribution instead.
Given that the DataFrame to generate has three columns, LicensePlate, Start Date, and Duration, one can do the following:
import pandas as pd
import random
import datetime as dt
import faker
fake = faker.Faker()
df = pd.DataFrame({
'LicensePlate': [fake.license_plate() for i in range(100)],
'Start Date': [dt.datetime.now() + dt.timedelta(seconds=random.gauss(0, 1000)) for i in range(100)],
'Duration': [dt.timedelta(seconds=random.gauss(0, 1000)) for i in range(100)]
})
[Out]:
LicensePlate Start Date Duration
0 XV 5129 2022-10-18 12:59:29.287650 0 days 00:24:58.640538
1 91-60124 2022-10-18 13:21:41.058608 -1 days +23:43:29.201520
2 733TBH 2022-10-18 13:26:30.057752 -1 days +23:43:59.308018
3 955 YJB 2022-10-18 13:48:31.069223 0 days 00:08:14.982752
4 0-82573 2022-10-18 13:00:43.735401 0 days 00:02:33.887666
.. ... ... ...
95 MHS 812 2022-10-18 13:29:13.169237 0 days 00:12:18.462455
96 D66-19E 2022-10-18 13:22:49.714652 -1 days +23:42:44.846897
97 SGW 257 2022-10-18 13:12:32.425996 -1 days +23:47:04.114940
98 K16-80P 2022-10-18 13:42:09.283379 -1 days +23:39:17.864417
99 28-83111 2022-10-18 13:03:26.028862 0 days 00:03:46.996096
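Notes:
faker is used to generate the fake license plates. To make the start dates and durations follow a normal (Gaussian) distribution, random.gauss is used: each value is an offset in seconds with mean 0 and standard deviation 1000. The mean and standard deviation can be adjusted as needed. With a mean of 0, roughly half of the durations come out negative, as the output above shows, so a positive mean is the usual choice for Duration.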
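As an illustration of adjusting the mean and standard deviation, here is a minimal sketch that centers the start dates on a chosen date and keeps the durations non-negative. The function name gaussian_parking_rows and its parameters (center, spread_days, mean_minutes, sd_minutes, seed) are hypothetical and chosen only for this example. Clipping at zero is an assumption of the sketch, and it means the durations are no longer strictly Gaussian.

```python
import datetime as dt
import random


def gaussian_parking_rows(n, center, spread_days=90.0,
                          mean_minutes=120.0, sd_minutes=45.0, seed=None):
    """Return n dicts with Gaussian start dates and clipped Gaussian durations."""
    rng = random.Random(seed)  # private generator so a seed gives repeatable rows
    rows = []
    for _ in range(n):
        # Start date: `center` shifted by a normally distributed number of days.
        start = center + dt.timedelta(days=rng.gauss(0.0, spread_days))
        # Duration: normal around mean_minutes, clipped at 0 so it is never negative.
        minutes = max(0.0, rng.gauss(mean_minutes, sd_minutes))
        rows.append({"Start Date": start,
                     "Duration": dt.timedelta(minutes=minutes)})
    return rows


rows = gaussian_parking_rows(1000, center=dt.datetime(2022, 1, 1), seed=42)
```

The resulting list of dicts can be passed to pd.DataFrame(rows), and a LicensePlate column can be added from fake.license_plate() as in the answer above. A larger spread_days widens the range of dates; about 99.7% of the start dates fall within three standard deviations of center.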