Trying to run the SDV (Synthetic Data Vault) demo and getting an error: TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int3



Let me start by saying that I am not a Python developer. However, I need synthetic data, so I am trying out the Synthetic Data Vault (https://github.com/sdv-dev/SDV).

I installed Python 3.7 (on Windows; for now I am doing this on my laptop while I learn how it works).

python --version
Python 3.7.6

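I was able to download the sdv package with pip, and I can run the first few lines of the demo code, which load and display the metadata and the demo tables. However, when I reach these lines in the demo: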

sdv = SDV()
sdv.fit(metadata, tables)

I get the following error:

TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]

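I have not modified any of the code from git, and I have not tried any code of my own. I am literally just trying to get the demo to work as described in the README. I just installed the package and am working through the first example. Has anyone tried this and run into the same problem? Any ideas on what I can do to get past this error?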

The full stack trace is:

>>> sdv.fit(metadata, tables)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\tools\Python3.7\lib\site-packages\sdv\sdv.py", line 69, in fit
    self.modeler.model_database(tables)
  File "C:\tools\Python3.7\lib\site-packages\sdv\modeler.py", line 128, in model_database
    self.cpa(table_name, tables)
  File "C:\tools\Python3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
    child_table = self.cpa(child_name, tables, child_key)
  File "C:\tools\Python3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
    child_table = self.cpa(child_name, tables, child_key)
  File "C:\tools\Python3.7\lib\site-packages\sdv\modeler.py", line 92, in cpa
    extended = self.metadata.transform(table_name, table)
  File "C:\tools\Python3.7\lib\site-packages\sdv\metadata.py", line 477, in transform
    hyper_transformer.fit(data[fields])
  File "C:\tools\Python3.7\lib\site-packages\rdt\hyper_transformer.py", line 128, in fit
    transformer.fit(column)
  File "C:\tools\Python3.7\lib\site-packages\rdt\transformers\datetime.py", line 55, in fit
    transformed = self._transform(data)
  File "C:\tools\Python3.7\lib\site-packages\rdt\transformers\datetime.py", line 40, in _transform
    integers = datetimes.astype(int).astype(float).values
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\generic.py", line 5691, in astype
    **kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\managers.py", line 531, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\managers.py", line 395, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\blocks.py", line 534, in astype
    **kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\blocks.py", line 2139, in _astype
    return super(DatetimeBlock, self)._astype(dtype=dtype, **kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\blocks.py", line 633, in _astype
    values = astype_nansafe(values.ravel(), dtype, copy=True)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\dtypes\cast.py", line 646, in astype_nansafe
    to_dtype=dtype))
TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]

Here is the full output of my session:


>>> from sdv import load_demo
>>> metadata, tables = load_demo(metadata=True)
>>> metadata.to_dict()
{
"tables": {
"users": {
"primary_key": "user_id",
"fields": {
"user_id": {
"type": "id",
"subtype": "integer"
},
"country": {
"type": "categorical"
},
"gender": {
"type": "categorical"
},
"age": {
"type": "numerical",
"subtype": "integer"
}
}
},
"sessions": {
"primary_key": "session_id",
"fields": {
"session_id": {
"type": "id",
"subtype": "integer"
},
"user_id": {
"ref": {
"field": "user_id",
"table": "users"
},
"type": "id",
"subtype": "integer"
},
"device": {
"type": "categorical"
},
"os": {
"type": "categorical"
}
}
},
"transactions": {
"primary_key": "transaction_id",
"fields": {
"transaction_id": {
"type": "id",
"subtype": "integer"
},
"session_id": {
"ref": {
"field": "session_id",
"table": "sessions"
},
"type": "id",
"subtype": "integer"
},
"timestamp": {
"type": "datetime",
"format": "%Y-%m-%d"
},
"amount": {
"type": "numerical",
"subtype": "float"
},
"approved": {
"type": "boolean"
}
}
}
}
}

>>> tables

{'users':    user_id country gender  age
0        0     USA      M   34
1        1      UK      F   23
2        2      ES   None   44
3        3      UK      M   22
4        4     USA      F   54
5        5      DE      M   57
6        6      BG      F   45
7        7      ES   None   41
8        8      FR      F   23
9        9      UK   None   30, 'sessions':    session_id  user_id  device       os
0           0        0  mobile  android
1           1        1  tablet      ios
2           2        1  tablet  android
3           3        2  mobile  android
4           4        4  mobile      ios
5           5        5  mobile  android
6           6        6  mobile      ios
7           7        6  tablet      ios
8           8        6  mobile      ios
9           9        8  tablet      ios, 'transactions':    transaction_id  session_id           timestamp  amount  approved
0               0           0 2019-01-01 12:34:32   100.0      True
1               1           0 2019-01-01 12:42:21    55.3      True
2               2           1 2019-01-07 17:23:11    79.5      True
3               3           3 2019-01-10 11:08:57   112.1     False
4               4           5 2019-01-10 21:54:08   110.0     False
5               5           5 2019-01-11 11:21:20    76.3      True
6               6           7 2019-01-22 14:44:10    89.5      True
7               7           8 2019-01-23 10:14:09   132.1     False
8               8           9 2019-01-27 16:09:17    68.0      True
9               9           9 2019-01-29 12:10:48    99.9      True}

>>> metadata.visualize()
<graphviz.dot.Digraph object at 0x00000196E8755488>
>>> from sdv import SDV
>>> sdv = SDV()
>>> sdv.fit(metadata, tables)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\tools\Python3.7\lib\site-packages\sdv\sdv.py", line 69, in fit
    self.modeler.model_database(tables)
  File "C:\tools\Python3.7\lib\site-packages\sdv\modeler.py", line 128, in model_database
    self.cpa(table_name, tables)
  File "C:\tools\Python3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
    child_table = self.cpa(child_name, tables, child_key)
  File "C:\tools\Python3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
    child_table = self.cpa(child_name, tables, child_key)
  File "C:\tools\Python3.7\lib\site-packages\sdv\modeler.py", line 92, in cpa
    extended = self.metadata.transform(table_name, table)
  File "C:\tools\Python3.7\lib\site-packages\sdv\metadata.py", line 477, in transform
    hyper_transformer.fit(data[fields])
  File "C:\tools\Python3.7\lib\site-packages\rdt\hyper_transformer.py", line 128, in fit
    transformer.fit(column)
  File "C:\tools\Python3.7\lib\site-packages\rdt\transformers\datetime.py", line 55, in fit
    transformed = self._transform(data)
  File "C:\tools\Python3.7\lib\site-packages\rdt\transformers\datetime.py", line 40, in _transform
    integers = datetimes.astype(int).astype(float).values
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\generic.py", line 5691, in astype
    **kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\managers.py", line 531, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\managers.py", line 395, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\blocks.py", line 534, in astype
    **kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\blocks.py", line 2139, in _astype
    return super(DatetimeBlock, self)._astype(dtype=dtype, **kwargs)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\internals\blocks.py", line 633, in _astype
    values = astype_nansafe(values.ravel(), dtype, copy=True)
  File "C:\tools\Python3.7\lib\site-packages\pandas\core\dtypes\cast.py", line 646, in astype_nansafe
    to_dtype=dtype))
TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]
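
The failing statement in the traceback is `datetimes.astype(int)`. One plausible explanation, assuming a NumPy release older than 2.0 on Windows, is that `int` resolves to a 32-bit integer there, while `datetime64[ns]` values are stored as 64-bit integers. The short check below is only a diagnostic sketch and is not part of the SDV demo; it prints what `int` maps to on the current machine.

```python
import numpy as np

# Width in bits of the integer type NumPy picks for Python's built-in `int`.
# On Windows with NumPy older than 2.0 this is expected to be 32.
default_int_bits = np.dtype(int).itemsize * 8
print(f"NumPy {np.__version__}: astype(int) means int{default_int_bits}")

# datetime64[ns] stores nanoseconds since the epoch in a 64-bit integer.
timestamp_bits = np.dtype("datetime64[ns]").itemsize * 8
print(f"datetime64[ns] uses {timestamp_bits} bits")
```

If the first number is 32, the cast requested by `astype(int)` would not fit a nanosecond timestamp, which matches the `[int32]` in the error message.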

Actually, I found a solution. I am not a Python developer, so I am not sure it is the best one, but it cleared the error.

In datetime.py, on line 41, I changed:

integers = datetimes.astype(int).astype(float).values

to:

integers = datetimes.astype(np.int64).astype(float).values

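I would still think there is a way to fix this without changing the project's code (that is, this is not my code, it is a package I downloaded), but I am now able to continue my research.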
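
The effect of that one-line change can be reproduced outside the package. The sketch below is a standalone illustration, not the rdt source: the sample values are borrowed from the demo's `timestamp` column, and the variable names simply mirror the changed line. It assumes a pandas version that permits casting `datetime64[ns]` to `int64`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring two rows of the demo's `timestamp` column.
datetimes = pd.Series(
    np.array(["2019-01-01T12:34:32", "2019-01-07T17:23:11"], dtype="datetime64[ns]")
)

# Asking for np.int64 explicitly avoids depending on what `int` means on the
# platform; the result is nanoseconds since the epoch, then cast to float.
integers = datetimes.astype(np.int64).astype(float).values

print(integers.dtype)
print(integers)
```

Naming the 64-bit type explicitly keeps the behavior the same across operating systems, which is probably why the change clears the error on Windows.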
