These are the datasets I am working with while trying out Featuretools:
data
   Unit Price    Customer Name  Product Category   Region      Profit  Quantity ordered new    Sales  Order ID
0        2.88  Janice Fletcher   Office Supplies  Central    1.320000                     2     5.90     88525
1        2.84    Bonnie Potter   Office Supplies     West    4.560000                     4    13.01     88522
2        6.68    Bonnie Potter   Office Supplies     West  -47.640000                     7    49.92     88523
3        5.68    Bonnie Potter   Office Supplies     West  -30.510000                     7    41.64     88523
4      205.99    Bonnie Potter        Technology     West  998.202300                     8  1446.67     88523
9426 rows × 8 columns
returns
Order ID Status
0 65 Returned
1 612 Returned
2 614 Returned
3 678 Returned
4 710 Returned
1634 rows × 2 columns
users
Region Manager
0 Central Chris
1 East Erin
2 South Sam
3 West William
entities = {
"data" : (data, "Order ID"),
"returns" : (returns, "Status"),
"users" : (users, "Manager")
}
relationships = [
('data', 'Order ID', 'returns', 'Order ID'),
('data', 'Region', 'users', 'Region')
]
combined_table, features_defs = ft.dfs(entities = entities,
relationships = relationships,
target_entity = "Unit Price")
combined_table
This is the error message I get:
AssertionError: Index is not unique on dataframe (Entity data)
Can anyone tell me what I am doing wrong?
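The index values of each entity must be unique. In the data entity, the Order ID values are not unique (88523, for example, appears in several rows), so Order ID cannot be used as that entity's index.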
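A quick way to confirm this before handing a frame to Featuretools is to check the candidate index column with pandas. The snippet below is only a sketch: it rebuilds a few rows of the data frame from the question, and the row_id column is a hypothetical surrogate key that does not exist in the original data.

```python
import pandas as pd

# Small stand-in for the `data` frame from the question (a few columns only).
data = pd.DataFrame({
    "Unit Price": [2.88, 2.84, 6.68, 5.68, 205.99],
    "Region": ["Central", "West", "West", "West", "West"],
    "Order ID": [88525, 88522, 88523, 88523, 88523],
})

# Order ID repeats, so it does not qualify as an entity index.
order_id_is_unique = data["Order ID"].is_unique
duplicated_ids = sorted(
    data.loc[data["Order ID"].duplicated(), "Order ID"].unique().tolist()
)

# One possible fix: add a surrogate key column.
# "row_id" is a hypothetical name chosen for this example.
data_indexed = data.copy()
data_indexed["row_id"] = range(len(data_indexed))
row_id_is_unique = data_indexed["row_id"].is_unique

print(order_id_is_unique, duplicated_ids, row_id_is_unique)
```

The same check is worth running on the index columns chosen for returns ("Status") and users ("Manager") in the question.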
Also:
target_entity = "Unit Price"
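will not work, because you have to pass the name of an entity (data, returns, or users), not a column of a table/entity. Featuretools generates features for only one table/entity per run, not for all tables/entities at once.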
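Both points can be checked in plain Python before calling ft.dfs. This is a minimal sketch, not part of Featuretools: check_target and check_indexes are hypothetical helpers, row_id is an assumed unique key added for illustration, and the entities dictionary follows the (dataframe, index column) form used in the question.

```python
import pandas as pd

data = pd.DataFrame({
    "row_id": [0, 1, 2],  # hypothetical unique key, not in the original data
    "Unit Price": [2.88, 2.84, 6.68],
    "Region": ["Central", "West", "West"],
})
users = pd.DataFrame({
    "Region": ["Central", "East", "South", "West"],
    "Manager": ["Chris", "Erin", "Sam", "William"],
})

# Same shape as in the question: entity name -> (dataframe, index column).
entities = {
    "data": (data, "row_id"),
    "users": (users, "Region"),
}


def check_target(entities, target_entity):
    """Return True if target_entity names an entity rather than a column."""
    return target_entity in entities


def check_indexes(entities):
    """Map each entity name to whether its chosen index column is unique."""
    return {name: bool(df[index].is_unique) for name, (df, index) in entities.items()}


print(check_target(entities, "Unit Price"))  # a column name, not an entity
print(check_target(entities, "data"))
print(check_indexes(entities))
```

Once both checks pass, the call from the question would take target_entity = "data" (or another entity name). Note that the keyword names themselves depend on the installed Featuretools version; newer releases renamed entities and target_entity.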