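I have the following problem. I am using the efficient_apriori package in Python for association rule mining, and I want to save my rules as a pandas DataFrame. Here is my code: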
for rule in rules:
    dict = {
        "left": [str(rule.lhs).replace(",)", ")")],
        "right": [str(rule.rhs).replace(",)", ")")],
        "support": [str(rule.support)],
        "confidence": [str(rule.confidence)]
    }
    df = pd.DataFrame.from_dict(dict)
Is there a better way to do this?
# Output of print(rule)
{Book1} -> {Book2} (conf: 0.541, supp: 0.057, lift: 4.417, conv: 1.914)
# Output of print(type(rule))
<class 'efficient_apriori.rules.Rule'>
Use the internal __dict__ of the Rule instances:
MRE setup
# Sample from documentation
from efficient_apriori import apriori
transactions = [('eggs', 'bacon', 'soup'),
                ('eggs', 'bacon', 'apple'),
                ('soup', 'bacon', 'banana')]
itemsets, rules = apriori(transactions, min_support=0.5, min_confidence=1)
Some checks
>>> rules
[{eggs} -> {bacon}, {soup} -> {bacon}]
>>> str(rules[0])
'{eggs} -> {bacon} (conf: 1.000, supp: 0.667, lift: 1.000, conv: 0.000)'
>>> type(rules[0])
efficient_apriori.rules.Rule
>>> import pandas as pd
>>> pd.DataFrame([rule.__dict__ for rule in rules])
       lhs       rhs  count_full  count_lhs  count_rhs  num_transactions
0  (eggs,)  (bacon,)           2          2          3                 3
1  (soup,)  (bacon,)           2          2          3                 3
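The table built from __dict__ has the raw counts but no confidence or support column. A likely reason is that those metrics are computed on access from the counts rather than stored on the instance, so they never show up in __dict__. The sketch below illustrates that behaviour with a hypothetical ToyRule class. It is an assumption-laden stand-in, not the library's actual code, and its formulas are simply the ones consistent with the numbers printed above.

```python
class ToyRule:
    """Hypothetical stand-in mimicking the stored fields shown in the table."""

    def __init__(self, lhs, rhs, count_full, count_lhs, count_rhs, num_transactions):
        self.lhs = lhs
        self.rhs = rhs
        self.count_full = count_full
        self.count_lhs = count_lhs
        self.count_rhs = count_rhs
        self.num_transactions = num_transactions

    @property
    def confidence(self):
        # Share of transactions containing lhs that also contain rhs.
        return self.count_full / self.count_lhs

    @property
    def support(self):
        # Share of all transactions containing both lhs and rhs.
        return self.count_full / self.num_transactions


rule = ToyRule(("eggs",), ("bacon",), 2, 2, 3, 3)

# vars(rule) is the same mapping as rule.__dict__: stored attributes only.
stored = vars(rule)
print(sorted(stored))
print(rule.confidence, round(rule.support, 3))
```

Properties live on the class, so they are reachable as rule.confidence but are absent from the instance dictionary, which is why they have to be added to each row explicitly.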
Update
I also want to save the support and confidence.
# confidence and support are not in __dict__, so add them explicitly
data = [dict(**rule.__dict__, confidence=rule.confidence, support=rule.support)
        for rule in rules]
df = pd.DataFrame(data)
print(df)
# Output:
       lhs       rhs  count_full  count_lhs  count_rhs  num_transactions  confidence   support
0  (eggs,)  (bacon,)           2          2          3                 3         1.0  0.666667
1  (soup,)  (bacon,)           2          2          3                 3         1.0  0.666667
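If only the four columns from the question are wanted (left, right, support, confidence), one option is to build each row by hand instead of unpacking __dict__. The following is a minimal sketch under stated assumptions: fmt_items and rule_to_record are hypothetical helper names, and the SimpleNamespace objects are stand-ins that only assume the lhs, rhs, support and confidence attributes used in the question.

```python
from types import SimpleNamespace


def fmt_items(items):
    """Render a tuple of items as a set-like string: ('a', 'b') -> '{a, b}'."""
    return "{" + ", ".join(str(item) for item in items) + "}"


def rule_to_record(rule, metrics=("support", "confidence")):
    """Build one row as a dict, keeping the metrics as numbers, not strings."""
    record = {"left": fmt_items(rule.lhs), "right": fmt_items(rule.rhs)}
    for name in metrics:
        # getattr works for stored attributes and computed properties alike.
        record[name] = getattr(rule, name)
    return record


# Hypothetical stand-ins for Rule objects, using the values from the MRE above.
rules = [
    SimpleNamespace(lhs=("eggs",), rhs=("bacon",), support=2 / 3, confidence=1.0),
    SimpleNamespace(lhs=("soup",), rhs=("bacon",), support=2 / 3, confidence=1.0),
]

records = [rule_to_record(rule) for rule in rules]
print(records[0])
```

Passing records to pd.DataFrame, as in the answer above, should then give one row per rule. Keeping support and confidence numeric, rather than wrapping them in str() as the question does, means the resulting columns can be sorted and filtered directly.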