How do I set a conditional threshold on my prediction output?

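I have a machine learning model that does multi-label text classification. I have a predictor object that successfully predicts the classification of the text string I pass in as input. For a single prediction, it returns a list that looks like this: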

[('unrelated', 0.9684208035469055), ('curated', 0.02895800955593586)]

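I feel like this is probably simple, but essentially I just need to create a threshold for a 'curated' match.

So if the confidence for 'curated' is above .90 or something like that, I can print a statement.

However, I don't know how to specify this condition.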

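It's a list object, so I tried specifying an index. However, each index returns a (label, confidence) pair. In addition, the order of the elements changes depending on the confidence: the label with the highest confidence is always shown first. So specifying an index number doesn't help much.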

single_prediction = predictor.predict(result)
df.at[0,'prediction'] = single_prediction
if single_prediction[0] >= .95:
    print('this is a match')
print(single_prediction)

You can do this with a list comprehension:

results = [
    [('curated', 0.6), ('unrelated', 0.4)],
    [('unrelated', 0.55), ('curated', 0.45)],
    [('unrelated', 0.7), ('curated', 0.3)],
]
threshold = 0.4
for result in results:
    # Pick out the 'curated' confidence regardless of its position in the list
    if [x[1] for x in result if x[0] == 'curated'][0] > threshold:
        print(result)

Output:

[('curated', 0.6), ('unrelated', 0.4)]
[('unrelated', 0.55), ('curated', 0.45)]
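
As an alternative sketch for the single-prediction case in the question, you could turn the list of (label, confidence) pairs into a dict and look the label up by name, so the ordering no longer matters. The helper name `label_confidence` and the 0.0 default for a missing label are assumptions made for illustration, not part of the predictor's API; the sample prediction is the one shown in the question.

```python
def label_confidence(prediction, label, default=0.0):
    """Return the confidence for `label`, or `default` if the label is absent.

    `prediction` is assumed to be a list of (label, confidence) tuples,
    as in the question. The helper name and default are hypothetical.
    """
    return dict(prediction).get(label, default)


# Sample prediction copied from the question
single_prediction = [('unrelated', 0.9684208035469055),
                     ('curated', 0.02895800955593586)]
threshold = 0.90

if label_confidence(single_prediction, 'curated') >= threshold:
    print('this is a match')
else:
    print('no match')
```

Unlike indexing into the filtered list with `[0]`, the `.get()` lookup does not raise an error when 'curated' is missing from a prediction; it simply falls back to the default.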
