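I compute distances between words/sentences and run them through SciPy's linkage function, but I need to know how to tie the result back to the original input. I.e., I lose the labels along the way, because the linkage function does not accept them.
tl;dr: I don't know how to associate my labels (var X) with the output of the linkage function.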
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

X = [
    "the weather is good",
    "it is a rainy day",
    "it is raining today",
    "This has something to do with today",
    "This has something to do with tomorrow",
]
# my magic function
result_set = [['this has something to do with today', 'this has something to do with tomorrow', 0.95044514149501169],
['this has something to do with today', 'it is a rainy day', 0.27315656750393491],
['this has something to do with today', 'it is raining today', 0.21404567560988952],
['this has something to do with today', 'the weather is good', 0.12284646267479128],
['this has something to do with tomorrow', 'it is a rainy day', 0.28564020977046212],
['this has something to do with tomorrow', 'it is raining today', 0.19174771483161279],
['this has something to do with tomorrow', 'the weather is good', 0.12920110156248313],
['it is a rainy day', 'it is raining today', 0.54390124565447373],
['it is a rainy day', 'the weather is good', 0.20843820300588964],
['it is raining today', 'the weather is good', 0.19278767792873652]]
# Third column only; the mixed str/float rows make np.array produce strings
sims = np.array(result_set)[:, 2]
# sims = ['0.950445141495' '0.273156567504' '0.21404567561' '0.122846462675'
#         '0.28564020977' '0.191747714832' '0.129201101562' '0.543901245654'
#         '0.208438203006' '0.192787677929']
Z = linkage(sims.astype(float), 'ward')  # cast back to float before clustering
# Z = [[ 0.  4.  0.12284646  2. ]
#      [ 1.  3.  0.19174771  2. ]
#      [ 2.  5.  0.27143491  3. ]
#      [ 6.  7.  0.70328415  5. ]]
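For reference, each row of Z records one merge: the first two columns are cluster indices, the third is the merge distance, and the fourth is the size of the new cluster. Indices below n (here 5) refer to the original observations, and index n + i refers to the cluster created in row i. Below is a minimal sketch of mapping the rows back to sentences. The helper name `describe_merges` is hypothetical, and the label order is an assumption read off the pair order in result_set (which starts with "this has something to do with today"), not the order of X. The matrix is the one printed above, i.e. computed from the raw similarity scores.

```python
import numpy as np

def describe_merges(Z, labels):
    """For each row of a SciPy linkage matrix, list the labels in the two merged clusters."""
    n = len(labels)
    members = {i: [labels[i]] for i in range(n)}  # cluster id -> leaf labels
    merges = []
    for i, (a, b, dist, _size) in enumerate(np.asarray(Z)):
        left, right = members[int(a)], members[int(b)]
        members[n + i] = left + right  # row i creates cluster id n + i
        merges.append((left, right, float(dist)))
    return merges

# Linkage matrix as printed in the question
Z = np.array([
    [0, 4, 0.12284646, 2],
    [1, 3, 0.19174771, 2],
    [2, 5, 0.27143491, 3],
    [6, 7, 0.70328415, 5],
])

# Assumed observation order, taken from the pair order in result_set
labels = [
    "this has something to do with today",
    "this has something to do with tomorrow",
    "it is a rainy day",
    "it is raining today",
    "the weather is good",
]

merges = describe_merges(Z, labels)
for left, right, dist in merges:
    print(left, "+", right, "at", dist)
```

Read this way, the first merge pairs "this has something to do with today" with "the weather is good", the least similar pair in result_set, which is a hint that the scores are being treated as distances.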
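It turns out I was feeding similarities into a function that expects distances, so after inverting the sims the results do make sense. The following does display the labels correctly: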
dendrogram(
    Z,
    labels=X,
    orientation="right",
    leaf_rotation=0,  # rotates the x axis labels
    leaf_font_size=8,  # font size for the x axis labels
)
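One thing that may be worth checking: `labels=X` only lines up if the condensed vector passed to `linkage` lists the pairs in the order of X, and the result_set pasted above starts from "this has something to do with today" rather than "the weather is good". A minimal sketch of one way to make the order explicit follows. It assumes `1 - similarity` as the inversion (the post does not say which one was used) and relies on result_set holding lowercased copies of the sentences in X. `no_plot=True` is used only so the label order can be inspected without drawing anything.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

X = [
    "the weather is good",
    "it is a rainy day",
    "it is raining today",
    "This has something to do with today",
    "This has something to do with tomorrow",
]
result_set = [
    ['this has something to do with today', 'this has something to do with tomorrow', 0.95044514149501169],
    ['this has something to do with today', 'it is a rainy day', 0.27315656750393491],
    ['this has something to do with today', 'it is raining today', 0.21404567560988952],
    ['this has something to do with today', 'the weather is good', 0.12284646267479128],
    ['this has something to do with tomorrow', 'it is a rainy day', 0.28564020977046212],
    ['this has something to do with tomorrow', 'it is raining today', 0.19174771483161279],
    ['this has something to do with tomorrow', 'the weather is good', 0.12920110156248313],
    ['it is a rainy day', 'it is raining today', 0.54390124565447373],
    ['it is a rainy day', 'the weather is good', 0.20843820300588964],
    ['it is raining today', 'the weather is good', 0.19278767792873652],
]

# Position of each sentence in X (result_set holds lowercased text)
index = {s.lower(): i for i, s in enumerate(X)}

# Symmetric distance matrix in the order of X; 1 - similarity is an assumed inversion
D = np.zeros((len(X), len(X)))
for a, b, sim in result_set:
    i, j = index[a], index[b]
    D[i, j] = D[j, i] = 1.0 - sim

# squareform turns the square matrix into the condensed vector linkage expects,
# so observation i is guaranteed to be X[i]
Z = linkage(squareform(D), 'ward')

info = dendrogram(Z, labels=X, orientation="right", no_plot=True)
print(info["ivl"])  # leaf labels in plotting order
```

With the order pinned down like this, the first merge joins the "today" and "tomorrow" sentences and the second joins the two rain sentences. 'ward' is kept from the question, though the SciPy documentation notes it is correctly defined only for Euclidean distances, so a method such as 'average' might be a safer choice for similarity-derived distances.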