After instantiating a MultiLabelBinarizer, I need its inverse_transform method to work on matrices that I built elsewhere. Unfortunately,
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer
mlb = MultiLabelBinarizer(classes=['a', 'b', 'c'])
A = np.array([[1, 0, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1]])
y = mlb.inverse_transform(A)
yields AttributeError: 'MultiLabelBinarizer' object has no attribute 'classes_'
I noticed that if I add this line right after instantiating mlb,
mlb.fit_transform([(c,) for c in ['a', 'b', 'c']])
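the error goes away. I guess this is because fit_transform sets the value of the classes_ attribute, but I expected that to happen at instantiation, since I supplied a classes argument.
I am using sklearn version 0.17.1 and Python 2.7.6. Am I doing something wrong?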
If you want to set the classes_ attribute on a MultiLabelBinarizer instance, a quick hack like this also works:
mlb = MultiLabelBinarizer().fit(['a', 'b', 'c'])
because, as marmouset said, only fit and fit_transform seem to set the classes_ attribute. In addition, the scikit-learn.org documentation http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html explicitly states that the fit method returns the MultiLabelBinarizer instance.
def fit(self, y):
    """Fit the label sets binarizer, storing `classes_`

    Parameters
    ----------
    y : iterable of iterables
        A set of labels (any orderable and hashable object) for each
        sample. If the `classes` parameter is set, `y` will not be
        iterated.

    Returns
    -------
    self : returns this MultiLabelBinarizer instance
    """
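It seems that, as currently implemented https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/preprocessing/label.py#L636, .fit is the only method that defines the classes_ attribute. classes_ is not set to a copy of classes in the constructor, which is not what the definition given in the docstring suggests; you may want to notify the authors.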
class MultiLabelBinarizer(BaseEstimator, TransformerMixin):
    """Transform between iterable of iterables and a multilabel format

    Although a list of sets or tuples is a very intuitive format for multilabel
    data, it is unwieldy to process. This transformer converts between this
    intuitive format and the supported multilabel format: a (samples x classes)
    binary matrix indicating the presence of a class label.

    Parameters
    ----------
    classes : array-like of shape [n_classes] (optional)
        Indicates an ordering for the class labels

    sparse_output : boolean (default: False),
        Set to true if output binary array is desired in CSR sparse format

    Attributes
    ----------
    classes_ : array of labels
        A copy of the `classes` parameter where provided,
        or otherwise, the sorted set of classes found when fitting.
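
To check this behaviour on your own installation, here is a small sketch that reuses the question's classes and matrix (the variable names `set_by_constructor` and `set_by_fit` are mine). It inspects the instance before and after fit, then runs the inverse_transform call from the question. It assumes that calling fit with an empty list is acceptable when `classes` is supplied, which is how I read the "`y` will not be iterated" note in the fit docstring.

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer(classes=['a', 'b', 'c'])
# The constructor only stores the `classes` parameter; `classes_` is not set yet.
set_by_constructor = hasattr(mlb, 'classes_')

# With `classes` supplied, fit does not need to look at the data.
mlb.fit([])
set_by_fit = hasattr(mlb, 'classes_')

# The matrix from the question, built independently of the binarizer.
A = np.array([[1, 0, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1]])
y = mlb.inverse_transform(A)
print(set_by_constructor, set_by_fit, y)
```

Keeping the `classes` argument and calling fit once avoids having to build dummy samples just to populate classes_.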