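I am using the PCA class from sklearn.decomposition to reduce the dimensionality of my feature space so that I can plot it.
I would like to know the following: after applying the fit and transform methods of the PCA class, I get an array X_transformed of shape (n_samples, n_components), as described in the documentation. Are the columns of X_transformed ordered by the amount of explained variance? The documentation says that PCA.components_ is sorted by explained variance, so I assume the same holds for the columns of X_transformed, but please correct me if I am wrong.
Small example: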
from sklearn.decomposition import PCA
pca = PCA()
pca.fit(X)  # X is an array containing my original features. X.shape=(n_samples, n_features)
X_transformed = pca.transform(X)  # X_transformed.shape=(n_samples, n_components). Are X_transformed's columns sorted by explained variance?
Thanks!
Well, you could simply test it:
from sklearn.decomposition import PCA
import numpy as np

# X is assumed to be your feature array with at least 10 features.
pca_2 = PCA(n_components=2)
X_transformed_2 = pca_2.fit_transform(X)
# X_transformed_2 holds the two components that explain the most variance

pca_10 = PCA(n_components=10)
X_transformed_10 = pca_10.fit_transform(X)
# X_transformed_10 holds the 10 components that explain the most variance

# Hypothesis: if the columns of X_transformed_10 are ordered by explained variance,
# its first 2 columns should equal X_transformed_2.
# Note the slice [:, :2] (first two columns); np.allclose tolerates floating-point noise.
np.allclose(X_transformed_2, X_transformed_10[:, :2])  # expected to return True
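A more direct check is to compare the variance of each transformed column with pca.explained_variance_, which the documentation describes as sorted. The sketch below is only an illustration: the synthetic data, the random seed, and the names column_variances, matches_explained_variance, and is_sorted_descending are assumptions made for this example, not part of the question. It uses svd_solver="full" so the result does not depend on which solver the default would pick.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration): 200 samples, 5 features,
# scaled differently so the principal components have distinct variances.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5])

pca = PCA(svd_solver="full")
X_transformed = pca.fit_transform(X)

# Sample variance of each transformed column; ddof=1 gives the
# (n_samples - 1) normalisation that explained_variance_ uses.
column_variances = X_transformed.var(axis=0, ddof=1)

# Column i of X_transformed should carry exactly explained_variance_[i].
matches_explained_variance = np.allclose(column_variances, pca.explained_variance_)

# explained_variance_ should be non-increasing from the first component on.
is_sorted_descending = bool(np.all(np.diff(pca.explained_variance_) <= 0))

print(matches_explained_variance, is_sorted_descending)
```

If both values are True, column i of X_transformed corresponds to row i of pca.components_, so the columns follow the same descending order of explained variance.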