我使用R的函数PCA
研究主成分分析。
这是为了使问题可重现:
> dput(DATA_FINAL[1:50,])
structure(list(DataCRMSanoflore.Year_Sales = c(2, 1, 2, 1, 2,
1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1,
1, 2, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1,
1, 1, 1), DataCRMSanoflore.Month_Sales = c(9, 9, 2, 5, 9, 4,
7, 9, 3, 9, 7, 12, 3, 11, 3, 12, 3, 3, 6, 3, 4, 7, 5, 3, 5, 8,
8, 1, 9, 5, 4, 1, 10, 9, 5, 4, 9, 3, 2, 12, 9, 4, 4, 3, 6, 8,
6, 4, 4, 12), DataCRMSanoflore.Date_Sales = c(13, 3, 10, 22,
23, 26, 13, 1, 12, 2, 25, 11, 10, 26, 9, 4, 10, 18, 9, 9, 1,
7, 30, 9, 14, 24, 4, 2, 10, 17, 2, 28, 22, 17, 4, 14, 22, 30,
2, 5, 29, 13, 2, 10, 25, 5, 10, 23, 1, 6), DataCRMSanoflore.HOURS_INSCR = c(17,
14, 18, 17, 16, 11, 22, 14, 23, 17, 9, 21, 18, 16, 19, 12, 11,
17, 16, 21, 20, 11, 16, 18, 14, 19, 22, 17, 14, 10, 22, 15, 13,
19, 13, 21, 16, 19, 23, 19, 11, 21, 11, 22, 20, 13, 11, 15, 17,
15), DataCRMSanoflore.Year_Creation_Sales = c(2, 1, 2, 1, 2,
1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1,
2, 2, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1,
1, 1, 1), DataCRMSanoflore.Month_Creation_Sales = c(9, 9, 2,
10, 10, 9, 7, 9, 12, 9, 7, 12, 3, 11, 4, 2, 6, 3, 6, 10, 4, 7,
6, 3, 5, 8, 3, 1, 9, 7, 4, 11, 11, 9, 5, 4, 9, 3, 2, 12, 10,
4, 4, 3, 10, 8, 6, 4, 4, 12), DataCRMSanoflore.Day_Creation_Sales = c(13,
11, 15, 2, 31, 26, 23, 1, 5, 2, 25, 16, 10, 27, 13, 7, 3, 18,
9, 8, 27, 7, 8, 18, 18, 24, 6, 2, 26, 4, 4, 24, 16, 17, 12, 15,
22, 30, 10, 5, 1, 14, 2, 10, 5, 5, 10, 27, 25, 6), DataCRMSanoflore.Year_Validation_Sales = c(2,
1, 2, 1, 2, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1,
2, 1, 1, 1, 2, 2, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1,
1, 1, 1, 1, 1, 1, 1), DataCRMSanoflore.Month_Validation_Sales = c(9,
9, 2, 10, 11, 10, 7, 9, 12, 9, 7, 12, 3, 12, 4, 2, 6, 3, 6, 10,
4, 7, 6, 3, 5, 8, 3, 1, 10, 7, 4, 11, 11, 9, 5, 4, 9, 4, 2, 12,
10, 4, 4, 3, 10, 8, 6, 4, 4, 12), DataCRMSanoflore.Day_Validation_Sales = c(15,
14, 16, 3, 3, 1, 27, 2, 6, 5, 27, 21, 19, 1, 27, 8, 5, 21, 10,
9, 30, 9, 9, 21, 26, 27, 7, 4, 1, 6, 15, 25, 17, 18, 13, 20,
29, 1, 11, 7, 2, 16, 3, 20, 6, 6, 13, 29, 29, 8), DataCRMSanoflore.AGE_CUSTUMER = c(33,
37, 24, 34, 32, 46, 52, 60, 44, 55, 37, 29, 34, 30, 30, 31, 37,
57, 48, 44, 42, 28, 34, 43, 45, 33, 37, 53, 43, 35, 55, 62, 60,
57, 33, 51, 32, 51, 35, 54, 42, 47, 59, 33, 45, 35, 36, 54, 28,
42), DataCRMSanoflore.MEAN_PURCHASE = c(0, 71.75, 50.7142857142857,
18.6666666666667, 0, 0, 54.7, 22, 0.666666666666667, 38, 6.5,
0, 83.3333333333333, 0.333333333333333, 44.3333333333333, 25.7777777777778,
24.1818181818182, 23.3846153846154, 35.5294117647059, 21.6363636363636,
1.125, 40.6428571428571, 0, 46.8461538461538, 6, 8.66666666666667,
18.4, 16.9285714285714, 15.0666666666667, 110.25, 0, 8.85714285714286,
0, 36.5, 21.5, 18.5714285714286, 28.125, 8.38888888888889, 101.333333333333,
0, 2, 0, 20.9166666666667, 69.1428571428571, 16.6666666666667,
1.5, 87.1666666666667, 0, 48.25, 13.3333333333333), DataCRMSanoflore.NUMBER_GIFTS = c(1,
1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 3, 4, 3, 4, 2, 2,
1, 2, 1, 1, 1, 2, 4, 1, 1, 1, 1, 3, 1, 3, 2, 4, 1, 1, 1, 1, 2,
2, 1, 1, 1, 1, 2, 3), DataCRMSanoflore.Year_Sales = c(2L, 1L,
2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L,
1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), DataCRMSanoflore.Month_Sales = c(9L, 9L, 2L, 5L, 9L, 4L, 7L,
9L, 3L, 9L, 7L, 12L, 3L, 11L, 3L, 12L, 3L, 3L, 6L, 3L, 4L, 7L,
5L, 3L, 5L, 8L, 8L, 1L, 9L, 5L, 4L, 1L, 10L, 9L, 5L, 4L, 9L,
3L, 2L, 12L, 9L, 4L, 4L, 3L, 6L, 8L, 6L, 4L, 4L, 12L), DataCRMSanoflore.Date_Sales = c(13L,
3L, 10L, 22L, 23L, 26L, 13L, 1L, 12L, 2L, 25L, 11L, 10L, 26L,
9L, 4L, 10L, 18L, 9L, 9L, 1L, 7L, 30L, 9L, 14L, 24L, 4L, 2L,
10L, 17L, 2L, 28L, 22L, 17L, 4L, 14L, 22L, 30L, 2L, 5L, 29L,
13L, 2L, 10L, 25L, 5L, 10L, 23L, 1L, 6L), DataCRMSanoflore.Year_Creation_Sales = c(2L,
1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L,
1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), DataCRMSanoflore.Month_Creation_Sales = c(9L, 9L, 2L, 10L,
10L, 9L, 7L, 9L, 12L, 9L, 7L, 12L, 3L, 11L, 4L, 2L, 6L, 3L, 6L,
10L, 4L, 7L, 6L, 3L, 5L, 8L, 3L, 1L, 9L, 7L, 4L, 11L, 11L, 9L,
5L, 4L, 9L, 3L, 2L, 12L, 10L, 4L, 4L, 3L, 10L, 8L, 6L, 4L, 4L,
12L), DataCRMSanoflore.Day_Creation_Sales = c(13L, 11L, 15L,
2L, 31L, 26L, 23L, 1L, 5L, 2L, 25L, 16L, 10L, 27L, 13L, 7L, 3L,
18L, 9L, 8L, 27L, 7L, 8L, 18L, 18L, 24L, 6L, 2L, 26L, 4L, 4L,
24L, 16L, 17L, 12L, 15L, 22L, 30L, 10L, 5L, 1L, 14L, 2L, 10L,
5L, 5L, 10L, 27L, 25L, 6L), DataCRMSanoflore.Year_Validation_Sales = c(2L,
1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L,
1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), DataCRMSanoflore.Month_Validation_Sales = c(9L, 9L, 2L,
10L, 11L, 10L, 7L, 9L, 12L, 9L, 7L, 12L, 3L, 12L, 4L, 2L, 6L,
3L, 6L, 10L, 4L, 7L, 6L, 3L, 5L, 8L, 3L, 1L, 10L, 7L, 4L, 11L,
11L, 9L, 5L, 4L, 9L, 4L, 2L, 12L, 10L, 4L, 4L, 3L, 10L, 8L, 6L,
4L, 4L, 12L), DataCRMSanoflore.Day_Validation_Sales = c(15L,
14L, 16L, 3L, 3L, 1L, 27L, 2L, 6L, 5L, 27L, 21L, 19L, 1L, 27L,
8L, 5L, 21L, 10L, 9L, 30L, 9L, 9L, 21L, 26L, 27L, 7L, 4L, 1L,
6L, 15L, 25L, 17L, 18L, 13L, 20L, 29L, 1L, 11L, 7L, 2L, 16L,
3L, 20L, 6L, 6L, 13L, 29L, 29L, 8L), TYPE_PEAU = c(3L, 4L, 5L,
1L, 3L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 4L, 3L, 1L, 3L, 1L, 3L, 3L,
3L, 1L, 1L, 1L, 3L, 1L, 1L, 3L, 1L, 3L, 5L, 1L, 5L, 2L, 1L, 5L,
5L, 3L, 1L, 3L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 3L, 1L),
SENSIBILITE = c(4L, 4L, 4L, 1L, 3L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 4L, 4L, 1L, 3L, 1L, 3L, 3L, 4L, 1L, 1L, 1L, 2L, 1L, 1L,
4L, 1L, 2L, 3L, 1L, 4L, 4L, 1L, 3L, 4L, 4L, 1L, 4L, 1L, 1L,
1L, 1L, 4L, 1L, 1L, 1L, 1L, 4L, 1L), IMPERFECTIONS = c(3L,
4L, 3L, 1L, 2L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 3L, 3L, 1L, 2L,
1L, 3L, 2L, 3L, 1L, 1L, 1L, 4L, 1L, 1L, 3L, 1L, 3L, 2L, 1L,
4L, 3L, 1L, 3L, 3L, 3L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, 3L, 1L), BRILLANCE = c(4L, 2L, 2L, 1L, 4L, 1L, 1L,
1L, 4L, 1L, 1L, 1L, 4L, 4L, 1L, 4L, 1L, 4L, 4L, 4L, 1L, 1L,
1L, 4L, 1L, 1L, 4L, 1L, 4L, 4L, 1L, 2L, 3L, 1L, 4L, 4L, 4L,
1L, 4L, 1L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 1L, 4L, 1L), GRAIN_PEAU = c(4L,
4L, 4L, 1L, 4L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 4L, 2L, 1L, 2L,
1L, 2L, 4L, 4L, 1L, 1L, 1L, 4L, 1L, 1L, 3L, 1L, 2L, 2L, 1L,
3L, 2L, 1L, 2L, 4L, 4L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, 2L, 1L), RIDES_VISAGE = c(2L, 2L, 2L, 1L, 4L, 1L,
1L, 1L, 4L, 1L, 1L, 1L, 4L, 4L, 1L, 2L, 1L, 4L, 2L, 4L, 1L,
1L, 1L, 2L, 1L, 1L, 4L, 1L, 4L, 4L, 1L, 4L, 4L, 1L, 2L, 4L,
2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 4L, 1L),
ALLERGIES = c(2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L,
2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L), MAINS = c(3L, 4L, 4L,
1L, 4L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 3L, 3L, 1L, 3L, 1L, 3L,
3L, 3L, 1L, 1L, 1L, 4L, 1L, 1L, 4L, 1L, 3L, 2L, 1L, 4L, 4L,
1L, 3L, 4L, 4L, 1L, 3L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L,
3L, 1L), PEAU_CORPS = c(2L, 3L, 3L, 1L, 2L, 1L, 1L, 1L, 2L,
1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 3L, 1L, 1L, 1L, 2L,
1L, 1L, 2L, 1L, 2L, 2L, 1L, 3L, 3L, 1L, 3L, 3L, 2L, 1L, 3L,
1L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 1L, 3L, 1L), INTERET_ALIM_NATURELLE = c(2L,
4L, 4L, 1L, 2L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 2L, 2L, 1L, 2L,
1L, 4L, 2L, 2L, 1L, 1L, 1L, 4L, 1L, 1L, 2L, 1L, 2L, 2L, 1L,
3L, 4L, 1L, 4L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, 2L, 1L), INTERET_ORIGINE_GEO = c(2L, 4L, 2L, 1L,
2L, 1L, 1L, 1L, 5L, 1L, 1L, 1L, 2L, 5L, 1L, 2L, 1L, 2L, 5L,
2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 5L, 2L, 1L, 4L, 2L, 1L,
2L, 5L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L,
1L), INTERET_VACANCES = c(3L, 4L, 2L, 1L, 3L, 1L, 1L, 1L,
2L, 1L, 1L, 1L, 3L, 2L, 1L, 2L, 1L, 3L, 4L, 3L, 1L, 1L, 1L,
2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 3L, 3L, 1L, 4L, 3L, 2L, 1L,
2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L), INTERET_ENVIRONNEMENT = c(3L,
5L, 5L, 1L, 5L, 1L, 1L, 1L, 5L, 1L, 1L, 1L, 3L, 3L, 1L, 3L,
1L, 3L, 3L, 3L, 1L, 1L, 1L, 3L, 1L, 1L, 3L, 1L, 3L, 3L, 1L,
5L, 3L, 1L, 3L, 3L, 3L, 1L, 3L, 1L, 1L, 1L, 1L, 3L, 1L, 1L,
1L, 1L, 3L, 1L), INTERET_COMPOSITION = c(2L, 2L, 2L, 1L,
4L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L,
2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 3L, 4L, 1L,
4L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 4L,
1L)), .Names = c("DataCRMSanoflore.Year_Sales", "DataCRMSanoflore.Month_Sales",
"DataCRMSanoflore.Date_Sales", "DataCRMSanoflore.HOURS_INSCR",
"DataCRMSanoflore.Year_Creation_Sales", "DataCRMSanoflore.Month_Creation_Sales",
"DataCRMSanoflore.Day_Creation_Sales", "DataCRMSanoflore.Year_Validation_Sales",
"DataCRMSanoflore.Month_Validation_Sales", "DataCRMSanoflore.Day_Validation_Sales",
"DataCRMSanoflore.AGE_CUSTUMER", "DataCRMSanoflore.MEAN_PURCHASE",
"DataCRMSanoflore.NUMBER_GIFTS", "DataCRMSanoflore.Year_Sales",
"DataCRMSanoflore.Month_Sales", "DataCRMSanoflore.Date_Sales",
"DataCRMSanoflore.Year_Creation_Sales", "DataCRMSanoflore.Month_Creation_Sales",
"DataCRMSanoflore.Day_Creation_Sales", "DataCRMSanoflore.Year_Validation_Sales",
"DataCRMSanoflore.Month_Validation_Sales", "DataCRMSanoflore.Day_Validation_Sales",
"TYPE_PEAU", "SENSIBILITE", "IMPERFECTIONS", "BRILLANCE", "GRAIN_PEAU",
"RIDES_VISAGE", "ALLERGIES", "MAINS", "PEAU_CORPS", "INTERET_ALIM_NATURELLE",
"INTERET_ORIGINE_GEO", "INTERET_VACANCES", "INTERET_ENVIRONNEMENT",
"INTERET_COMPOSITION"), row.names = c(NA, 50L), class = "data.frame")
第一步是编写此代码以创建 PCA 对象,如下所示:
library(FactoMineR)
library("factoextra")
res.pca <- PCA(as.data.frame(DATA_FINAL), graph = FALSE)
然后,为了绘制变量,我使用了如下所示的fviz_pca_var
函数:
fviz_pca_var(res.pca, col.var = "black")
我收到此错误:
row.names<-.data.frame
中的错误(*tmp*
,值 = 值( :不允许
重复的"row.names" 另外: 警告消息: 1:在data.row.names(row.names, rowsi, i(:一些row.names 重复:14,15,16,17,18,19,20,21,22 -->行。未使用的名称 2: 设置"row.names"时的非唯一值: "DataCRMSanoflore.Date_Sales"、"DataCRMSanoflore.Day_Creation_Sales"、" "DataCRMSanoflore.Day_Validation_Sales", "DataCRMSanoflore.Month_Creation_Sales", "DataCRMSanoflore.Month_Sales", "DataCRMSanoflore.Month_Validation_Sales", "DataCRMSanoflore.Year_Creation_Sales"、"DataCRMSanoflore.Year_Sales"、" "DataCRMSanoflore.Year_Validation_Sales">
请问如何解决这个问题?
输入数据中有重复的列,因此只需删除这些列即可完成所有设置。
df <- DATA_FINAL[, -c(1:3, 5:10)]
然后运行PCA
library(FactoMineR)
library(factoextra)
res.pca <- PCA(df, graph = F)
fviz_pca_var(res.pca, col.var = "black")