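Main uses of factor analysis
1. Reduce the number of variables to analyze.
2. Explore the correlations among the variables and group the original variables accordingly: highly correlated variables go into the same group, and a common factor stands in for the variables in that group.
3. Make the business meaning of the factors underlying the problem stand out more clearly.
Explanation: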
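Variables that are highly correlated and can be explained by the same underlying cause are placed in one group (for example, a liberal-arts factor and a science factor). If, for some factor, the liberal-arts variables carry large factor loadings, those variables can be summarized by that single factor, the liberal-arts factor. Sometimes the loadings within a factor are spread fairly evenly across the variables, with no clear differences, and the factor is hard to name. In that case the loading matrix can be multiplied by an orthogonal matrix, i.e. rotated. This is legitimate because the factor model is only determined up to an orthogonal rotation: the rotated loadings reproduce exactly the same covariance structure, so the fit of the model does not change.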
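The rotation argument can be checked numerically. The sketch below uses a small made-up loading matrix `L` and a 30-degree rotation matrix `Tmat` (both hypothetical, not taken from the applicant data) and shows that the part of the covariance explained by the common factors, `L %*% t(L)`, is the same before and after rotation.

```r
# Hypothetical loadings of 4 variables on 2 factors (illustration only).
L <- matrix(c(0.7, 0.6, 0.5, 0.6,
              0.3, 0.4, -0.5, -0.4), ncol = 2)

# An orthogonal matrix: rotation of the factor axes by 30 degrees.
theta <- pi / 6
Tmat <- matrix(c(cos(theta), sin(theta),
                 -sin(theta), cos(theta)), nrow = 2)

# Rotated loadings.
L_rot <- L %*% Tmat

# The common-factor part of the covariance is unchanged by the rotation.
same_cov <- isTRUE(all.equal(L %*% t(L), L_rot %*% t(L_rot)))
print(same_cov)
```

Because any orthogonal `Tmat` gives the same result, the analyst is free to pick the rotation whose loadings are easiest to interpret.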
FL | APP | AA | LA | SC | LC | HON | SMS | EXP | DRV | AMB | GSP | POT | KJ | SUIT | |
6 | 7 | 2 | 5 | 8 | 7 | 8 | 8 | 3 | 8 | 9 | 7 | 5 | 7 | 10 | |
9 | 10 | 5 | 8 | 10 | 9 | 9 | 10 | 5 | 9 | 9 | 8 | 8 | 8 | 10 | |
7 | 8 | 3 | 6 | 9 | 8 | 9 | 7 | 4 | 9 | 9 | 8 | 6 | 8 | 10 | |
5 | 6 | 8 | 5 | 6 | 5 | 9 | 2 | 8 | 4 | 5 | 8 | 7 | 6 | 5 | |
6 | 8 | 8 | 8 | 4 | 4 | 9 | 5 | 8 | 5 | 5 | 8 | 8 | 7 | 7 | |
7 | 7 | 7 | 6 | 8 | 7 | 10 | 5 | 9 | 6 | 5 | 8 | 6 | 6 | 6 | |
9 | 9 | 8 | 8 | 8 | 8 | 8 | 8 | 10 | 8 | 10 | 8 | 9 | 8 | 10 | |
9 | 9 | 9 | 8 | 9 | 9 | 8 | 8 | 10 | 9 | 10 | 9 | 9 | 9 | 10 | |
9 | 9 | 7 | 8 | 8 | 8 | 8 | 5 | 9 | 8 | 9 | 8 | 8 | 8 | 10 | |
4 | 7 | 10 | 2 | 10 | 10 | 7 | 10 | 3 | 10 | 10 | 10 | 9 | 3 | 10 | |
4 | 7 | 10 | 0 | 10 | 8 | 3 | 9 | 5 | 9 | 10 | 8 | 10 | 2 | 5 | |
4 | 7 | 10 | 4 | 10 | 10 | 7 | 8 | 2 | 8 | 8 | 10 | 10 | 3 | 7 | |
6 | 9 | 8 | 10 | 5 | 4 | 9 | 4 | 4 | 4 | 5 | 4 | 7 | 6 | 8 | |
8 | 9 | 8 | 9 | 6 | 3 | 8 | 2 | 5 | 2 | 6 | 6 | 7 | 5 | 6 | |
4 | 8 | 8 | 7 | 5 | 4 | 10 | 2 | 7 | 5 | 3 | 6 | 6 | 4 | 6 | |
6 | 9 | 6 | 7 | 8 | 9 | 8 | 9 | 8 | 8 | 7 | 6 | 8 | 6 | 10 | |
8 | 7 | 7 | 7 | 9 | 5 | 8 | 6 | 6 | 7 | 8 | 6 | 6 | 7 | 8 | |
6 | 8 | 8 | 4 | 8 | 8 | 6 | 4 | 3 | 3 | 6 | 7 | 2 | 6 | 4 | |
6 | 7 | 8 | 4 | 7 | 8 | 5 | 4 | 4 | 2 | 6 | 8 | 3 | 5 | 4 | |
4 | 8 | 7 | 8 | 8 | 9 | 10 | 5 | 2 | 6 | 7 | 9 | 8 | 8 | 9 | |
3 | 8 | 6 | 8 | 8 | 8 | 10 | 5 | 3 | 6 | 7 | 8 | 8 | 5 | 8 | |
9 | 8 | 7 | 8 | 9 | 10 | 10 | 10 | 3 | 10 | 8 | 10 | 8 | 10 | 8 | |
7 | 10 | 7 | 9 | 9 | 9 | 10 | 10 | 3 | 9 | 9 | 10 | 9 | 10 | 8 | |
9 | 8 | 7 | 10 | 8 | 10 | 10 | 10 | 2 | 9 | 7 | 9 | 9 | 10 | 8 | |
6 | 9 | 7 | 7 | 4 | 5 | 9 | 3 | 2 | 4 | 4 | 4 | 4 | 5 | 4 | |
7 | 8 | 7 | 8 | 5 | 4 | 8 | 2 | 3 | 4 | 5 | 6 | 5 | 5 | 6 | |
2 | 10 | 7 | 9 | 8 | 9 | 10 | 5 | 3 | 5 | 6 | 7 | 6 | 4 | 5 | |
6 | 3 | 5 | 3 | 5 | 3 | 5 | 0 | 0 | 3 | 3 | 0 | 0 | 5 | 0 | |
4 | 3 | 4 | 3 | 3 | 0 | 0 | 0 | 0 | 4 | 4 | 0 | 0 | 5 | 0 | |
4 | 6 | 5 | 6 | 9 | 4 | 10 | 3 | 1 | 3 | 3 | 2 | 2 | 7 | 3 | |
5 | 5 | 4 | 7 | 8 | 4 | 10 | 3 | 2 | 5 | 5 | 3 | 4 | 8 | 3 | |
3 | 3 | 5 | 7 | 7 | 9 | 10 | 3 | 2 | 5 | 3 | 7 | 5 | 5 | 2 | |
2 | 3 | 5 | 7 | 7 | 9 | 10 | 3 | 2 | 2 | 3 | 6 | 4 | 5 | 2 | |
3 | 4 | 6 | 4 | 3 | 3 | 8 | 1 | 1 | 3 | 3 | 3 | 2 | 5 | 2 | |
6 | 7 | 4 | 3 | 3 | 0 | 9 | 0 | 1 | 0 | 2 | 3 | 1 | 5 | 3 | |
9 | 8 | 5 | 5 | 6 | 6 | 8 | 2 | 2 | 2 | 4 | 5 | 6 | 6 | 3 | |
4 | 9 | 6 | 4 | 10 | 8 | 8 | 9 | 1 | 3 | 9 | 7 | 5 | 3 | 2 | |
4 | 9 | 6 | 6 | 9 | 9 | 7 | 9 | 1 | 2 | 10 | 8 | 5 | 5 | 2 | |
10 | 6 | 9 | 10 | 9 | 10 | 10 | 10 | 10 | 10 | 8 | 10 | 10 | 10 | 10 | |
10 | 6 | 9 | 10 | 9 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | |
10 | 7 | 8 | 0 | 2 | 1 | 2 | 0 | 10 | 2 | 0 | 3 | 0 | 0 | 10 | |
10 | 3 | 8 | 0 | 1 | 1 | 0 | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 10 | |
3 | 4 | 9 | 8 | 2 | 4 | 5 | 3 | 6 | 2 | 1 | 3 | 3 | 3 | 8 | |
7 | 7 | 7 | 6 | 9 | 8 | 8 | 6 | 8 | 8 | 10 | 8 | 8 | 6 | 5 | |
9 | 6 | 10 | 9 | 7 | 7 | 10 | 2 | 1 | 5 | 5 | 7 | 8 | 4 | 5 | |
9 | 8 | 10 | 10 | 7 | 9 | 10 | 3 | 1 | 5 | 7 | 9 | 9 | 4 | 4 | |
0 | 7 | 10 | 3 | 5 | 0 | 10 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | |
0 | 6 | 10 | 1 | 5 | 0 | 10 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | |
Start with a factor analysis that extracts 5 factors:
factanal(~.,factors = 5,data = applicant)
Call:
factanal(x = ~., factors = 5, data = applicant)
Uniquenesses:
FL APP AA LA SC LC HON SMS EXP DRV AMB GSP
0.439 0.597 0.509 0.197 0.118 0.005 0.292 0.140 0.365 0.223 0.098 0.119
POT KJ SUIT
0.084 0.005 0.267
Loadings:
Factor1 Factor2 Factor3 Factor4 Factor5
FL 0.127 0.722 0.102 -0.117
APP 0.451 0.134 0.270 0.206 0.258
AA 0.129 0.686
LA 0.222 0.246 0.827
SC 0.917 0.167
LC 0.851 0.125 0.279 -0.420
HON 0.228 -0.220 0.777
SMS 0.880 0.266 0.111
EXP 0.773 0.171
DRV 0.754 0.393 0.199 0.114
AMB 0.909 0.187 0.112 0.165
GSP 0.783 0.295 0.354 0.148 -0.181
POT 0.717 0.362 0.446 0.267
KJ 0.418 0.399 0.563 -0.585
SUIT 0.351 0.764 0.148
Factor1 Factor2 Factor3 Factor4 Factor5
SS loadings 5.490 2.507 2.188 1.028 0.331
Proportion Var 0.366 0.167 0.146 0.069 0.022
Cumulative Var 0.366 0.533 0.679 0.748 0.770
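The summary rows of this output can be reproduced by hand. The sketch below copies the `SS loadings` and `Uniquenesses` values printed above and assumes the usual definitions: each proportion of variance is the factor's SS loading divided by the number of variables (15 here), and the communality of a variable is 1 minus its uniqueness.

```r
# Values copied from the factanal output above.
ss_loadings <- c(Factor1 = 5.490, Factor2 = 2.507, Factor3 = 2.188,
                 Factor4 = 1.028, Factor5 = 0.331)
uniquenesses <- c(FL = 0.439, APP = 0.597, AA = 0.509, LA = 0.197,
                  SC = 0.118, LC = 0.005, HON = 0.292, SMS = 0.140,
                  EXP = 0.365, DRV = 0.223, AMB = 0.098, GSP = 0.119,
                  POT = 0.084, KJ = 0.005, SUIT = 0.267)

p <- length(uniquenesses)          # number of observed variables
prop_var <- ss_loadings / p        # "Proportion Var" row
cum_var  <- cumsum(prop_var)       # "Cumulative Var" row
print(round(rbind(prop_var, cum_var), 3))

# Total communality (1 - uniqueness, summed) should match the total
# SS loadings up to rounding in the printed output.
total_communality <- sum(1 - uniquenesses)
print(c(total_communality = total_communality,
        total_ss = sum(ss_loadings)))
```

Read this way, the five factors together account for about 77% of the total variance of the 15 standardized variables, and variables with a uniqueness near 0 (such as LC and KJ) are almost entirely explained by the common factors.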
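Next, inspect the factors and look for the ones that have a clear business interpretation. When the loadings do not lend themselves to interpretation, they can be rotated. Note that factanal applies a varimax rotation by default (rotation = "varimax"), so the loadings printed above are already rotated.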
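To see what rotation does, the unrotated and rotated loadings can be compared. Since the applicant data frame is not rebuilt here, the sketch below runs on simulated data with two latent factors (the data frame `sim` and all its coefficients are made up for illustration). It fits the model with `rotation = "none"` and then applies `varimax()` from the stats package by hand.

```r
# Simulated data with two latent factors (hypothetical, not the applicant data).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
sim <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.5),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.5),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.5),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.5),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.5),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.5)
)

# Fit without rotation, then rotate the loadings explicitly.
fa_raw  <- factanal(sim, factors = 2, rotation = "none")
raw     <- unclass(loadings(fa_raw))   # plain matrix of unrotated loadings
rotated <- varimax(raw)                # list with $loadings and $rotmat

# An orthogonal rotation redistributes the loadings across factors but
# leaves each variable's communality (row sum of squares) unchanged.
comm_raw <- rowSums(raw^2)
comm_rot <- rowSums(unclass(rotated$loadings)^2)
print(round(cbind(comm_raw, comm_rot), 3))
print(rotated$loadings)
```

After rotation each variable tends to load mainly on one factor, which is what makes the factors easier to name. An oblique alternative such as `promax()` also exists in the stats package if correlated factors are acceptable.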
Then convert the original data into new variables made up of the 5 factors, i.e. compute the factor scores:
> fa <- factanal(~.,factors = 5,data=applicant,scores = "regression")
> fa
Call:
factanal(x = ~., factors = 5, data = applicant, scores = "regression")
Uniquenesses:
FL APP AA LA SC LC HON SMS EXP DRV AMB GSP
0.439 0.597 0.509 0.197 0.118 0.005 0.292 0.140 0.365 0.223 0.098 0.119
POT KJ SUIT
0.084 0.005 0.267
Loadings:
Factor1 Factor2 Factor3 Factor4 Factor5
FL 0.127 0.722 0.102 -0.117
APP 0.451 0.134 0.270 0.206 0.258
AA 0.129 0.686
LA 0.222 0.246 0.827
SC 0.917 0.167
LC 0.851 0.125 0.279 -0.420
HON 0.228 -0.220 0.777
SMS 0.880 0.266 0.111
EXP 0.773 0.171
DRV 0.754 0.393 0.199 0.114
AMB 0.909 0.187 0.112 0.165
GSP 0.783 0.295 0.354 0.148 -0.181
POT 0.717 0.362 0.446 0.267
KJ 0.418 0.399 0.563 -0.585
SUIT 0.351 0.764 0.148
Factor1 Factor2 Factor3 Factor4 Factor5
SS loadings 5.490 2.507 2.188 1.028 0.331
Proportion Var 0.366 0.167 0.146 0.069 0.022
Cumulative Var 0.366 0.533 0.679 0.748 0.770
Test of the hypothesis that 5 factors are sufficient.
The chi square statistic is 60.97 on 40 degrees of freedom.
The p-value is 0.0179
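The test reported at the end of this output can be checked by hand. The sketch below assumes the standard degrees-of-freedom formula for the maximum-likelihood factor model, ((p - k)^2 - (p + k)) / 2 with p variables and k factors, and recomputes the p-value from the printed chi-square statistic.

```r
p <- 15   # observed variables (FL ... SUIT)
k <- 5    # number of factors

# Degrees of freedom of the likelihood-ratio test.
dof <- ((p - k)^2 - (p + k)) / 2

# Upper-tail probability of the printed statistic; it should be close to
# the p-value 0.0179 shown in the output (the statistic itself is rounded).
p_value <- pchisq(60.97, df = dof, lower.tail = FALSE)

print(dof)
print(round(p_value, 4))
```

A p-value below 0.05 means that, at the 5% level, the hypothesis that 5 factors are sufficient would be rejected, so the fit is not perfect even though the factors explain 77% of the variance. On a fitted object the same quantities are available as `fa$STATISTIC`, `fa$dof` and `fa$PVAL`.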
> fa$scores
Factor1 Factor2 Factor3 Factor4 Factor5
1 0.800717544 0.18668478 -0.851460896 -1.02805665 0.52205818
2 1.116241580 0.47700243 0.001629454 -0.43629124 0.36113830
3 0.879369406 0.29478854 -0.314716179 -1.02965924 0.33082062
4 -0.523388290 0.43753019 0.560799973 0.25097714 0.40522224
5 -0.846808386 1.21550502 1.085816718 0.45502930 1.07029291
6 0.003185837 0.27885951 0.243258421 0.12109434 -0.27226717
7 0.703922279 1.33861950 0.111053822 0.01088589 0.64206809
8 0.896108099 1.37342978 0.232713178 -0.35982102 0.35349535
9 0.455395763 1.17038462 0.244111085 -0.19242716 0.17911705
10 1.843009744 -0.18285199 -1.451198021 1.43700462 0.02806712
11 1.781056933 -0.22818096 -2.089052424 1.48488398 0.95136053
12 1.403740004 -0.53727939 -0.605003245 1.66579885 -0.39726150
13 -0.838419356 0.45881416 1.103624446 0.56651271 0.93295036
14 -0.765006924 0.30471946 0.836846379 0.95059097 1.57972186
15 -0.948618470 0.14818660 0.761841309 1.18822261 0.41131508
16 0.670346434 0.62562847 -0.275204781 0.28878032 -0.58926497
17 0.308422895 0.41267618 -0.135936211 -0.42814543 1.56803728
18 0.295571360 -0.46281204 -0.728936475 -1.16500937 -1.31962760
19 0.184298026 -0.21956636 -0.825069511 -0.55179100 -1.51372977
20 0.372855990 0.03579921 1.021768751 -0.31575185 -0.57686133
21 0.402538565 -0.46066903 0.589465103 0.86890942 -0.14399060
22 0.927698958 0.60250660 0.673815357 -1.14036612 -0.33194886
23 0.931887149 0.47998903 0.990087831 -0.83545711 0.59726767
24 0.585842905 0.66244517 1.202381977 -0.86524797 -0.62640797
25 -0.798685118 -0.23749354 0.480395810 0.05836780 -0.34004466
26 -0.781584615 0.15450648 0.518109233 0.43972335 0.54299850
27 0.372094679 -1.12074511 0.595496294 0.95822481 -1.09577912
28 -0.948459790 -0.81527906 -0.922269220 -1.77326821 -0.35738899
29 -1.213604298 -0.13087983 -1.503827903 -1.92660112 1.10293874
30 -0.484101748 -1.21181187 0.367311077 -1.68214807 0.52310776
31 -0.397976947 -0.69999678 0.622508916 -1.63410675 1.01452372
32 -0.173713063 -1.10108335 0.608407659 -0.12745790 -2.26318679
33 -0.256281136 -1.33038515 0.560781422 -0.40338235 -2.53222879
34 -1.183167544 -0.49422381 -0.008024526 -0.81458656 -0.10274921
35 -1.665000629 -0.33809240 0.059885560 -0.88500404 1.19916519
36 -0.514907574 -0.19120917 0.370984444 -0.45888647 -0.61709089
37 1.321163395 -1.72531922 -1.052264142 0.39130820 0.21100177
38 1.237411113 -1.32542774 -0.635392434 -0.30674409 -0.33706339
39 0.687041364 1.55535677 0.960484537 -0.39006072 -0.30682688
40 0.923821287 1.49189358 0.792049416 -0.40120250 0.04739454
41 -1.740662685 1.55241481 -2.301529543 1.10997918 -0.54355251
42 -1.946665869 1.74562267 -2.660094728 0.70599496 -1.13003023
43 -1.601371579 0.70912444 -0.096524877 0.77685896 -1.29579679
44 0.960832458 0.10579350 -0.315491319 0.20738885 0.51793979
45 -0.309142913 -0.38559454 1.071709950 1.49287914 -0.45272845
46 0.155406236 -0.52531271 1.194207841 1.80067575 -0.93031173
47 -1.182605366 -2.06429652 -0.395197469 1.05832124 1.50773379
48 -1.099807701 -2.02977096 -0.694352061 0.86306058 1.47640173
Finally, look at which factors score highest for each sample.
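One way to do this is to pick, for every row of the score matrix, the column with the largest value. The sketch below copies three rows (samples 1, 5 and 14) from the `fa$scores` output above into a small matrix and uses `max.col()`; it compares signed scores, which is an assumption about what "largest" should mean here.

```r
# Three rows copied from the fa$scores output (samples 1, 5 and 14).
scores <- matrix(
  c( 0.800717544, 0.18668478, -0.851460896, -1.02805665, 0.52205818,
    -0.846808386, 1.21550502,  1.085816718,  0.45502930, 1.07029291,
    -0.765006924, 0.30471946,  0.836846379,  0.95059097, 1.57972186),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("1", "5", "14"), paste0("Factor", 1:5))
)

# For each sample, the name of the factor with the largest score.
top_factor <- colnames(scores)[max.col(scores, ties.method = "first")]
names(top_factor) <- rownames(scores)
print(top_factor)
```

On the full result the same call would be `colnames(fa$scores)[max.col(fa$scores, ties.method = "first")]`. To rank by magnitude regardless of sign, pass `abs(scores)` to `max.col()` instead.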