ANOVA with Python
Previously I have shown how to analyze data collected with within-subjects designs using rpy2 (i.e., R from within Python) and Pyvttbl. In this post I extend that to a factorial ANOVA using Python (i.e., Pyvttbl). We are going to carry out a two-way ANOVA, but the same method will let you analyze any factorial design. I start by importing the Python libraries that are going to be used.
import numpy as np
import pyvttbl as pt
from collections import namedtuple
NumPy is used to simulate the data. I create a data set with one factor of two levels (P) and a second factor of three levels (Q). As in many of my examples, the dependent variable is response time (rt), and we create a list of lists holding the population means we are going to assume (i.e., the variable ‘values’). I was a bit lazy when coming up with the data, so I named the independent variables ‘iv1’ and ‘iv2’. However, you could think of iv1 as two different memory tasks: verbal and spatial memory. iv2 could be different levels of distraction (no distraction, synthetic sounds, and speech, for instance).
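A minimal sketch of how such a data set could be simulated is given here. It is not the original code: the cell means, the standard deviation, and the names `N_SUBJECTS`, `MEANS`, `Row` and `simulate` are all my assumptions, and it draws from the standard library's `random` module instead of NumPy so that it runs without extra dependencies (`np.random.normal` could be swapped in). The 20 subjects are inferred from the error degrees of freedom (19) in the output at the end of the post.

```python
import random
from collections import namedtuple

# Hypothetical design constants; the post does not list the actual values.
N_SUBJECTS = 20            # inferred from the error df of 19 in the ANOVA table
IV1_LEVELS = [1, 2]        # factor P: e.g. verbal vs. spatial memory task
IV2_LEVELS = [1, 2, 3]     # factor Q: e.g. no distraction, sounds, speech

# Assumed population means: one inner list per iv2 level, one entry per iv1 level.
MEANS = [[550, 500], [1100, 525], [1300, 780]]
SD = 100.0                 # assumed common standard deviation (ms)

Row = namedtuple('Row', ['sub_id', 'rt', 'iv1', 'iv2'])


def simulate(seed=0):
    """Return one Row per subject and cell (long format)."""
    rng = random.Random(seed)
    rows = []
    for q_idx, q in enumerate(IV2_LEVELS):
        for p_idx, p in enumerate(IV1_LEVELS):
            for sub in range(1, N_SUBJECTS + 1):
                rt = rng.gauss(MEANS[q_idx][p_idx], SD)
                rows.append(Row(sub, rt, p, q))
    return rows


rows = simulate()
print(len(rows))            # 20 subjects x 2 x 3 cells
print(rows[0]._asdict())    # a dict-like record for a single observation
```

Each `Row` converts to a dict-like record with `_asdict()`, which is presumably why `namedtuple` is imported above: as far as I know, a Pyvttbl `DataFrame` can be filled one record at a time with `df.insert(row._asdict())`.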
I start with a boxplot using the box_plot method from Pyvttbl. As far as I can see, there is not much room for customizing the plot. We get this plot, and it is really not that beautiful.
df.box_plot('rt', factors=['iv1', 'iv2'])
[Figure: Pyvttbl boxplot of rt by iv1 and iv2]
Running the two-way ANOVA is simple: the first argument is the dependent variable, the second is the subject identifier, and then come the within-subject factors. In two previous posts I showed how to carry out one-way and two-way ANOVA for independent measures. One could, of course, combine these techniques to do a split-plot/mixed ANOVA by adding an argument ‘bfactors’ for the between-subject factor(s).
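In Pyvttbl the analysis itself is a single call, as far as I recall something along the lines of `df.anova('rt', sub='sub_id', wfactors=['iv1', 'iv2'])`, where the subject column name `sub_id` is my assumption. To make clear what such a call computes, here is a rough pure-Python sketch of a two-way repeated-measures ANOVA for a balanced design with sphericity assumed. The function `rm_anova_two_way`, the cell means and the simulated scores are hypothetical and only illustrate the partitioning of the sums of squares; this is not Pyvttbl's implementation.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def rm_anova_two_way(y, subjects, a_levels, b_levels):
    """Two-way repeated-measures ANOVA for a balanced design.

    y maps (subject, a, b) -> score. Returns sums of squares, degrees of
    freedom and F ratios (sphericity assumed, no corrections applied).
    """
    n, a, b = len(subjects), len(a_levels), len(b_levels)
    g = mean(y.values())  # grand mean
    # Marginal and cell means
    m_a = {i: mean(y[s, i, j] for s in subjects for j in b_levels) for i in a_levels}
    m_b = {j: mean(y[s, i, j] for s in subjects for i in a_levels) for j in b_levels}
    m_s = {s: mean(y[s, i, j] for i in a_levels for j in b_levels) for s in subjects}
    m_ab = {(i, j): mean(y[s, i, j] for s in subjects) for i in a_levels for j in b_levels}
    m_as = {(i, s): mean(y[s, i, j] for j in b_levels) for i in a_levels for s in subjects}
    m_bs = {(j, s): mean(y[s, i, j] for i in a_levels) for j in b_levels for s in subjects}

    ss = {
        'A': n * b * sum((m_a[i] - g) ** 2 for i in a_levels),
        'B': n * a * sum((m_b[j] - g) ** 2 for j in b_levels),
        'S': a * b * sum((m_s[s] - g) ** 2 for s in subjects),
        'AB': n * sum((m_ab[i, j] - m_a[i] - m_b[j] + g) ** 2
                      for i in a_levels for j in b_levels),
        # Each effect is tested against its own interaction with subjects
        'AS': b * sum((m_as[i, s] - m_a[i] - m_s[s] + g) ** 2
                      for i in a_levels for s in subjects),
        'BS': a * sum((m_bs[j, s] - m_b[j] - m_s[s] + g) ** 2
                      for j in b_levels for s in subjects),
    }
    ss['ABS'] = sum((y[s, i, j] - m_ab[i, j] - m_as[i, s] - m_bs[j, s]
                     + m_a[i] + m_b[j] + m_s[s] - g) ** 2
                    for s in subjects for i in a_levels for j in b_levels)
    # (effect df, error df) for each tested effect
    df = {
        'A': (a - 1, (a - 1) * (n - 1)),
        'B': (b - 1, (b - 1) * (n - 1)),
        'AB': ((a - 1) * (b - 1), (a - 1) * (b - 1) * (n - 1)),
    }
    error = {'A': 'AS', 'B': 'BS', 'AB': 'ABS'}
    f = {k: (ss[k] / df[k][0]) / (ss[error[k]] / df[k][1]) for k in df}
    return {'ss': ss, 'df': df, 'F': f}


# Made-up data: 20 subjects, 2 x 3 cells, assumed cell means and SD.
rng = random.Random(1)
subjects = list(range(1, 21))
cell_means = {(1, 1): 550, (1, 2): 1100, (1, 3): 1300,
              (2, 1): 500, (2, 2): 525, (2, 3): 780}
y = {(s, i, j): rng.gauss(mu, 100.0)
     for (i, j), mu in cell_means.items() for s in subjects}

result = rm_anova_two_way(y, subjects, [1, 2], [1, 2, 3])
for effect, f_value in result['F'].items():
    print(effect, result['df'][effect], round(f_value, 2))
```

Here 'A' plays the role of iv1 and 'B' the role of iv2. With 20 subjects the error degrees of freedom come out as 19 for the two-level factor and 38 for the three-level factor and the interaction, which matches the "Sphericity Assumed" rows in the output at the end of the post.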
The output one gets from this is an ANOVA table. It contains all the metrics needed plus some more: F-statistic, p-value, mean square errors, confidence intervals, and effect size (i.e., eta-squared) for all factors and the interaction. Corrected degrees of freedom and mean square errors can also be found (e.g., Greenhouse-Geisser corrected). The output is at the end of the post. It is a bit hard to read. If you know any other way to do a repeated measures ANOVA using Python, please let me know. Also, if you happen to know how to create nicer plots with Pyvttbl, I would like to know how! Please leave a comment.
rt ~ iv1 * iv2

TESTS OF WITHIN SUBJECTS EFFECTS

Measure: rt
Source                                Type III SS    eps      df           MS        F       Sig.  et2_G  Obs.      SE  95% CI    lambda  Obs. Power
====================================================================================================================================================
iv1               Sphericity Assumed  4419957.211      -       1  4419957.211  324.248  2.128e-13  3.295    60  16.096  31.548  1023.941           1
                  Greenhouse-Geisser  4419957.211      1       1  4419957.211  324.248  2.128e-13  3.295    60  16.096  31.548  1023.941           1
                  Huynh-Feldt         4419957.211      1       1  4419957.211  324.248  2.128e-13  3.295    60  16.096  31.548  1023.941           1
                  Box                 4419957.211      1       1  4419957.211  324.248  2.128e-13  3.295    60  16.096  31.548  1023.941           1
----------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv1)        Sphericity Assumed   258996.722      -      19    13631.406
                  Greenhouse-Geisser   258996.722      1      19    13631.406
                  Huynh-Feldt          258996.722      1      19    13631.406
                  Box                  258996.722      1      19    13631.406
----------------------------------------------------------------------------------------------------------------------------------------------------
iv2               Sphericity Assumed  5257766.564      -       2  2628883.282  206.008  4.023e-21  3.920    40  18.448  36.158   433.701           1
                  Greenhouse-Geisser  5257766.564  0.550   1.101  4777252.692  206.008  1.320e-12  3.920    40  18.448  36.158   433.701           1
                  Huynh-Feldt         5257766.564  0.550   1.101  4777252.692  206.008  1.320e-12  3.920    40  18.448  36.158   433.701           1
                  Box                 5257766.564  0.500       1  5257766.564  206.008  1.192e-11  3.920    40  18.448  36.158   433.701           1
----------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv2)        Sphericity Assumed   484921.251      -      38    12761.086
                  Greenhouse-Geisser   484921.251  0.550  20.911    23189.668
                  Huynh-Feldt          484921.251  0.550  20.911    23189.668
                  Box                  484921.251  0.500      19    25522.171
----------------------------------------------------------------------------------------------------------------------------------------------------
iv1 * iv2         Sphericity Assumed  1622027.598      -       2   811013.799   83.220  1.304e-14  1.209    20  22.799  44.687    87.600       1.000
                  Greenhouse-Geisser  1622027.598  0.545   1.091  1486817.582   83.220  6.085e-09  1.209    20  22.799  44.687    87.600       1.000
                  Huynh-Feldt         1622027.598  0.545   1.091  1486817.582   83.220  6.085e-09  1.209    20  22.799  44.687    87.600       1.000
                  Box                 1622027.598  0.500       1  1622027.598   83.220  2.262e-08  1.209    20  22.799  44.687    87.600       1.000
----------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv1 * iv2)  Sphericity Assumed   370327.311      -      38     9745.456
                  Greenhouse-Geisser   370327.311  0.545  20.728    17866.175
                  Huynh-Feldt          370327.311  0.545  20.728    17866.175
                  Box                  370327.311  0.500      19    19490.911

TABLES OF ESTIMATED MARGINAL MEANS

Estimated Marginal Means for iv1
iv1      Mean  Std. Error  95% Lower Bound  95% Upper Bound
===========================================================
1     983.755      43.162          899.157         1068.354
2     599.917      21.432          557.909          641.925

Estimated Marginal Means for iv2
iv2      Mean  Std. Error  95% Lower Bound  95% Upper Bound
===========================================================
1     525.025      19.324          487.150          562.899
2     814.197      49.416          717.342          911.053
3    1036.286      43.789          950.459         1122.114

Estimated Marginal Means for iv1 * iv2
iv1  iv2      Mean  Std. Error  95% Lower Bound  95% Upper Bound
================================================================
1    1     553.522      24.212          506.066          600.978
1    2    1103.488      28.411         1047.804         1159.173
1    3    1294.256      19.773         1255.501         1333.011
2    1     496.528      29.346          439.009          554.047
2    2     524.906      20.207          485.301          564.512
2    3     778.317      21.815          735.560          821.073
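Since the table is hard to read, it can help to recompute a few entries by hand. The sketch below copies the sums of squares and degrees of freedom from the "Sphericity Assumed" rows above and recomputes each F ratio as the effect mean square divided by the error mean square. It also shows how an epsilon correction such as Greenhouse-Geisser scales both degrees of freedom. The helper names `f_ratio` and `corrected_df` are mine, not part of Pyvttbl.

```python
# Values copied from the "Sphericity Assumed" rows of the ANOVA table.
table = {
    'iv1': dict(ss=4419957.211, df=1, ss_err=258996.722, df_err=19),
    'iv2': dict(ss=5257766.564, df=2, ss_err=484921.251, df_err=38),
    'iv1 * iv2': dict(ss=1622027.598, df=2, ss_err=370327.311, df_err=38),
}


def f_ratio(ss, df, ss_err, df_err):
    """F = MS_effect / MS_error, where each MS is SS divided by its df."""
    return (ss / df) / (ss_err / df_err)


def corrected_df(df, df_err, eps):
    """Scale effect and error df by the sphericity estimate eps."""
    return eps * df, eps * df_err


f_values = {name: f_ratio(**row) for name, row in table.items()}
for name, f_value in f_values.items():
    print(name, round(f_value, 3))

# Greenhouse-Geisser row for iv2, using the rounded eps shown in the table.
print(corrected_df(2, 38, 0.550))
```

The recomputed F ratios agree with the F column of the table to about three decimals. Because the correction multiplies both degrees of freedom by the same epsilon, the F ratio itself does not change; only the p-value does, which is why the F column is identical across the four rows of each effect. The table prints 1.101 and 20.911 for iv2 where the rounded epsilon of 0.550 gives roughly 1.1 and 20.9, presumably because the unrounded epsilon is slightly larger.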
Translated from: https://www.pybloggers.com/2016/03/two-way-anova-for-repeated-measures-using-python/