Medium: How to check the correctness of the A/B test?

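There are two kinds of error: a Type I error (a false positive: the test reports an effect when there is none) and a Type II error (a false negative: the test misses an effect that does exist).

[Image 1]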

Usually the Type I error is the more important one! So the Type II error is where we compromise:

The probability of Type II error can be adjusted to the desired value by changing the size of the groups or by reducing the variance in the data.

Rule: The larger the group size and the lower the variance, the smaller the probability of Type II error.

Formula:

[Image 2]
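For reference, a commonly used formula for the required size of each group when comparing two means is sketched below. It is an assumption that this matches the formula pictured. The symbols are introduced here for illustration: α and β are the acceptable probabilities of Type I and Type II errors, σ² is the variance of the metric, ε is the minimum effect we want to detect, and z_q is the q-quantile of the standard normal distribution.

```latex
n \;\ge\; \frac{2\,\sigma^{2}\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}}{\varepsilon^{2}}
```

This agrees with the rule above: for fixed α and ε, a larger n or a smaller σ² leaves room for a larger z_{1-β}, that is, a smaller β.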

Next come the concrete steps of checking for correctness:

[Image 3]

Estimate the required group size (code below):

[Image 4]
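A minimal sketch of such an estimate, assuming the standard normal-approximation formula for a two-sided comparison of means. The function name `estimate_group_size` and its parameters are hypothetical and not taken from the original code.

```python
import math
from statistics import NormalDist


def estimate_group_size(std, effect, alpha=0.1, beta=0.2):
    """Approximate per-group sample size for a two-sided test of means.

    std    -- standard deviation of the metric (assumed equal in both groups)
    effect -- minimum absolute difference in means we want to detect
    alpha  -- acceptable probability of Type I error
    beta   -- acceptable probability of Type II error
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # quantile for the significance level
    z_beta = normal.inv_cdf(1 - beta)        # quantile for the desired power
    n = 2 * std ** 2 * (z_alpha + z_beta) ** 2 / effect ** 2
    return math.ceil(n)


# Hypothetical inputs: std of 100, minimum detectable effect of 10.
group_size = estimate_group_size(std=100.0, effect=10.0, alpha=0.1, beta=0.2)
print(group_size)
```

Lowering `beta` or `effect`, or raising `std`, increases the returned group size.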

By conducting 1000 experiments and calculating the proportion of type II errors, we obtain a point estimate of the probability of type II error.

Then, using numerical synthetic A/A and A/B experiments, we will estimate error probabilities and construct confidence intervals.

[Image 5]
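The procedure described above can be sketched roughly as follows. This is an illustration under assumptions, not the original code: the data are drawn from a normal distribution, the effect is added as a constant shift, and the names `simulate_error_rates` and `proportion_ci` as well as all parameter values are hypothetical.

```python
import numpy as np
from scipy import stats


def simulate_error_rates(mean, std, effect, group_size,
                         alpha=0.1, n_experiments=1000, seed=0):
    """Estimate Type I and Type II error probabilities of Student's t-test."""
    rng = np.random.default_rng(seed)
    false_positives = 0  # A/A test rejected: Type I error
    false_negatives = 0  # A/B test not rejected: Type II error
    for _ in range(n_experiments):
        a = rng.normal(mean, std, group_size)
        b = rng.normal(mean, std, group_size)
        # Synthetic A/A test: both groups come from the same distribution.
        if stats.ttest_ind(a, b).pvalue < alpha:
            false_positives += 1
        # Synthetic A/B test: add the expected effect to group B.
        if stats.ttest_ind(a, b + effect).pvalue >= alpha:
            false_negatives += 1
    return false_positives / n_experiments, false_negatives / n_experiments


def proportion_ci(p, n, confidence=0.95):
    """Normal-approximation confidence interval for an estimated proportion."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    half_width = z * np.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical setup: the group size is chosen for alpha=0.1 and beta=0.2.
type_1, type_2 = simulate_error_rates(mean=1000.0, std=100.0, effect=10.0,
                                      group_size=1237)
print("Type I :", type_1, proportion_ci(type_1, 1000))
print("Type II:", type_2, proportion_ci(type_2, 1000))
```

If the test is appropriate for the data, the two confidence intervals should cover the chosen α and β.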

According to the output, the estimates of the error probabilities are approximately 0.1 and 0.2, as they should be. Everything is in order: Student’s t-test works correctly on this data.

Next, let's look at another indicator, the distribution of p-values, defined as follows:

[Image 6]
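One way to inspect this distribution numerically is to compare the empirical distribution function of A/A p-values with the diagonal: when the test is correct, the share of p-values below any α should be close to α. The sketch below assumes normally distributed synthetic data, and the helper names `aa_pvalues` and `empirical_cdf` are hypothetical.

```python
import numpy as np
from scipy import stats


def aa_pvalues(mean, std, group_size, n_experiments=1000, seed=0):
    """p-values of Student's t-test on synthetic A/A experiments."""
    rng = np.random.default_rng(seed)
    pvalues = []
    for _ in range(n_experiments):
        a = rng.normal(mean, std, group_size)
        b = rng.normal(mean, std, group_size)
        pvalues.append(stats.ttest_ind(a, b).pvalue)
    return np.array(pvalues)


def empirical_cdf(pvalues, alphas):
    """Share of p-values below each significance level in `alphas`."""
    pvalues = np.asarray(pvalues)
    return np.array([np.mean(pvalues < alpha) for alpha in alphas])


alphas = np.linspace(0.05, 0.95, 19)
pvalues = aa_pvalues(mean=1000.0, std=100.0, group_size=500)
cdf = empirical_cdf(pvalues, alphas)

# Largest deviation from the diagonal F(alpha) = alpha over the grid.
max_gap = np.max(np.abs(cdf - alphas))
print("max deviation from the diagonal:", max_gap)
```

Plotting `cdf` against `alphas` gives the kind of graph where a correct test lies along the diagonal.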

For any significance level, the figure above and the following should hold:

[Image 7]

[Image 8]

Answer: NO NO NO NO NO!

The check has to be run every time. For example, this data turns out to have a problem!

[Image 9]

We obtained an estimate of the probability of type I error of about 0.25, which is much higher than the significance level of 0.1. The graph shows that the distribution of p-values for synthetic A/A tests is not uniform and deviates from the diagonal. In this example, the Student’s t-test is incorrect because the data are dependent (the costs of purchases by one person are dependent). If we had not immediately realized the dependence of the data, the estimation of error probabilities would have helped us understand that such a test is incorrect.
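The effect of dependent observations can be reproduced on synthetic data. In the sketch below each user has their own average purchase cost, so purchases by the same person are correlated, and Student's t-test is applied to individual purchases as if they were independent. All names and parameter values are hypothetical and do not reproduce the data set used above.

```python
import numpy as np
from scipy import stats


def simulate_purchases(rng, n_users, purchases_per_user,
                       mean=1000.0, user_std=100.0, purchase_std=150.0):
    """Purchase costs where each user has their own average level."""
    user_means = rng.normal(mean, user_std, n_users)
    # One row per user; every row is centred on that user's own mean.
    purchases = rng.normal(user_means[:, None], purchase_std,
                           (n_users, purchases_per_user))
    return purchases.ravel()


def type_1_error_on_purchases(n_users=200, purchases_per_user=5,
                              alpha=0.1, n_experiments=1000, seed=0):
    """Share of synthetic A/A tests rejected when purchases are compared directly."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_experiments):
        a = simulate_purchases(rng, n_users, purchases_per_user)
        b = simulate_purchases(rng, n_users, purchases_per_user)
        # The t-test treats every purchase as an independent observation.
        if stats.ttest_ind(a, b).pvalue < alpha:
            rejections += 1
    return rejections / n_experiments


rate = type_1_error_on_purchases()
print("estimated Type I error:", rate)
```

With these settings the estimate lands well above the significance level of 0.1, which is the signal that the test does not suit this data.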

The final summary (acceptable probability & p-value):

[Image 10]
