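1. Monte Carlo simulation:
Features:
Random sampling
Independent draws
Computationally intensive
More accurate than the historical simulation method (a nonparametric method)
Can be used with any probability distribution, in any setting
Generates the data according to the desired data generating process (DGP)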
Sampling error: measured by the standard error
s_x = sqrt(Var(X) / N)
N is the number of replications
Confidence interval: [x̄ - z * s_x, x̄ + z * s_x]
Short option position → extreme losses → Var(X) ↑ → s_x ↑
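The standard error and confidence interval above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function name mc_estimate, the payoff max(Z, 0) with Z ~ N(0, 1), the seed, and z = 1.96 are all assumptions chosen for the example.

```python
import math
import random
import statistics

def mc_estimate(draw, n, z=1.96, seed=0):
    """Return (mean, standard error, confidence interval) from n replications."""
    rng = random.Random(seed)
    samples = [draw(rng) for _ in range(n)]
    mean = statistics.fmean(samples)
    # s_x = sqrt(Var(X) / N), with Var(X) estimated from the simulated sample
    std_err = math.sqrt(statistics.variance(samples) / n)
    return mean, std_err, (mean - z * std_err, mean + z * std_err)

def payoff(rng):
    # Hypothetical DGP for illustration: max(Z, 0) with Z ~ N(0, 1)
    return max(rng.gauss(0.0, 1.0), 0.0)

mean_small, se_small, ci_small = mc_estimate(payoff, 1_000)
mean_large, se_large, ci_large = mc_estimate(payoff, 100_000)
```

With 100 times as many replications, se_large should come out at roughly one tenth of se_small, since s_x scales with 1 / sqrt(N).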
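Methods for reducing the standard error:
① Increase N:
but the required N may be unacceptably large (s_x shrinks only with sqrt(N), so halving it takes four times as many replications)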
② Antithetic variates:
take the complement of each random draw and run a parallel simulation
x̄ = (x1 + x2) / 2
Var(x̄) = [Var(x1) + Var(x2) + 2 Cov(x1, x2)] / 4
Without antithetic variates:
x1 and x2 are independent, so Cov(x1, x2) = 0
that is, Var(x̄) = [Var(x1) + Var(x2)] / 4
With antithetic variates:
ρ(x1, x2) < 0
Cov(x1, x2) < 0
Var'(x̄) < Var(x̄)
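The variance comparison above can be checked numerically. The sketch below is only illustrative: the payoff f(u) = exp(u), the pairing of u with its complement 1 - u, the function name pair_averages, and the seed are assumptions, not part of the notes.

```python
import math
import random
import statistics

def f(u):
    # Hypothetical monotone payoff; for U ~ U(0, 1), E[f(U)] = e - 1
    return math.exp(u)

def pair_averages(n_pairs, antithetic, seed=0):
    """Return the list of pair averages (x1 + x2) / 2."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_pairs):
        u1 = rng.random()
        # Antithetic: reuse the complement 1 - u1; otherwise draw u2 independently
        u2 = 1.0 - u1 if antithetic else rng.random()
        averages.append((f(u1) + f(u2)) / 2)
    return averages

plain = pair_averages(10_000, antithetic=False)
anti = pair_averages(10_000, antithetic=True)
var_plain = statistics.variance(plain)  # Cov(x1, x2) = 0
var_anti = statistics.variance(anti)    # Cov(x1, x2) < 0
```

Because f is monotone, f(u) and f(1 - u) are negatively correlated, so var_anti should be well below var_plain while both sets of averages estimate the same mean.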
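③ Control variates:
the control variable is highly correlated with (similar to) the variable used in the simulation, but its properties are known prior to the simulation.
the adjustment adds a term whose mean is 0.
x* = y + (x̂ - ŷ)
(x̂: simulated estimate of the variable of interest; ŷ: simulated estimate of the control variable; y: known value of the control variable)
Var(x*) = Var(x̂) + Var(ŷ) - 2 Cov(x̂, ŷ)
To reduce the sampling error we need Var(x*) < Var(x̂),
that is, Var(ŷ) - 2 Cov(x̂, ŷ) < 0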
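A small numerical sketch of x* = y + (x̂ - ŷ) follows. Everything specific here is assumed for illustration: the target E[exp(U)], the control variable U itself with known mean y = 0.5, the experiment sizes, and the seed.

```python
import math
import random
import statistics

Y_KNOWN = 0.5  # known mean of the control variable U ~ U(0, 1)

def one_experiment(rng, n):
    """Return (x_hat, x_star) from one simulation of n draws."""
    us = [rng.random() for _ in range(n)]
    x_hat = statistics.fmean([math.exp(u) for u in us])  # plain estimate of E[exp(U)]
    y_hat = statistics.fmean(us)                         # simulated estimate of the control
    # The correction (Y_KNOWN - y_hat) has mean 0, so x_star keeps the same mean as x_hat
    x_star = Y_KNOWN + (x_hat - y_hat)
    return x_hat, x_star

rng = random.Random(0)
results = [one_experiment(rng, 200) for _ in range(500)]
x_hats = [r[0] for r in results]
x_stars = [r[1] for r in results]
var_x_hat = statistics.variance(x_hats)
var_x_star = statistics.variance(x_stars)
```

In this setup Var(ŷ) - 2 Cov(x̂, ŷ) is negative because exp(U) and U are strongly positively correlated, so var_x_star should come out smaller than var_x_hat.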
2. Bootstrapping: a resampling method
Features:
Generates simulated data
Samples repeatedly
Like simulation, makes full use of the observed data
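Differences:
Monte Carlo:
Uses the observed data to estimate key model parameters, such as the mean and standard deviation, and makes an assumption about their distribution.
Bootstrapping:
Uses the observed data directly to simulate a sample with similar characteristics. It does not need to model the observed data or make an assumption about its distribution.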
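The contrast between the two approaches can be sketched as follows. The observed values are made up, and the normal distribution used in the Monte Carlo branch is an assumed choice for illustration only.

```python
import random
import statistics

# Made-up observed data (for illustration only)
observed = [0.012, -0.008, 0.004, 0.021, -0.015, 0.007, -0.002, 0.010]
rng = random.Random(0)

# Monte Carlo: estimate parameters, assume a distribution (here normal), then simulate
mu = statistics.fmean(observed)
sigma = statistics.stdev(observed)
mc_sample = [rng.gauss(mu, sigma) for _ in observed]

# Bootstrapping: resample the observed data directly, with replacement
boot_sample = rng.choices(observed, k=len(observed))
```

The bootstrap sample contains only values that were actually observed, while the Monte Carlo sample contains new values drawn from the fitted distribution.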
One Simple Fact:
The distribution being resampled comes from the observed data
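Methods:
① iid bootstrap:
observations are independent of one another, with no autocorrelation
② Circular block bootstrap (CBB):
used because financial data are autocorrelated; resampling whole blocks keeps the dependence within each block
block size = sqrt(sample size)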
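Both methods can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the made-up return series, and the rounding of sqrt(sample size) to an integer block size are not from the notes.

```python
import math
import random

def iid_bootstrap(data, rng):
    """Resample single observations with replacement."""
    return [rng.choice(data) for _ in data]

def circular_block_bootstrap(data, rng, block_size=None):
    """Resample contiguous blocks, wrapping around the end of the series."""
    n = len(data)
    if block_size is None:
        block_size = max(1, round(math.sqrt(n)))  # block size = sqrt(sample size)
    sample = []
    while len(sample) < n:
        start = rng.randrange(n)
        # Indices wrap modulo n, so a block may run past the end back to the start
        sample.extend(data[(start + j) % n] for j in range(block_size))
    return sample[:n]

rng = random.Random(0)
# Made-up return series (for illustration only)
returns = [0.01, -0.02, 0.015, 0.003, -0.007, 0.012, -0.001, 0.004, -0.011]
iid_sample = iid_bootstrap(returns, rng)
cbb_sample = circular_block_bootstrap(returns, rng)
```

With 9 observations the block size is 3, so each run of three consecutive values in cbb_sample is a stretch of the original series in its original order.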
Limitations:
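The past may not reflect the future (heavy reliance on historical data)
Bootstrapping cannot generate values that do not appear in the sample
Both bootstrapping and simulation are exposed to the "Black Swan" problem
(A good statistical model should allow for the possibility of future losses larger than those realized in the past)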
Ineffective Situations:
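① Outliers in the data (whether they occur and how frequently)
② The iid bootstrap assumes the data are independent of one another, which does not hold when the data are autocorrelated.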
3. Random Number Generation:
Types:
① Truly random numbers:
time-consuming and difficult to generate
② Pseudorandom numbers:
computer-generated random numbers are in fact not random at all
they are produced by a deterministic formula (algorithm)
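Simplest type:
drawn from the uniform distribution U(0,1)
every value has an equal chance of being drawn
starts from an initial value (seed)
Problem: the initial value affects the characteristics of the generated distribution; the effect is large at first and eventually disappears
Solution:
generate more random numbers than required and discard the earliest observations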
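The seed and the discard step can be illustrated with a toy generator. The linear congruential generator below, its constants, and the function names are assumptions chosen for the example; it is a sketch of the idea, not a generator to rely on in practice.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield pseudorandom numbers in [0, 1) from a linear congruential generator."""
    state = seed
    while True:
        state = (a * state + c) % m  # deterministic formula: same seed, same sequence
        yield state / m

def draws(seed, n, burn_in=0):
    """Return n draws after discarding the first burn_in draws."""
    gen = lcg(seed)
    for _ in range(burn_in):
        next(gen)  # discard early observations that still reflect the seed
    return [next(gen) for _ in range(n)]

same_a = draws(seed=42, n=5)
same_b = draws(seed=42, n=5)
after_burn_in = draws(seed=42, n=5, burn_in=100)
first_seed0 = draws(seed=0, n=1)[0]
first_seed1 = draws(seed=1, n=1)[0]
```

same_a and same_b are identical, which shows that the numbers are not random at all. For this toy generator the first draws from seeds 0 and 1 differ by only a / m (about 0.0004), which is one way to see why the earliest observations are discarded.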
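Random number reuse:
Advantage: reduces the variability of the difference in the estimates across experiments.
Disadvantages:
① does not improve the accuracy of the individual estimates
② is unlikely to save a significant amount of time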
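The advantage above can be sketched by comparing two experiments run with the same draws and with fresh draws. The payoff max(Z + shift, 0), the shift of 0.1, the seeds, and the function names are assumptions made for illustration.

```python
import random
import statistics

def estimate(shift, seed, n=500):
    """Estimate E[max(Z + shift, 0)] with Z ~ N(0, 1) from n draws."""
    rng = random.Random(seed)
    return statistics.fmean([max(rng.gauss(0.0, 1.0) + shift, 0.0) for _ in range(n)])

def difference(trial, reuse):
    """Difference between experiment B (shift 0.1) and experiment A (shift 0.0)."""
    seed_a = trial
    # Reuse: B gets the same random numbers as A; otherwise B gets fresh draws
    seed_b = trial if reuse else trial + 10_000
    return estimate(0.1, seed_b) - estimate(0.0, seed_a)

diffs_reused = [difference(t, reuse=True) for t in range(200)]
diffs_fresh = [difference(t, reuse=False) for t in range(200)]
var_reused = statistics.variance(diffs_reused)
var_fresh = statistics.variance(diffs_fresh)
```

var_reused should be far smaller than var_fresh, because the shared draws cancel out of the difference. Each individual estimate is no more accurate than before, which matches the first disadvantage listed above.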
Disadvantages of simulation:
① many replications are required, which is computationally expensive
② unrealistic assumptions about the DGP (data generating process) make the simulation results less precise
③ results are hard to replicate
④ results are experiment-specific