Dirichlet Distribution & Process Notes


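1. The Dirichlet distribution is a distribution over probability distributions. Its support is the simplex, i.e. the domain of the Dirichlet probability density function is the set of points on the simplex. The Dirichlet distribution assigns a probability density to every point on the simplex, as shown in the figure below.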

                    Figure 1 Dirichlet Distribution (From wiki)
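To make "a point on the simplex" concrete, here is a minimal sketch that draws one sample and checks that it is a valid probability vector. It assumes the standard construction of a Dirichlet draw by normalizing independent Gamma variates; the helper name sample_dirichlet and the example alpha are made up for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one point from Dirichlet(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)  # fixed seed so the run is repeatable
theta = sample_dirichlet([2.0, 3.0, 5.0], rng)  # hypothetical alpha, chosen for illustration

# theta is itself a categorical distribution: non-negative entries that sum to 1
print(theta, sum(theta))
```

Every draw is one point on the 2-simplex (three components), which is the sense in which the Dirichlet is a "distribution over distributions".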

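2. The parameter alpha_i of the Dirichlet distribution acts as a pseudo-count, i.e. the number of times the i-th component of the categorical distribution has been selected. Since

E(theta_i) = alpha_i / sum_j(alpha_j), a larger alpha_i means a higher expected probability theta_i.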
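The expectation formula can be checked numerically. The sketch below compares alpha_i / sum_j(alpha_j) with the average of many samples; the sampler again assumes the Gamma-normalization construction, and the function names, alpha values and sample count are arbitrary choices for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one point from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def expected_theta(alpha):
    """Closed-form mean: E(theta_i) = alpha_i / sum_j(alpha_j)."""
    total = sum(alpha)
    return [a / total for a in alpha]

def empirical_mean(alpha, n_samples, rng):
    """Average of n_samples Dirichlet draws, component by component."""
    sums = [0.0] * len(alpha)
    for _ in range(n_samples):
        for i, t in enumerate(sample_dirichlet(alpha, rng)):
            sums[i] += t
    return [s / n_samples for s in sums]

alpha = [1.0, 2.0, 7.0]  # pseudo-counts; the third component was "seen" most often
exact = expected_theta(alpha)
approx = empirical_mean(alpha, 20000, random.Random(0))
print(exact)
print(approx)
```

The component with the largest pseudo-count ends up with the largest expected probability, and the sample average should land close to the closed-form value.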


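3. The density at each point of the Dirichlet distribution is determined by the parameter alpha. We usually use a symmetric Dirichlet, i.e. all components of alpha are equal. The single shared value is then called the concentration parameter.

From Wikipedia: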

When alpha = 1, the symmetric Dirichlet distribution is equivalent to a uniform distribution over the open standard (K-1)-simplex, i.e. it is uniform over all points in its support. Values of the concentration parameter above 1 prefer variates that are dense, evenly distributed distributions, i.e. all the values within a single sample are similar to each other. Values of the concentration parameter below 1 prefer sparse distributions, i.e. most of the values within a single sample will be close to 0, and the vast majority of the mass will be concentrated in a few of the values.



                                                            Fig.2
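The dense-versus-sparse behaviour of the symmetric Dirichlet can be seen in a small experiment. This is only a sketch: it uses the average size of the largest component as a rough, made-up proxy for sparsity, and K = 10, the three concentration values and the sample count are arbitrary.

```python
import random

def sample_symmetric_dirichlet(concentration, k, rng):
    """Draw from a symmetric Dirichlet with all k parameters equal to `concentration`."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def average_largest_component(concentration, k, n_samples, rng):
    """Mean of max(theta) over many draws; near 1 means sparse, near 1/k means even."""
    total = 0.0
    for _ in range(n_samples):
        total += max(sample_symmetric_dirichlet(concentration, k, rng))
    return total / n_samples

rng = random.Random(0)
k = 10
sparse = average_largest_component(0.1, k, 2000, rng)   # concentration below 1
uniform = average_largest_component(1.0, k, 2000, rng)  # uniform over the simplex
dense = average_largest_component(10.0, k, 2000, rng)   # concentration above 1
print(sparse, uniform, dense)
```

With a concentration below 1 a single component tends to take most of the mass, while with a concentration above 1 the largest component stays much closer to 1/K.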


4. If the dimension of the Dirichlet's simplex tends to infinity, the generated samples are draws from a Dirichlet process, i.e. the process is the extension of the distribution to infinitely many dimensions.


The Dirichlet process is the infinite-dimensional generalization of the Dirichlet distribution.
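The infinite-dimensional limit can be approximated with a finite but large dimension K. The notes do not say how alpha should behave as K grows; the sketch below assumes one common choice, a symmetric Dirichlet whose parameters are total_alpha / K, so that the pseudo-counts always sum to total_alpha. The function names and the 99% threshold are illustrative only.

```python
import random

def sample_symmetric_dirichlet(total_alpha, k, rng):
    """Draw from Dirichlet(total_alpha/k, ..., total_alpha/k) via normalized Gamma draws."""
    shape = total_alpha / k  # assumption: parameters shrink as the dimension grows
    draws = [rng.gammavariate(shape, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def components_for_mass(weights, mass=0.99):
    """Number of largest components needed to cover `mass` of the total probability."""
    covered = 0.0
    for n, w in enumerate(sorted(weights, reverse=True), start=1):
        covered += w
        if covered >= mass:
            return n
    return len(weights)

rng = random.Random(0)
for k in (10, 100, 1000, 10000):
    weights = sample_symmetric_dirichlet(2.0, k, rng)
    # the dimension grows, but the count of components that matter stays small
    print(k, components_for_mass(weights))
```

Under this scaling, almost all of the mass sits on a handful of components no matter how large K becomes, which matches the discrete, few-atoms character of a draw from a Dirichlet process.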


Ref: 

http://www.cs.cmu.edu/~epxing/Class/10701-08s/recitation/dirichlet.pdf




