In statistics, a copula is a general way of formulating a multivariate distribution so that various general types of dependence can be represented.[1] Other ways of formulating multivariate distributions include conceptually based approaches, in which the real-world meaning of the variables is used to imply what types of relationships might occur. In contrast, the copula approach is less tied to any such interpretation, but it allows much more general types of dependence to be included than a conceptual approach would usually invoke.
The approach to formulating a multivariate distribution using a copula is based on the idea that a simple transformation can be made of each marginal variable in such a way that each transformed marginal variable has a uniform distribution. Once this is done, the dependence structure can be expressed as a multivariate distribution on the obtained uniforms, and a copula is precisely a multivariate distribution on marginally uniform random variables. When applied in a practical context, the above transformations might be fitted as an initial step for each margin, or the parameters of the transformations might be fitted jointly with those of the copula.
There are many families of copulas which differ in the detail of the dependence they represent. A family will typically have several parameters which relate to the strength and form of the dependence. Some families of copulas are outlined below. A typical use for copulas is to choose one such family and use it to define the multivariate distribution to be used, typically in fitting a distribution to a sample of data. However, it is possible to derive the copula corresponding to any given multivariate distribution.
Consider two random variables $X$ and $Y$, with continuous cumulative distribution functions $F_X$ and $F_Y$. The probability integral transform can be applied separately to the two random variables to define $X' = F_X(X)$ and $Y' = F_Y(Y)$. It follows that $X'$ and $Y'$ both have uniform distributions but are, in general, dependent. Since the transforms are invertible, specifying the dependence between $X$ and $Y$ is, in a way, the same as specifying the dependence between $X'$ and $Y'$. With $X'$ and $Y'$ being uniform random variables, the problem reduces to specifying a bivariate distribution on two uniforms, that is, a copula. So the idea is to simplify the problem by removing consideration of many different marginal distributions: the marginal variates are transformed to uniforms, and dependence is then specified as a multivariate distribution on the uniforms.
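The transform described above can be tried numerically. The following is a minimal sketch under assumed choices that are not part of the original text: $X$ is exponential with rate 1 and $Y = X + E$ with $E$ an independent exponential, so that $F_Y$ is the Gamma(2, 1) distribution function. The function names are hypothetical.

```python
import math
import random

def transformed_pairs(n, seed=0):
    """Draw dependent (X, Y) and return (X', Y') = (F_X(X), F_Y(Y))."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.expovariate(1.0)            # X ~ Exp(1) (assumed margin)
        y = x + rng.expovariate(1.0)        # Y = X + E, so Y ~ Gamma(2, 1)
        u = 1.0 - math.exp(-x)              # F_X(x)
        v = 1.0 - math.exp(-y) * (1.0 + y)  # F_Y(y)
        pairs.append((u, v))
    return pairs

def mean(values):
    return sum(values) / len(values)

def pearson(pairs):
    """Pearson correlation of the two coordinates."""
    us = [p[0] for p in pairs]
    vs = [p[1] for p in pairs]
    mu, mv = mean(us), mean(vs)
    cov = mean([(a - mu) * (b - mv) for a, b in pairs])
    var_u = mean([(a - mu) ** 2 for a in us])
    var_v = mean([(b - mv) ** 2 for b in vs])
    return cov / math.sqrt(var_u * var_v)

pairs = transformed_pairs(5000)
mean_u = mean([p[0] for p in pairs])  # close to 0.5 for a uniform margin
mean_v = mean([p[1] for p in pairs])  # close to 0.5 for a uniform margin
corr = pearson(pairs)                 # clearly positive: X' and Y' are dependent
```

Each transformed margin looks uniform on its own, while the correlation between the two shows that the dependence between $X$ and $Y$ has been carried over to the uniforms.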
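A copula is a multivariate joint distribution defined on the n-dimensional unit cube $[0,1]^n$ such that every marginal distribution is uniform on the interval $[0,1]$.
Specifically, $C:[0,1]^n \to [0,1]$ is an n-dimensional copula (briefly, n-copula) if:
(i) $C(\mathbf{u}) = 0$ whenever $\mathbf{u} \in [0,1]^n$ has at least one component equal to 0;
(ii) $C(\mathbf{u}) = u_i$ whenever all components of $\mathbf{u}$ are equal to 1 except the $i$-th one, which is equal to $u_i$;
(iii) $C$ is n-increasing, i.e., for each hyperrectangle $B = \prod_{i=1}^{n} [x_i, y_i] \subseteq [0,1]^n$,
$$V_C(B) := \sum_{\mathbf{z} \in \prod_{i=1}^{n} \{x_i, y_i\}} (-1)^{N(\mathbf{z})} C(\mathbf{z}) \ge 0,$$
where $N(\mathbf{z}) = \#\{k : z_k = x_k\}$ and $V_C(B)$ is the so-called C-volume of $B$.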
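The C-volume of a box is an alternating sum of the copula over the box's $2^n$ vertices, with a minus sign for each coordinate taken at its lower end. A small sketch follows; the helper names are hypothetical, and the product of the coordinates is used as an example copula because its C-volume equals the ordinary volume of the box.

```python
from itertools import product

def c_volume(C, lower, upper):
    """Signed sum of C over the vertices of the box prod_k [lower[k], upper[k]]."""
    n = len(lower)
    total = 0.0
    for picks in product((0, 1), repeat=n):
        # picks[k] == 1 selects the upper end of coordinate k, 0 the lower end
        vertex = [upper[k] if picks[k] else lower[k] for k in range(n)]
        n_lower = n - sum(picks)  # number of coordinates at their lower end
        total += (-1) ** n_lower * C(*vertex)
    return total

def product_copula(*u):
    """C(u_1, ..., u_n) = u_1 * ... * u_n (independent uniforms)."""
    out = 1.0
    for ui in u:
        out *= ui
    return out

# Box [0.2, 0.6] x [0.1, 0.4] x [0.5, 1.0]; ordinary volume is 0.4 * 0.3 * 0.5
vol = c_volume(product_copula, [0.2, 0.1, 0.5], [0.6, 0.4, 1.0])
```

A function fails to be a copula as soon as some box has a negative C-volume, so a check like this over a grid of boxes is a cheap necessary test, not a proof.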
The theorem proposed by Sklar [2] underlies most applications of the copula. Sklar's theorem states that given a joint distribution function H for p variables, and respective marginal distribution functions, there exists a copula C such that the copula binds the margins to give the joint distribution.
For the bivariate case, Sklar's theorem can be stated as follows. For any bivariate distribution function $H(x,y)$, let $F(x) = H(x,\infty)$ and $G(y) = H(\infty,y)$ be the univariate marginal probability distribution functions. Then there exists a copula $C$ such that
$$H(x,y) = C(F(x), G(y))$$
(where we have identified the distribution $C$ with its cumulative distribution function). Moreover, if the marginal distributions $F(x)$ and $G(y)$ are continuous, the copula function $C$ is unique. Otherwise, the copula $C$ is uniquely determined only on the product of the ranges of $F$ and $G$.
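Read in the other direction, Sklar's theorem gives a recipe: pick any copula and any margins, and their composition is a joint distribution function with exactly those margins. The sketch below assumes an exponential margin, a standard normal margin and a Clayton copula with parameter 1; all three are illustrative choices, not taken from the text above.

```python
import math
from statistics import NormalDist

def clayton(u, v, theta=1.0):
    """Clayton copula, used here only as an example of a valid copula."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def F(x):
    """Assumed first margin: exponential with rate 1."""
    return 1.0 - math.exp(-x) if x > 0.0 else 0.0

def G(y):
    """Assumed second margin: standard normal."""
    return NormalDist().cdf(y)

def H(x, y):
    """Joint distribution function built as H(x, y) = C(F(x), G(y))."""
    return clayton(F(x), G(y))

# Sending one argument far to the right recovers the other margin.
margin_x = H(1.3, 50.0)    # approximately F(1.3)
margin_y = H(1000.0, 0.2)  # approximately G(0.2)
```

Swapping in different margins changes the marginal behaviour without touching the dependence structure, which is the practical appeal of the construction.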
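Minimum copula: This is the lower bound for all copulas. In the bivariate case only, it represents perfect negative dependence between variates:
$$W(u,v) = \max(0,\, u + v - 1).$$
For n-variate copulas, the lower bound is given by
$$W(u_1,\ldots,u_n) := \max\left\{1 - n + \sum_{i=1}^{n} u_i,\; 0\right\} \le C(u_1,\ldots,u_n).$$
Maximum copula: This is the upper bound for all copulas. It represents perfect positive dependence between variates:
$$M(u,v) = \min(u,v).$$
For n-variate copulas, the upper bound is given by
$$C(u_1,\ldots,u_n) \le \min_{j \in \{1,\ldots,n\}} u_j =: M(u_1,\ldots,u_n).$$
Conclusion: For all copulas $C(u,v)$,
$$W(u,v) \le C(u,v) \le M(u,v).$$
In the multivariate case, the corresponding inequality is
$$W(u_1,\ldots,u_n) \le C(u_1,\ldots,u_n) \le M(u_1,\ldots,u_n).$$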
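The bounds are easy to check numerically for a candidate function. The following sketch evaluates the lower and upper bounds on a grid and tests whether a given function stays between them; the grid resolution and function names are arbitrary choices made for illustration.

```python
from itertools import product

def W(*u):
    """Lower bound: max(1 - n + sum(u), 0)."""
    return max(sum(u) - len(u) + 1.0, 0.0)

def M(*u):
    """Upper bound: the smallest coordinate."""
    return min(u)

def independence(*u):
    """Product of the coordinates, a copula in every dimension."""
    out = 1.0
    for ui in u:
        out *= ui
    return out

def within_bounds(C, n, steps=10, tol=1e-12):
    """True if W <= C <= M at every point of a regular grid on [0,1]^n."""
    grid = [k / steps for k in range(steps + 1)]
    return all(W(*u) - tol <= C(*u) <= M(*u) + tol
               for u in product(grid, repeat=n))

ok_2d = within_bounds(independence, 2)
ok_3d = within_bounds(independence, 3)
# The arithmetic mean is not a copula and violates the upper bound, e.g. at (0, 0.5).
bad = within_bounds(lambda u, v: (u + v) / 2.0, 2)
```

Passing this check is necessary but not sufficient for a function to be a copula.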
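One example of a copula often used for modelling in finance is the Gaussian copula, which is constructed from the bivariate normal distribution via Sklar's theorem. With $\Phi_\rho$ being the standard bivariate normal cumulative distribution function with correlation $\rho$, the Gaussian copula function is
$$C_\rho(u,v) = \Phi_\rho\left(\Phi^{-1}(u), \Phi^{-1}(v)\right),$$
where $u, v \in [0,1]$ and $\Phi$ denotes the standard normal cumulative distribution function.
Differentiating $C_\rho$ yields the copula density function:
$$c_\rho(u,v) = \frac{\varphi_{X,Y,\rho}\left(\Phi^{-1}(u), \Phi^{-1}(v)\right)}{\varphi\left(\Phi^{-1}(u)\right)\,\varphi\left(\Phi^{-1}(v)\right)},$$
where
$$\varphi_{X,Y,\rho}(x,y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left(-\frac{x^2 + y^2 - 2\rho x y}{2(1-\rho^2)}\right)$$
is the density function for the standard bivariate Gaussian with Pearson's product-moment correlation coefficient $\rho$, and $\varphi$ is the standard normal density.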
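A Gaussian copula can be sampled by drawing a correlated standard normal pair and pushing each coordinate through the standard normal distribution function; its density is the ratio of the bivariate normal density to the product of the two standard normal densities. The sketch below uses only the Python standard library, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf, inv_cdf and pdf

def gaussian_copula_sample(n, rho, seed=0):
    """Draw n pairs (u, v) from the bivariate Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = z1
        y = rho * z1 + math.sqrt(1.0 - rho * rho) * z2  # corr(x, y) = rho
        pairs.append((_STD.cdf(x), _STD.cdf(y)))
    return pairs

def gaussian_copula_density(u, v, rho):
    """Copula density at (u, v); u and v must lie strictly inside (0, 1)."""
    x = _STD.inv_cdf(u)
    y = _STD.inv_cdf(v)
    det = 1.0 - rho * rho
    joint = math.exp(-(x * x + y * y - 2.0 * rho * x * y) / (2.0 * det))
    joint /= 2.0 * math.pi * math.sqrt(det)
    return joint / (_STD.pdf(x) * _STD.pdf(y))

samples = gaussian_copula_sample(4000, rho=0.8)
# Share of points where both coordinates fall on the same side of 0.5
concordant = sum((u > 0.5) == (v > 0.5) for u, v in samples) / len(samples)
flat = gaussian_copula_density(0.3, 0.7, 0.0)  # rho = 0: density is 1 everywhere
```

Evaluating the copula function itself requires the bivariate normal distribution function, which the standard library does not provide, so it is left out of this sketch.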
Archimedean copulas are an important family of copulas. They have a simple form, have properties such as associativity, and can represent a variety of dependence structures. Unlike elliptical copulas (e.g. the Gaussian copula), most Archimedean copulas have closed-form expressions and are not derived from multivariate distribution functions using Sklar's theorem.
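One particularly simple form of an n-dimensional copula is
$$C(u_1,\ldots,u_n) = \Psi^{-1}\left(\sum_{i=1}^{n} \Psi(u_i)\right),$$
where $\Psi$ is known as a generator function. Such copulas are known as Archimedean. Any generator function which satisfies the properties below is the basis for a valid bivariate copula (in dimensions above two, further conditions on $\Psi^{-1}$ are needed):
$$\Psi(1) = 0; \qquad \lim_{x \to 0} \Psi(x) = \infty; \qquad \Psi'(x) < 0; \qquad \Psi''(x) > 0.$$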
Product copula: Also called the independent copula, this copula has no dependence between variates. Its density function is unity everywhere:
$$\Pi(u,v) = uv.$$
It is Archimedean, with generator $\Psi(t) = -\ln t$.
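The Archimedean construction translates directly into code: given a generator and its inverse, the copula is the inverse applied to the sum of the generator values. The sketch below uses a hypothetical helper `archimedean` and checks it on the generator $-\ln t$, for which the construction returns the product of the arguments.

```python
import math

def archimedean(psi, psi_inv):
    """Build C(u_1, ..., u_n) = psi_inv(psi(u_1) + ... + psi(u_n))."""
    def C(*u):
        if min(u) <= 0.0:
            return 0.0  # the generator is infinite at 0, so C vanishes there
        return psi_inv(sum(psi(ui) for ui in u))
    return C

# Generator -ln(t) with inverse exp(-s): the result is the product copula.
independence = archimedean(lambda t: -math.log(t), lambda s: math.exp(-s))

value_2d = independence(0.3, 0.5)       # approximately 0.15
value_3d = independence(0.3, 0.5, 0.8)  # approximately 0.12
```

Passing a different generator pair to the same helper yields a different Archimedean copula without any other change to the code.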
Where the generator function is indexed by a parameter, a whole family of copulas may be Archimedean. For example:
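Clayton copula:
$$C_\theta(u,v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}, \qquad \Psi_\theta(t) = \frac{1}{\theta}\left(t^{-\theta} - 1\right), \qquad \theta > 0.$$
In the limit $\theta \to 0$, the Clayton copula reduces to the product copula and the random variables are statistically independent. The generator function approach can be extended to create multivariate copulas by simply including more additive terms.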
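One way to draw from a bivariate Clayton copula is conditional inversion: draw $u$ uniformly, draw a second uniform $w$, and solve $\partial C_\theta / \partial u = w$ for $v$, which has a closed form for this family. The sketch below assumes $\theta > 0$ and uses hypothetical function names; it compares the empirical share of points in the lower-left quadrant with the value of the copula at $(0.5, 0.5)$.

```python
import random

def clayton_cdf(u, v, theta):
    """Clayton copula C_theta(u, v) for theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def clayton_sample(n, theta, seed=0):
    """Draw n pairs by conditional inversion (assumes theta > 0)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
        w = 1.0 - rng.random()  # conditional probability level
        # Closed-form solution of dC/du (u, v) = w for v
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs

theta = 2.0
samples = clayton_sample(4000, theta)
# Empirical P(U <= 0.5, V <= 0.5) versus the copula value at (0.5, 0.5)
empirical = sum(u <= 0.5 and v <= 0.5 for u, v in samples) / len(samples)
target = clayton_cdf(0.5, 0.5, theta)
```

With independent uniforms the quadrant probability would be 0.25; the larger value here reflects the lower-tail clustering that the Clayton family is typically chosen for.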
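Gumbel copula:
$$C_\theta(u,v) = \exp\left(-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right), \qquad \Psi_\theta(t) = (-\ln t)^{\theta}, \qquad \theta \ge 1.$$
Frank copula:
$$C_\theta(u,v) = -\frac{1}{\theta} \ln\left(1 + \frac{(e^{-\theta u} - 1)(e^{-\theta v} - 1)}{e^{-\theta} - 1}\right), \qquad \Psi_\theta(t) = -\ln\left(\frac{e^{-\theta t} - 1}{e^{-\theta} - 1}\right), \qquad \theta \ne 0.$$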
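Both families have closed forms that are simple to evaluate. The sketch below implements the standard bivariate Gumbel and Frank copula functions (function names are hypothetical) and checks the uniform-margin property $C(u, 1) = u$, together with the independence cases: $\theta = 1$ for Gumbel and $\theta$ close to 0 for Frank.

```python
import math

def gumbel(u, v, theta):
    """Gumbel copula, theta >= 1."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def frank(u, v, theta):
    """Frank copula, theta != 0; expm1/log1p keep it accurate for small theta."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta

gumbel_margin = gumbel(0.37, 1.0, 2.5)  # approximately 0.37
frank_margin = frank(0.37, 1.0, 4.0)    # approximately 0.37
gumbel_indep = gumbel(0.3, 0.6, 1.0)    # theta = 1: approximately 0.3 * 0.6
frank_indep = frank(0.3, 0.6, 1e-6)     # theta near 0: approximately 0.3 * 0.6
```
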
Aurélien Alfonsi and Damiano Brigo (2005)[3] introduced new families of copulas based on periodic functions. They noticed that if $f$ is a 1-periodic non-negative function that integrates to 1 over $[0,1]$ and $F$ is a double primitive of $f$ (that is, $F'' = f$, normalized so that $F(0) = 0$), then both
$$F(x+y) - F(x) - F(y) \qquad \text{and} \qquad -F(x-y) + F(x) + F(-y)$$
are copula functions, the second one not necessarily exchangeable. This may be a tool to introduce asymmetric dependence, which is absent in most known copula functions.
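As a concrete instance of this construction, the sketch below assumes $f(x) = 1 + a \sin(2\pi x)$ with $|a| \le 1$, which is 1-periodic, non-negative and integrates to 1 over $[0,1]$. A double primitive with $F(0) = 0$ is $F(x) = x^2/2 - a \sin(2\pi x)/(4\pi^2)$. The choice of $f$ and the names are illustrative and not taken from the cited paper.

```python
import math

A = 0.5  # amplitude of the periodic perturbation; needs |A| <= 1 so that f >= 0

def F(x, a=A):
    """Double primitive of f(x) = 1 + a*sin(2*pi*x), with F(0) = 0."""
    return 0.5 * x * x - a * math.sin(2.0 * math.pi * x) / (4.0 * math.pi ** 2)

def c_sym(x, y):
    """First construction: F(x + y) - F(x) - F(y); symmetric in x and y."""
    return F(x + y) - F(x) - F(y)

def c_asym(x, y):
    """Second construction: -F(x - y) + F(x) + F(-y); not exchangeable here."""
    return -F(x - y) + F(x) + F(-y)

# Uniform margins: C(x, 1) = x and C(1, y) = y for both constructions.
margins = [c_sym(0.3, 1.0), c_sym(1.0, 0.3), c_asym(0.3, 1.0), c_asym(1.0, 0.3)]
# Swapping the arguments changes the value of the second construction.
asymmetry = c_asym(0.2, 0.7) - c_asym(0.7, 0.2)
```

With a cosine in place of the sine, $F$ would be even and the second construction would be symmetric as well, so the asymmetry depends on the choice of $f$.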
When analysing data with an unknown underlying distribution, one can transform the empirical data distribution into an "empirical copula" by warping it such that the marginal distributions become uniform.[1] Mathematically, the empirical copula frequency function is calculated by
$$C_n\left(\frac{i}{n}, \frac{j}{n}\right) = \frac{\text{number of pairs } (x,y) \text{ in the sample such that } x \le x_{(i)} \text{ and } y \le y_{(j)}}{n},$$
where $x_{(i)}$ represents the $i$th order statistic of $x$ and $n$ is the sample size.
Less formally, simply replace the data along each dimension with the data ranks divided by $n$.
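Both descriptions can be sketched in a few lines: one helper replaces each observation by its rank divided by $n$, the other counts the sample pairs below a given pair of order statistics. The tiny data set and function names below are made up for illustration, and the rank helper assumes there are no ties.

```python
def pseudo_observations(xs):
    """Replace each value by its rank (1..n) divided by n; assumes no ties."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return [r / n for r in ranks]

def empirical_copula(xs, ys, i, j):
    """C_n(i/n, j/n): share of pairs with x <= x_(i) and y <= y_(j), 1 <= i, j <= n."""
    n = len(xs)
    x_i = sorted(xs)[i - 1]  # i-th order statistic of x
    y_j = sorted(ys)[j - 1]  # j-th order statistic of y
    return sum(1 for x, y in zip(xs, ys) if x <= x_i and y <= y_j) / n

xs = [2.1, 0.4, 3.3, 1.7, 5.0]       # illustrative sample
ys = [10.0, 30.0, 20.0, 50.0, 40.0]  # illustrative sample
u = pseudo_observations(xs)  # [0.6, 0.2, 0.8, 0.4, 1.0]
v = pseudo_observations(ys)  # [0.2, 0.6, 0.4, 1.0, 0.8]
c_33 = empirical_copula(xs, ys, 3, 3)  # 0.4: two of the five pairs qualify
```

Because only ranks enter the calculation, any strictly increasing transformation of either variable leaves the empirical copula unchanged.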
Copulas are used in the pricing of collateralized debt obligations (CDOs).[4] Dependence modelling with copula functions is widely used in financial risk assessment and actuarial analysis. They have also been applied to database formulation for the reliability analysis of highway bridges and to various multivariate simulation studies in civil, mechanical and offshore engineering.[citation needed] The methodology of applying the Gaussian copula to credit derivatives, as developed by David X. Li, is said to be the reason behind the global financial crisis of 2008–2009.[5]