In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) can be interpreted as providing a relative likelihood that the value of the random variable would equal that sample. In other words, while the absolute likelihood for a continuous random variable to take on any particular value is 0 (since there is an infinite set of possible values to begin with), the values of the PDF at two different samples can be used to infer, in any particular draw of the random variable, how much more likely it is that the random variable would be close to one sample compared to the other.
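The relative-likelihood reading can be checked numerically. The sketch below assumes a standard normal distribution purely for illustration (the distribution, the function names, and the window half-width `eps` are not from the text). It compares the ratio of density values at two samples with the ratio of the probabilities of landing in a tiny window around each sample.

```python
import math

def std_normal_pdf(x):
    """Density of the standard normal distribution (an illustrative choice)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    """P(X <= x) for the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Ratio of the density values at the two samples 0 and 1.
density_ratio = std_normal_pdf(0.0) / std_normal_pdf(1.0)

# Ratio of the probabilities of landing within +/- eps of each sample.
eps = 1e-4  # assumed half-width of the small window
p_near_0 = std_normal_cdf(0.0 + eps) - std_normal_cdf(0.0 - eps)
p_near_1 = std_normal_cdf(1.0 + eps) - std_normal_cdf(1.0 - eps)
window_ratio = p_near_0 / p_near_1

print(f"f(0) / f(1)           = {density_ratio:.6f}")
print(f"P(near 0) / P(near 1) = {window_ratio:.6f}")
```

Both ratios come out nearly equal, even though the probability of hitting either sample exactly is zero.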
More precisely, the PDF specifies the probability that the random variable falls within a particular range of values, as opposed to taking on any one value. This probability is given by the integral of the variable's PDF over that range; that is, by the area under the density function, above the horizontal axis, and between the lowest and greatest values of the range. The probability density function is nonnegative everywhere, and its integral over the entire space is equal to one.
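As a minimal sketch of "probability as area under the density", the code below integrates an assumed standard normal density with a simple midpoint rule. The density, the helper names, and the integration limits are illustrative assumptions, not part of the definition.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density, used here only as an example of a PDF."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the composite midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Probability that the variable falls in the range [-1, 1].
p_interval = integrate(normal_pdf, -1.0, 1.0)

# Integral over (effectively) the whole space; the tails beyond +/-10 are negligible.
p_total = integrate(normal_pdf, -10.0, 10.0)

print(f"P(-1 <= X <= 1)    ~ {p_interval:.6f}")
print(f"integral over all x ~ {p_total:.6f}")
```

The second value should come out very close to one, matching the normalization property stated above.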
The terms "probability distribution function"[2] and "probability function"[3] have also sometimes been used to denote the probability density function. However, this use is not standard among probabilists and statisticians. In other sources, "probability distribution function" may be used when the probability distribution is defined as a function over general sets of values, or it may refer to the cumulative distribution function, or to a probability mass function (PMF) rather than the density. "Density function" itself is also used for the probability mass function, leading to further confusion.[4] In general, though, the PMF is used in the context of discrete random variables (random variables that take values on a discrete set), while the PDF is used in the context of continuous random variables.
Suppose a species of bacteria typically lives 4 to 6 hours. What is the probability that a bacterium lives exactly 5 hours? The answer is 0%. A lot of bacteria live for approximately 5 hours, but there is no chance that any given bacterium dies at exactly 5.0000000000... hours.
Instead one might ask: What is the probability that the bacterium dies between 5 hours and 5.01 hours? Suppose the answer is 0.02 (i.e., 2%). Next: What is the probability that the bacterium dies between 5 hours and 5.001 hours? The answer should be about 0.002, since this time interval is one-tenth as long as the previous. The probability that the bacterium dies between 5 hours and 5.0001 hours should be about 0.0002, and so on.
In these three examples, the ratio (probability of dying during an interval) / (duration of the interval) is approximately constant, and equal to 2 per hour (or 2 hour⁻¹). For example, there is 0.02 probability of dying in the 0.01-hour interval between 5 and 5.01 hours, and (0.02 probability / 0.01 hours) = 2 hour⁻¹. This quantity 2 hour⁻¹ is called the probability density for dying at around 5 hours.
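The near-constant ratio can be reproduced with a concrete model. The sketch below assumes, purely for illustration, that lifetimes follow a normal distribution centred at 5 hours whose peak density is 2 hour⁻¹. The text does not specify any particular distribution, so the model and all names are hypothetical.

```python
import math

# Hypothetical lifetime model (an assumption, not from the text): a normal
# distribution centred at 5 hours whose peak density is exactly 2 per hour.
PEAK_DENSITY = 2.0                                        # hour^-1
SIGMA = 1.0 / (PEAK_DENSITY * math.sqrt(2.0 * math.pi))   # hours

def prob_dies_within(duration):
    """P(5 <= T <= 5 + duration) under the assumed model, for duration >= 0."""
    return 0.5 * math.erf(duration / (SIGMA * math.sqrt(2.0)))

ratios = []
for duration in (0.01, 0.001, 0.0001):
    p = prob_dies_within(duration)
    ratios.append(p / duration)
    print(f"interval {duration:>7} h: probability {p:.6f}, ratio {p / duration:.6f} per hour")
```

Under this assumed model the ratio stays close to 2 hour⁻¹ and moves closer as the interval shrinks, which is the behaviour the three examples describe.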
Therefore, in response to the question "What is the probability that the bacterium dies at 5 hours?", a literally correct but unhelpful answer is "0", but a better answer can be written as (2 hour⁻¹) dt. This is the probability that the bacterium dies within a small (infinitesimal) window of time around 5 hours, where dt is the duration of this window.
For example, the probability that it lives longer than 5 hours, but shorter than (5 hours + 1 nanosecond), is (2 hour⁻¹) × (1 nanosecond) ≈ 6×10⁻¹³ (using the unit conversion 3.6×10¹² nanoseconds = 1 hour).
There is a probability density function f with f(5 hours) = 2 hour⁻¹. The integral of f over any window of time (not only infinitesimal windows but also large windows) is the probability that the bacterium dies in that window.
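To illustrate integrating f over windows of different sizes, the sketch below uses a hypothetical density: a normal curve centred at 5 hours with f(5) = 2 hour⁻¹. The text only fixes the value of f at 5 hours, so the shape of f, the helper `prob_in_window`, and the chosen windows are assumptions.

```python
import math

# Hypothetical density (an assumption): normal, centred at 5 hours, with
# peak value f(5) = 2 per hour as in the bacterium example.
MEAN = 5.0                                        # hours
SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.pi))    # hours, chosen so f(5) = 2

def f(t):
    """Assumed probability density of the time of death, in hour^-1."""
    z = (t - MEAN) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))

def prob_in_window(start, width, n=2000):
    """Integrate f over [start, start + width] with the composite midpoint rule."""
    h = width / n
    return h * sum(f(start + (i + 0.5) * h) for i in range(n))

NANOSECOND_IN_HOURS = 1.0 / 3.6e12   # 3.6e12 nanoseconds = 1 hour

# Tiny window: the integral is essentially density times duration.
p_ns = prob_in_window(5.0, NANOSECOND_IN_HOURS)
p_ns_approx = f(5.0) * NANOSECOND_IN_HOURS

# Large windows: the density varies, so the full integral is needed.
p_large = prob_in_window(4.8, 0.4)   # between 4.8 and 5.2 hours
p_all = prob_in_window(3.0, 4.0)     # between 3 and 7 hours, nearly everything

print(f"1 ns window after 5 h: {p_ns:.3e} (density x dt = {p_ns_approx:.3e})")
print(f"4.8 h to 5.2 h:        {p_large:.4f}")
print(f"3 h to 7 h:            {p_all:.6f}")
```

For the nanosecond window the integral and the product (2 hour⁻¹) × dt agree closely. For the wider windows the result depends on the whole assumed shape of f, not only on its value at 5 hours.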