Marginal distribution
In many situations, we are interested only in the probability distribution of a subset of the random variables. For example, in the heart disease problem mentioned in the previous section, if we want to infer the probability of people in a population having heart disease as a function of their age only, we need to integrate out the effect of the other random variables, such as blood pressure and diabetes. This is called marginalization:
$$P(x_1, \ldots, x_M) = \int P(x_1, \ldots, x_M, x_{M+1}, \ldots, x_N)\, dx_{M+1} \cdots dx_N$$
Or, for discrete random variables:
$$P(x_1, \ldots, x_M) = \sum_{x_{M+1}} \cdots \sum_{x_N} P(x_1, \ldots, x_M, x_{M+1}, \ldots, x_N)$$
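As a minimal sketch of marginalization in the discrete case, the following Python snippet sums a small joint probability table over the variables we are not interested in. The joint distribution, its numbers, and the helper `marginalize` are all hypothetical and chosen only for illustration; they are not taken from the heart disease example itself.

```python
from collections import defaultdict

# Hypothetical joint distribution P(age_group, high_bp, disease).
# The probabilities are invented for illustration and sum to 1.
joint = {
    ("young", False, False): 0.30,
    ("young", False, True): 0.02,
    ("young", True, False): 0.06,
    ("young", True, True): 0.02,
    ("old", False, False): 0.25,
    ("old", False, True): 0.08,
    ("old", True, False): 0.15,
    ("old", True, True): 0.12,
}


def marginalize(joint, keep):
    """Sum the joint over every variable whose position is not in `keep`."""
    marginal = defaultdict(float)
    for outcome, prob in joint.items():
        key = tuple(outcome[i] for i in keep)
        marginal[key] += prob
    return dict(marginal)


# Blood pressure is summed out: P(age_group, disease).
p_age_disease = marginalize(joint, keep=(0, 2))

# Blood pressure and disease are summed out: P(age_group).
p_age = marginalize(joint, keep=(0,))

# Probability of disease as a function of age only. Blood pressure no
# longer appears because it has already been marginalized away.
for age in ("young", "old"):
    p_disease_given_age = p_age_disease[(age, True)] / p_age[(age,)]
    print(age, round(p_disease_given_age, 3))
```

Note that the last step divides one marginal by another, so it combines marginalization (removing blood pressure) with conditioning (fixing the age group).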
Note that a marginal distribution is very different from a conditional distribution. In a conditional distribution, we find the probability of a subset of random variables with the other random variables fixed (conditioned) at given values. In a marginal distribution, we eliminate the effect of a subset of random variables by integrating them out of the joint distribution (in the sense of averaging over their effect). For example, in the case of a two-dimensional normal distribution, marginalization with respect to one variable results in a one-dimensional normal distribution of the other variable, as follows:
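$$\int_{-\infty}^{\infty} \mathcal{N}(x_1, x_2 \mid \boldsymbol{\mu}, \Sigma)\, dx_2 = \mathcal{N}(x_1 \mid \mu_1, \sigma_1^2)$$
Here $\mu_1$ and $\sigma_1^2$ are the mean and variance of $x_1$, that is, the first component of $\boldsymbol{\mu}$ and the first diagonal element of $\Sigma$.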
The details of this integration are given as an exercise (exercise 3 in the Exercises section of this chapter).
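Before working through the integral by hand, the result can be checked numerically. The Python sketch below integrates a two-dimensional normal density over $x_2$ with the trapezoidal rule and compares the result with the one-dimensional normal density of $x_1$. The parameter values, the function names, and the integration settings (`half_width`, `n`) are assumptions made for this illustration only. The final lines contrast the marginal with the conditional distribution of $x_1$ given a fixed $x_2$, using the standard conditional mean and variance of a bivariate normal.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a one-dimensional normal with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def bivariate_normal_pdf(x1, x2, mu1, mu2, var1, var2, cov):
    """Density of a two-dimensional normal with covariance [[var1, cov], [cov, var2]]."""
    det = var1 * var2 - cov ** 2
    d1, d2 = x1 - mu1, x2 - mu2
    # Quadratic form d^T Sigma^{-1} d, written out for the 2x2 case.
    q = (var2 * d1 * d1 - 2.0 * cov * d1 * d2 + var1 * d2 * d2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def marginal_x1(x1, params, half_width=10.0, n=4000):
    """Integrate the joint density over x2 using the trapezoidal rule."""
    mu1, mu2, var1, var2, cov = params
    sd2 = math.sqrt(var2)
    lo, hi = mu2 - half_width * sd2, mu2 + half_width * sd2
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * bivariate_normal_pdf(x1, lo + i * h, *params)
    return total * h


# Hypothetical parameters, in the order mu1, mu2, var1, var2, cov.
params = (1.0, -2.0, 4.0, 9.0, 3.0)
mu1, mu2, var1, var2, cov = params

# Marginalization: x2 is integrated out, leaving N(x1 | mu1, var1).
for x1 in (-1.0, 1.0, 3.5):
    numeric = marginal_x1(x1, params)
    closed_form = normal_pdf(x1, mu1, var1)
    print(f"x1={x1:5.1f}  numeric={numeric:.8f}  closed form={closed_form:.8f}")

# Conditioning: x2 is fixed at a value instead of being averaged over,
# which shifts the mean of x1 and shrinks its variance.
x2_fixed = 1.0
cond_mean = mu1 + cov / var2 * (x2_fixed - mu2)
cond_var = var1 - cov ** 2 / var2
print("marginal of x1:     mean", mu1, "variance", var1)
print("x1 given x2 = 1.0:  mean", cond_mean, "variance", cond_var)
```

The marginal keeps the original mean and variance of $x_1$ regardless of the correlation, whereas the conditional distribution depends on both the fixed value of $x_2$ and the covariance.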