Bayesian inference for the applied statistician (Donald Rubin, 1984)

First, consider the expression “Bayesian inference.” By this I simply mean the method of statistical inference that draws conclusions by calculating conditional distributions of unknown quantities given (a) known quantities and (b) model specifications. Thus, in Bayesian inference, known quantities are treated as observed values of random variables and unknown quantities are treated as unobserved random variables; the conditional distribution of unknowns given knowns follows from applying Bayes’ theorem to the model specifying the joint distribution of known and unknown quantities.
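As a minimal sketch of this definition, consider one unknown quantity (a success probability, here called theta) and one known quantity (7 successes in 10 trials). The binomial model, the uniform prior, the grid of candidate values, and the name grid_posterior are all assumptions made for illustration; none of them comes from the text above.

```python
from math import comb

def grid_posterior(successes, trials, grid, prior):
    """Conditional distribution of theta on a grid, given binomial data."""
    # Joint distribution of known and unknown quantities:
    # prior(theta) * P(data | theta), evaluated at each grid value.
    joint = [
        p * comb(trials, successes) * t**successes * (1 - t) ** (trials - successes)
        for t, p in zip(grid, prior)
    ]
    # Marginal probability of the known quantities (the normalizing constant).
    evidence = sum(joint)
    # Bayes' theorem: condition on the knowns by dividing the joint by the evidence.
    return [j / evidence for j in joint]

# Illustrative setup: 101 equally spaced candidate values and a uniform prior.
grid = [i / 100 for i in range(101)]
uniform_prior = [1 / len(grid)] * len(grid)

posterior = grid_posterior(7, 10, grid, uniform_prior)
posterior_mean = sum(t * w for t, w in zip(grid, posterior))
```

The list posterior is the conditional distribution of the unknown given the known under this one model specification; any summary, such as posterior_mean, is computed from it.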

One important point in this last statement is that the plural form of “specifications” is intentional. If more than one model is being entertained, then more than one Bayesian inference is being entertained. For the applied Bayesian statistician, there is no need to arrive at one Bayesian inference, although such a goal may often be desirable.
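To make the plural concrete, here is a small sketch in which the same data are analyzed under two model specifications, each giving its own inference. The conjugate Beta-binomial setup, the two priors, the data (7 successes in 10 trials), and the function name are hypothetical choices for illustration only.

```python
def beta_binomial_posterior(a, b, successes, trials):
    """Posterior Beta parameters and mean under a Beta(a, b) prior."""
    a_post = a + successes
    b_post = b + trials - successes
    return a_post, b_post, a_post / (a_post + b_post)

# Two model specifications that differ only in the prior distribution.
specifications = {
    "uniform prior": (1, 1),
    "prior concentrated near 0.5": (10, 10),
}

# One Bayesian inference per specification, all from the same known data.
inferences = {
    name: beta_binomial_posterior(a, b, successes=7, trials=10)
    for name, (a, b) in specifications.items()
}
```

Nothing here forces a choice between the two entries of inferences; they can be reported side by side, and how far they disagree shows how sensitive the conclusion is to the specification.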

Another important point for the applied Bayesian statistician concerns what is meant by “known.” In many practical problems, the number of characteristics that might be known for analysis is enormous (e.g., addresses, names and family histories of medical patients). Although some purist Bayesian positions might assert that every characteristic that is observable at essentially no cost must be treated as known, the more realistic applied position must be that there are costs associated with building complex models. Consequently, “known” refers to values that are both available and considered worthwhile to include in model specifications.

Just as several specifications can be entertained by the applied Bayesian statistician, several definitions of what is known can be considered. Thus, for example, in a completely randomized experiment, initial analyses might assume no covariates are known, a second group of analyses might assume a few obviously relevant covariates are known, and subsequent analyses might assume several other less important covariates are also known. As more covariates are considered known, the model specifications become more complicated and more difficult to formulate, but have the potential payoffs that the inferences will be more precise and specific to subpopulations defined by the covariates regarded as known.
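The trade-off can be sketched with a simulated completely randomized experiment analyzed twice: once with no covariates regarded as known, and once with a single binary covariate regarded as known. Everything in this sketch is an assumption made for illustration: the simulated data, the effect sizes, the normal model with flat priors (under which the posterior standard deviation is approximated by the usual large-sample standard error), and the treatment of subpopulation shares as fixed.

```python
import random
from statistics import mean, variance

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical completely randomized experiment: 400 units, half treated.
n = 400
treat = [1] * (n // 2) + [0] * (n // 2)
random.shuffle(treat)
x = [random.randint(0, 1) for _ in range(n)]  # one binary covariate
y = [2.0 * t + 3.0 * c + random.gauss(0, 1) for t, c in zip(treat, x)]

def effect_and_sd(units):
    """Approximate posterior mean and sd of the treatment effect among `units`
    (normal outcomes, flat priors, large-sample approximation)."""
    y1 = [y[i] for i in units if treat[i] == 1]
    y0 = [y[i] for i in units if treat[i] == 0]
    est = mean(y1) - mean(y0)
    sd = (variance(y1) / len(y1) + variance(y0) / len(y0)) ** 0.5
    return est, sd

# Analysis 1: no covariates regarded as known.
est_plain, sd_plain = effect_and_sd(range(n))

# Analysis 2: the covariate is regarded as known. Analyze each subpopulation,
# then combine using the subpopulation shares as (fixed) weights.
strata = {c: [i for i in range(n) if x[i] == c] for c in (0, 1)}
by_stratum = {c: effect_and_sd(units) for c, units in strata.items()}
weights = {c: len(units) / n for c, units in strata.items()}
est_adj = sum(weights[c] * by_stratum[c][0] for c in strata)
sd_adj = sum((weights[c] * by_stratum[c][1]) ** 2 for c in strata) ** 0.5
```

The second analysis needs more modeling decisions (which covariate, how to combine subpopulations), but by_stratum gives subpopulation-specific inferences, and because the simulated covariate strongly predicts the outcome, sd_adj comes out smaller than sd_plain.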

Reference