○ This book was produced in English.
In the end, it seems that everything in statistics leads back to Bayes' theorem. Bayes' theorem is a method for calculating the probability, or the probability distribution, of a cause given an observed result. In this world everything proceeds from cause to result, and our perception follows that natural direction, so reasoning backward from observed results to their causes feels unintuitive. Bayes' theorem lets us tackle this backward problem by combining the statistical properties of causes with the probabilistic relationship that runs from cause to result. The world seems to be full of results, while the causes often remain matters of speculation. For instance, we try to infer people's mental states from their behaviors, or we use the past to predict the present and the future. In more complex domains, we estimate chromosomal abnormalities from gene expression data or infer the existence of subatomic particles from particle physics experiments. And yet knowing the cause is essential for making better decisions and for uncovering further causal relationships, which is why we are always curious about causes. A deep understanding of Bayes' theorem is therefore, I believe, essential for navigating this complex and unpredictable world.
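Written out, with H standing for a hypothesized cause and D for an observed result (symbols chosen here only for illustration, since the book may use its own notation), the theorem reads:

\[
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
\]

The left-hand side is the probability of the cause given the result, and the right-hand side rebuilds it from the forward direction: the probability of the result given the cause, weighted by the prior probability of the cause.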
Bayes' theorem, despite its significance, has a surprisingly simple mathematical form. Once you understand the basic concepts of statistics, grasping it is not difficult. However, it is one of those ideas that reveals deeper meaning the more you think about it. For example, what does the statistical property, or probability distribution, of a cause even mean? That in itself is not always clear. Some call it the essential nature of an object, as when a fair coin is said to have a probability of 1/2 of landing heads or tails; this is the view of the frequentist school. But in reality we rarely observe such perfectly balanced probabilities. Given the physical irregularities and asymmetries of real coins, it is safe to say the probability is not exactly 1/2. Still, we believe it is. This belief-driven interpretation of probability is the stance of the Bayesian school. So who is right? According to quantum mechanics, the very act of observation can affect the result of an experiment, so we may never know the truth. Even such a simple concept remains only partially understood.
Furthermore, if you look at the formula, there is a kind of symmetry between the direction from cause to result and the direction from result to cause; the mathematics itself does not distinguish between the two. But the real world we live in clearly differentiates cause from result, and we tend to be more curious about causes. Just as entropy gives time a direction from past to future even though the underlying physical laws are time-symmetric, events in our world flow from cause to result even though Bayes' theorem is mathematically symmetric. This unresolved tension within Bayes' theorem continues to fascinate statisticians who study causal inference.
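To see that symmetry concretely: both conditional directions arise from factoring the same joint probability,

\[
P(H \mid D)\, P(D) = P(H, D) = P(D \mid H)\, P(H),
\]

and nothing in this identity distinguishes cause from result. That distinction is something we bring to the formula, not something the formula itself contains.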
Beyond these philosophical implications, Bayes' theorem is also practically important. When we lack prior knowledge or belief about the probability of a cause, we can rely only on the statistical properties running from cause to result to infer the cause from the result; in effect we assume a uniform prior distribution for the cause, and the procedure is known as Maximum Likelihood Estimation (MLE). When we do have meaningful prior information about the cause, we weigh that prior together with the likelihood of the result given the cause; this is called Maximum A Posteriori (MAP) estimation. When even the distribution of the result, the evidence, is intractable, we introduce a substitute known as a variational distribution and make it approximate the true posterior as closely as possible; in that setting Bayes' theorem is expressed through the Evidence Lower Bound (ELBO) and the Kullback-Leibler (KL) divergence. These concepts are central to information theory and machine learning. Other representative applications in advanced statistical theory include the uniform distribution of p-values under the null hypothesis in statistical testing, and next-generation AI systems based on logical reasoning that use categorical priors in place of statistical ones.
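To make these estimators concrete, here is their standard formulation, again with θ standing for the unknown cause (a parameter) and D for the observed data; the notation is illustrative rather than the book's own:

\[
\hat{\theta}_{\mathrm{MLE}} = \arg\max_{\theta}\, p(D \mid \theta),
\qquad
\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\, p(D \mid \theta)\, p(\theta).
\]

With a uniform prior p(θ), the two estimates coincide, which is the sense in which MLE is a special case of MAP. Likewise, when the evidence p(D) cannot be computed, the standard variational decomposition

\[
\log p(D) = \mathrm{ELBO}(q) + \mathrm{KL}\big(q(\theta)\,\|\,p(\theta \mid D)\big),
\qquad
\mathrm{ELBO}(q) = \mathbb{E}_{q(\theta)}\big[\log p(D, \theta) - \log q(\theta)\big],
\]

shows that maximizing the ELBO over the variational distribution q drives it toward the true posterior, since the KL divergence is never negative and log p(D) is fixed by the data.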
This book is divided into separate sections on the population and the sample, and deals with the fundamental concepts of statistics. Because the ideas are organized coherently, readers will be able to see how Bayes' theorem, despite its simple formula, appears in many contexts within statistics. The author hopes this will help readers take one step closer to the deep mysteries of the world we live in.