Bayes' Theorem

by aoum, Mar 16, 2025, 10:27 PM

Bayes' Theorem: Understanding Conditional Probability

Bayes' Theorem is a fundamental result in probability theory and statistics that describes how to update the probability of a hypothesis based on new evidence. It provides a mathematical framework for reasoning about uncertainty and is widely used in fields such as medicine, machine learning, and finance.

(Figure: a visual proof of Bayes' Theorem; image from Wikimedia Commons: https://upload.wikimedia.org/wikipedia/commons/thumb/c/c9/Bayes_theorem_visual_proof.svg/170px-Bayes_theorem_visual_proof.svg.png)

1. What is Bayes' Theorem?

Bayes' Theorem relates the conditional probability of event $A$ given $B$ to the conditional probability of $B$ given $A$. Mathematically, it is expressed as:

\[
P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)},
\]
where:
  • $P(A \mid B)$ = Probability of event $A$ occurring given that $B$ has occurred (posterior probability).
  • $P(B \mid A)$ = Probability of event $B$ occurring given that $A$ is true (likelihood).
  • $P(A)$ = Prior probability of event $A$ (before considering evidence).
  • $P(B)$ = Total probability of event $B$ (marginal likelihood).

Bayes' Theorem allows us to reverse conditional probabilities and update our beliefs when new information becomes available.
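To make the pieces concrete, here is a minimal Python sketch of the formula; the function name and the sample numbers are illustrative, not from any particular application:

```python
def posterior(prior, likelihood, marginal):
    # P(A|B) = P(B|A) * P(A) / P(B)
    return likelihood * prior / marginal

# Illustrative numbers: P(A) = 0.3, P(B|A) = 0.5, P(B) = 0.4
print(posterior(prior=0.3, likelihood=0.5, marginal=0.4))  # 0.375
```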

2. Understanding the Intuition Behind Bayes' Theorem

Suppose we want to know the probability that a patient has a particular disease given a positive test result. Bayes' Theorem helps us calculate this by combining:
  • How accurate the test is (likelihood).
  • The prevalence of the disease (prior probability).
  • The overall probability of a positive test (marginal probability).

By updating our prior knowledge with new evidence, we refine our estimate of the true probability.

3. Derivation of Bayes' Theorem

From the definition of conditional probability:

\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B \mid A) = \frac{P(A \cap B)}{P(A)}.
\]
Rearranging the second equation gives $P(A \cap B) = P(B \mid A)\,P(A)$. Substituting this into the first equation yields:

\[
P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}.
\]
To compute $P(B)$ (the denominator), we use the Law of Total Probability:

\[
P(B) = P(B \mid A) P(A) + P(B \mid \neg A) P(\neg A),
\]
where $\neg A$ represents the complement of $A$.
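A quick numerical check of this derivation, with arbitrary illustrative probabilities (the variable names are mine):

```python
p_a = 0.2               # P(A)
p_b_given_a = 0.7       # P(B | A)
p_b_given_not_a = 0.1   # P(B | not A)

# Law of Total Probability gives the denominator P(B):
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)   # 0.22

# Bayes' Theorem:
p_a_given_b = p_b_given_a * p_a / p_b                   # ~0.636

# Both conditionals must recover the same joint probability P(A and B):
assert abs(p_a_given_b * p_b - p_b_given_a * p_a) < 1e-12
```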

4. An Example: Medical Testing

A diagnostic test for a rare disease has a $99\%$ true positive rate and a $1\%$ false positive rate. If $0.1\%$ of the population has the disease, what is the probability that a randomly chosen person who tests positive actually has the disease?

Let:
  • $A =$ "Has disease"
  • $B =$ "Tests positive"
  • $P(A) = 0.001$ (prevalence of disease)
  • $P(B \mid A) = 0.99$ (true positive rate)
  • $P(B \mid \neg A) = 0.01$ (false positive rate)

By the Law of Total Probability:

\[
P(B) = P(B \mid A)P(A) + P(B \mid \neg A)P(\neg A),
\]
\[
P(B) = (0.99)(0.001) + (0.01)(0.999) = 0.00099 + 0.00999 = 0.01098.
\]
Now apply Bayes' Theorem:

\[
P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)} = \frac{(0.99)(0.001)}{0.01098} \approx 0.0902.
\]
Even with a positive test result, the actual probability of having the disease is only about $9.02\%$ due to the low prevalence of the disease.
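The same computation in Python, for readers who want to experiment with other prevalence or accuracy values (variable names are mine):

```python
p_disease = 0.001             # P(A): prevalence
p_pos_given_disease = 0.99    # P(B | A): true positive rate
p_pos_given_healthy = 0.01    # P(B | not A): false positive rate

p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))        # 0.01098
print(p_pos_given_disease * p_disease / p_pos)           # ~0.0902
```

Raising the prevalence to $1\%$ pushes the posterior to exactly $50\%$, which shows how strongly the prior drives the result.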

5. General Form of Bayes' Theorem for Multiple Hypotheses

If we have a set of mutually exclusive and exhaustive hypotheses $H_1, H_2, \dots, H_n$, Bayes' Theorem generalizes to:

\[
P(H_i \mid B) = \frac{P(B \mid H_i) P(H_i)}{\sum_{j=1}^{n} P(B \mid H_j) P(H_j)}.
\]
This form is useful in applications like machine learning and Bayesian inference, where multiple outcomes are possible.
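A short sketch of this normalization over several hypotheses, with made-up priors and likelihoods:

```python
priors = [0.5, 0.3, 0.2]        # P(H_i); mutually exclusive, exhaustive
likelihoods = [0.9, 0.5, 0.1]   # P(B | H_i)

# The denominator is the total probability of B over all hypotheses:
evidence = sum(l * p for l, p in zip(likelihoods, priors))      # 0.62

posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]
print(posteriors)   # [~0.726, ~0.242, ~0.032]; sums to 1
```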

6. Applications of Bayes' Theorem

Bayes' Theorem has broad applications across disciplines:
  • Medical Diagnosis: Updating disease probabilities based on test results.
  • Spam Filtering: Classifying emails as spam or not spam based on word frequency.
  • Machine Learning: Bayesian classifiers estimate probabilities from data.
  • Forensic Science: Evaluating the likelihood of evidence supporting guilt or innocence.
  • Finance: Updating risk assessments based on new market data.

7. Bayesian Inference

Bayesian inference applies Bayes' Theorem to statistical problems. Given observed data $D$, we update our belief about a model parameter $\theta$:

\[
P(\theta \mid D) = \frac{P(D \mid \theta) P(\theta)}{P(D)},
\]
where:
  • $P(\theta)$ is the prior (belief before data).
  • $P(D \mid \theta)$ is the likelihood (how likely data is under the model).
  • $P(D)$ is the evidence (total probability of the data).
  • $P(\theta \mid D)$ is the posterior (updated belief after data).
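Here is a minimal, self-contained sketch of Bayesian inference for a coin's bias $\theta$, using a uniform prior over a grid of candidate values (the grid, prior, and data are all assumptions for illustration):

```python
thetas = [i / 100 for i in range(101)]    # candidate values of theta
prior = [1 / len(thetas)] * len(thetas)   # uniform prior P(theta)

heads, tails = 7, 3                       # observed data D

# Likelihood P(D | theta) for each candidate theta:
likelihood = [t**heads * (1 - t)**tails for t in thetas]

# Evidence P(D), then the posterior P(theta | D):
evidence = sum(l * p for l, p in zip(likelihood, prior))
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]

# Posterior mean of theta; roughly 0.667 for 7 heads in 10 flips.
print(sum(t * p for t, p in zip(thetas, posterior)))
```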

8. Common Misconceptions About Bayes' Theorem
  • Confusing $P(A \mid B)$ with $P(B \mid A)$: In general the two are equal only when $P(A) = P(B)$.
  • Ignoring Base Rates: Failing to account for the prior $P(A)$ can lead to badly incorrect conclusions (the base-rate fallacy).
  • Overestimating Rare Event Probabilities: Even with a highly accurate test, the posterior probability of a rare event can remain low, as the medical example above shows.

9. Example Problem Using Bayesian Inference

Suppose $5\%$ of students cheat on exams. A new detection system catches cheaters $95\%$ of the time but also falsely accuses innocent students $10\%$ of the time. If a student is flagged, what is the probability they actually cheated?

Let:
  • $A =$ "Student cheats" ($P(A) = 0.05$)
  • $B =$ "Flagged by system"
  • $P(B \mid A) = 0.95$ (true positive rate)
  • $P(B \mid \neg A) = 0.10$ (false positive rate)

By the Law of Total Probability:

\[
P(B) = P(B \mid A)P(A) + P(B \mid \neg A)P(\neg A),
\]
\[
P(B) = (0.95)(0.05) + (0.10)(0.95) = 0.0475 + 0.095 = 0.1425.
\]
Using Bayes' Theorem:

\[
P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)} = \frac{(0.95)(0.05)}{0.1425} \approx 0.333.
\]
If a student is flagged, there is only about a $33.3\%$ chance (exactly $1/3$) that they actually cheated.
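This example follows the same pattern as the medical test, so the check is short (variable names again are mine):

```python
p_cheat = 0.05
p_flag_given_cheat = 0.95
p_flag_given_honest = 0.10

p_flag = p_flag_given_cheat * p_cheat + p_flag_given_honest * (1 - p_cheat)
print(p_flag_given_cheat * p_cheat / p_flag)   # 0.3333... (exactly 1/3)
```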

10. Conclusion

Bayes' Theorem is a powerful tool for updating probabilities when new information arises. It forms the foundation of Bayesian statistics and has applications in diverse fields, from medicine to artificial intelligence.
