STAT1003 – Statistical Techniques
Dr. Emi Tanaka
Australian National University
Example spam SMS
Last chance 2 claim ur £150 worth of discount vouchers-Text YES to 85023 now!SavaMob-member offers mobile T Cs 08717898035. £3.00 Sub. 16 . Remove txt X or STOP
Example SMS (not spam)
I dun believe u. I thk u told him.
| Prediction | Truth: ham | Truth: spam | Total |
|---|---|---|---|
| ham | 2,280 | 368 | 2,648 |
| spam | 1,843 | 282 | 2,125 |
| Total | 4,123 | 650 | 4,773 |
| Prediction | Truth: ham | Truth: spam | Total |
|---|---|---|---|
| ham | 0.861 | 0.139 | 1.000 |
| spam | 0.867 | 0.133 | 1.000 |
| Prediction | Truth: ham | Truth: spam |
|---|---|---|
| ham | 0.553 | 0.566 |
| spam | 0.447 | 0.434 |
| Total | 1.000 | 1.000 |
| Prediction | Truth: ham | Truth: spam | Total |
|---|---|---|---|
| ham | 0.478 | 0.077 | 0.555 |
| spam | 0.386 | 0.059 | 0.445 |
| Total | 0.864 | 0.136 | 1.000 |
A joint probability of events \(A\) and \(B\) is the probability that both events occur together, denoted by \(P(A \cap B)\).
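As a minimal sketch, the joint probabilities in the table can be reproduced by dividing each cell count by the grand total. The dictionary layout and variable names below are illustrative choices, not part of the slides.

```python
# Counts from the SMS table; keys are (prediction, truth).
counts = {
    ("ham", "ham"): 2280,
    ("ham", "spam"): 368,
    ("spam", "ham"): 1843,
    ("spam", "spam"): 282,
}

total = sum(counts.values())  # 4,773 messages in total

# Joint probability of each (prediction, truth) pair.
joint = {pair: n / total for pair, n in counts.items()}

for (pred, truth), p in joint.items():
    print(f"P(predicted {pred} and is {truth}) = {p:.3f}")
```

The four printed values should match the inner cells of the joint probability table after rounding to three decimals.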
A marginal probability is the probability of a single event occurring, regardless of the outcomes of other variables.
\[P(\underbrace{\text{predicted as ham}}_{\large A})=P(A \cap \underbrace{\text{is a ham}}_{\large B})+P(A \cap \underbrace{\text{is a spam}}_{\large B^c})\]
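Suppose that the events \(B_{1}, B_{2}, \ldots, B_{n}\) are a partition of the sample space. That is, the events are mutually exclusive (\(B_{i} \cap B_{j} = \emptyset\) for \(i \ne j\)) and exhaustive (\(B_{1} \cup B_{2} \cup \cdots \cup B_{n}\) is the whole sample space).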
Then for any event \(A\), the following is true: \[P(A)=\sum_{i=1}^{n} P\left(A \cap B_{i}\right).\]
This is referred to as the law of total probability.
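The law can be checked numerically on the SMS counts. This is a sketch only: it takes \(A\) to be "predicted as ham" and uses the two-event partition "is ham" and "is spam"; the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Counts from the SMS table; keys are (prediction, truth).
counts = {
    ("ham", "ham"): 2280,
    ("ham", "spam"): 368,
    ("spam", "ham"): 1843,
    ("spam", "spam"): 282,
}
total = sum(counts.values())

# Joint probabilities kept as exact fractions to avoid rounding error.
joint = {pair: Fraction(n, total) for pair, n in counts.items()}

# A = "predicted as ham"; the partition is B = "is ham" and B^c = "is spam".
p_pred_ham = joint[("ham", "ham")] + joint[("ham", "spam")]

print(round(float(p_pred_ham), 3))  # 0.555
```

The result agrees with the row total for "ham" in the joint probability table.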
For two events \(A\) and \(B\) with \(P(B) > 0\), the conditional probability of \(A\) given that \(B\) has occurred, denoted by \(P(A \mid B)\), is defined to be:
\[P(A \mid B)=\frac{P(A \cap B)}{P(B)}\]
\[P(\text{Truth} \mid \text{Prediction})\]
| Prediction | Truth: ham | Truth: spam | Total |
|---|---|---|---|
| ham | 0.861 | 0.139 | 1.000 |
| spam | 0.867 | 0.133 | 1.000 |
\[P(\text{Prediction} \mid \text{Truth})\]
| Prediction | Truth: ham | Truth: spam |
|---|---|---|
| ham | 0.553 | 0.566 |
| spam | 0.447 | 0.434 |
| Total | 1.000 | 1.000 |
Note that \(P(A \mid B) \ne P(B \mid A)\) in general.
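To make the asymmetry concrete, the sketch below computes both conditional probabilities for the "spam, spam" cell of the SMS counts. The helper names `p_truth_given_pred` and `p_pred_given_truth` are hypothetical, introduced only for this illustration.

```python
# Counts from the SMS table; keys are (prediction, truth).
counts = {
    ("ham", "ham"): 2280,
    ("ham", "spam"): 368,
    ("spam", "ham"): 1843,
    ("spam", "spam"): 282,
}

def p_truth_given_pred(truth, pred):
    """P(Truth = truth | Prediction = pred): divide by the row total."""
    row_total = sum(n for (p, _), n in counts.items() if p == pred)
    return counts[(pred, truth)] / row_total

def p_pred_given_truth(pred, truth):
    """P(Prediction = pred | Truth = truth): divide by the column total."""
    col_total = sum(n for (_, t), n in counts.items() if t == truth)
    return counts[(pred, truth)] / col_total

print(round(p_truth_given_pred("spam", "spam"), 3))  # 282 / 2125 -> 0.133
print(round(p_pred_given_truth("spam", "spam"), 3))  # 282 / 650  -> 0.434
```

The same cell count is divided by a different total in each case, which is why the two conditional probabilities differ.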
If an SMS is predicted to be spam, what is the probability that the prediction is correct?
You are rolling a die. You are told you rolled an even number.
| Prediction | Truth: ham | Truth: spam | Total |
|---|---|---|---|
| ham | 0.478 | 0.077 | 0.555 |
| spam | 0.386 | 0.059 | 0.445 |
| Total | 0.864 | 0.136 | 1.000 |
Solution:
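One possible working, assuming the solution refers to the SMS question (the probability that a message predicted as spam really is spam) and using the counts from the SMS table rather than the rounded proportions:

\[P(\text{is spam} \mid \text{predicted spam}) = \frac{P(\text{predicted spam} \cap \text{is spam})}{P(\text{predicted spam})} = \frac{282/4773}{2125/4773} = \frac{282}{2125} \approx 0.133.\]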
Two events \(A\) and \(B\) are independent if and only if:
\[P(A \cap B)=P(A) \times P(B).\]
Notice that if \(A\) and \(B\) are independent (and \(P(A) > 0\), \(P(B) > 0\)): \[\begin{aligned} P(A \mid B)&=\frac{P(A \cap B)}{P(B)}=\frac{P(A) \times P(B)}{P(B)}=P(A), \\ P(B \mid A)&=\frac{P(B \cap A)}{P(A)}=\frac{P(B) \times P(A)}{P(A)}=P(B). \end{aligned}\]
That is, two events are independent if the probability of one event occurring is not affected by the occurrence of the other event.
Consider rolling two dice. What is the probability of getting two 1s?
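A short sketch of one way to check the answer, assuming two fair six-sided dice rolled independently (the slides do not state fairness explicitly). It enumerates all outcomes and compares the result with the product rule for independent events.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two fair six-sided dice (assumed fair).
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes where both dice show a 1.
favourable = [o for o in outcomes if o == (1, 1)]
p_two_ones = Fraction(len(favourable), len(outcomes))

# Independence gives the same answer: P(1 on first die) * P(1 on second die).
p_product = Fraction(1, 6) * Fraction(1, 6)

print(p_two_ones, p_product)  # 1/36 1/36
```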
If we shuffle up a deck of cards and draw one, is the event that the card is a heart independent of the event that the card is an ace?
A streaming service reports that 25% of its users watch documentaries and 60% of its users watch movies. If 15% of users watch both documentaries and movies, are the events “a user watches documentaries” and “a user watches movies” independent?
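Both questions can be checked against the definition \(P(A \cap B)=P(A) \times P(B)\). The sketch below assumes a standard 52-card deck (13 hearts, 4 aces, 1 ace of hearts); the function name `is_independent` is a hypothetical helper, and exact fractions are used so the equality test is not affected by floating-point rounding.

```python
from fractions import Fraction

def is_independent(p_a, p_b, p_a_and_b):
    """Return True when P(A and B) equals P(A) * P(B) exactly."""
    return p_a_and_b == p_a * p_b

# Cards: A = "heart", B = "ace", A and B = "ace of hearts".
hearts_and_aces = is_independent(
    Fraction(13, 52), Fraction(4, 52), Fraction(1, 52)
)

# Streaming: A = "watches documentaries", B = "watches movies".
docs_and_movies = is_independent(
    Fraction(25, 100), Fraction(60, 100), Fraction(15, 100)
)

print(hearts_and_aces, docs_and_movies)  # True True
```

Under these assumptions both pairs of events satisfy the definition, since \(\tfrac{13}{52} \times \tfrac{4}{52} = \tfrac{1}{52}\) and \(0.25 \times 0.60 = 0.15\).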
For two events \(A\) and \(B\), the multiplication rule states that \[\begin{align*} P(A\cap B) &= P(A\mid B)\times P(B)\\ &= P(B\mid A)\times P(A) \end{align*}\]
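As a quick numerical check of the rule, the sketch below uses the SMS counts with \(A\) = "predicted as spam" and \(B\) = "is spam". The choice of events and the variable names are illustrative assumptions.

```python
from fractions import Fraction

n_total = 4773      # all messages
n_pred_spam = 2125  # A: predicted as spam
n_is_spam = 650     # B: is spam
n_both = 282        # A and B: predicted as spam and is spam

p_a = Fraction(n_pred_spam, n_total)
p_b = Fraction(n_is_spam, n_total)
p_a_given_b = Fraction(n_both, n_is_spam)
p_b_given_a = Fraction(n_both, n_pred_spam)

# Both factorisations should recover the joint probability P(A and B).
via_b = p_a_given_b * p_b
via_a = p_b_given_a * p_a

print(via_a == via_b == Fraction(n_both, n_total))  # True
```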
| Result | Inoculated: no | Inoculated: yes | Total |
|---|---|---|---|
| died | 844 | 6 | 850 |
| lived | 5136 | 238 | 5374 |
| Total | 5980 | 244 | 6224 |
What is the probability that a randomly selected person who was not inoculated died from smallpox?
\[P(\text{died}\mid \text{not inoculated}) = \frac{P(\text{died}\cap \text{not inoculated})}{P(\text{not inoculated})} = \frac{844/6224}{5980/6224} = 0.1411.\]
What is the probability that an inoculated person died from smallpox?
\[P(\text{died}\mid \text{inoculated}) = \frac{P(\text{died}\cap \text{inoculated})}{P(\text{inoculated})} = \frac{6/6224}{244/6224} = 0.0246.\]
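The two calculations above can be reproduced directly from the counts in the smallpox table. This is a sketch; the helper name `p_result_given_inoculated` is hypothetical.

```python
# Counts from the smallpox table; keys are (result, inoculated).
counts = {
    ("died", "no"): 844,
    ("died", "yes"): 6,
    ("lived", "no"): 5136,
    ("lived", "yes"): 238,
}

def p_result_given_inoculated(result, inoculated):
    """P(Result = result | Inoculated = inoculated): divide by the column total."""
    column_total = sum(n for (_, i), n in counts.items() if i == inoculated)
    return counts[(result, inoculated)] / column_total

print(round(p_result_given_inoculated("died", "no"), 4))   # 0.1411
print(round(p_result_given_inoculated("died", "yes"), 4))  # 0.0246
```

The grand total of 6224 cancels in each ratio, so only the column totals (5980 and 244) are needed.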
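Let \(A_{1}, \ldots, A_{n}\) represent a set of disjoint and exhaustive events, i.e., \(A_{i} \cap A_{j} = \emptyset\) for \(i \ne j\) and \(A_{1} \cup \cdots \cup A_{n}\) is the whole sample space.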
Then for event \(B\),
\[ P\left(A_{1} \mid B\right)+\cdots+P\left(A_{n} \mid B\right) = 1 \]
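For instance, in the smallpox data "died" and "lived" are disjoint and exhaustive, so their conditional probabilities given "inoculated" should sum to one. A minimal check, with variable names chosen for illustration:

```python
from fractions import Fraction

n_inoculated = 244  # column total for "Inoculated: yes"

p_died_given_inoculated = Fraction(6, n_inoculated)
p_lived_given_inoculated = Fraction(238, n_inoculated)

print(p_died_given_inoculated + p_lived_given_inoculated)  # 1
```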
For two events \(A\) and \(B\) with \(P(B) > 0\), Bayes’ theorem states \[P(A\mid B) = \frac{P(B\mid A)P(A)}{P(B)}.\]
\[\begin{align*} P(\text{User}\mid \text{Positive Test}) &= \frac{P(\text{User}\cap \text{Positive Test})}{P(\text{Positive Test})}\\ &= \frac{P(\text{Positive Test}\mid \text{User})P(\text{User})}{P(\text{Positive Test}\mid \text{User})P(\text{User}) + P(\text{Positive Test}\mid \text{User}^c)P(\text{User}^c)}\\ &= \frac{0.95\times 0.03}{0.95\times 0.03+(1-0.9)\times (1-0.03)}\\ &= 0.2271 \end{align*}\]
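The same calculation can be sketched as a small function. The parameter names `prior`, `sensitivity` and `specificity` are labels assumed here for \(P(\text{User})\), \(P(\text{Positive Test}\mid \text{User})\) and \(1 - P(\text{Positive Test}\mid \text{User}^c)\); the slides give only the numbers 0.03, 0.95 and 0.9.

```python
def posterior(prior, sensitivity, specificity):
    """P(User | Positive Test) via Bayes' theorem with a two-event partition."""
    # Numerator: P(Positive Test | User) * P(User).
    true_positive = sensitivity * prior
    # P(Positive Test | not User) * P(not User).
    false_positive = (1 - specificity) * (1 - prior)
    # Denominator is P(Positive Test) by the law of total probability.
    return true_positive / (true_positive + false_positive)

print(round(posterior(prior=0.03, sensitivity=0.95, specificity=0.9), 4))  # 0.2271
```

Changing `prior` while holding the other two inputs fixed shows how strongly the answer depends on how common users are in the population.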
