Probability Models

Klinkenberg

University of Amsterdam

2024-02-12

Exact approach

Coin values

Let's start simple and toss a fair coin only 2 times, assigning 1 for heads and 0 for tails.

The coin can only take two values: 0 (tails) or 1 (heads).

Permutation

If we throw 2 times we have the following possible outcomes.

  Toss1 Toss2
1     0     0
2     1     0
3     0     1
4     1     1
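The table of possible outcomes can be generated rather than typed out. A minimal sketch, assuming base R's `expand.grid()` and the column names `Toss1` and `Toss2` used in the table:

```r
# Enumerate every possible outcome of 2 tosses (0 = tails, 1 = heads)
outcomes <- expand.grid(Toss1 = 0:1, Toss2 = 0:1)
outcomes

# The number of outcomes is 2^2 = 4
nrow(outcomes)
```

`expand.grid()` varies the first column fastest, which gives the same row order as the table above.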

Number of heads

With frequency of heads being

  Toss1 Toss2 frequency
1     0     0         0
2     1     0         1
3     0     1         1
4     1     1         2

Probabilities

For each coin toss, regardless of the outcome (heads or tails), the probability of that outcome is .5.

  Toss1 Toss2
1   0.5   0.5
2   0.5   0.5
3   0.5   0.5
4   0.5   0.5

So for each outcome we can specify the total probability by applying the product rule (i.e. multiplying the probabilities of the independent tosses).

  Toss1 Toss2 probability
1   0.5   0.5        0.25
2   0.5   0.5        0.25
3   0.5   0.5        0.25
4   0.5   0.5        0.25

Which is the same for all outcomes.
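The product rule for the four outcomes can be sketched in a few lines. The matrix `probs` is a hypothetical helper that mirrors the table of per-toss probabilities; it is not part of the original slides.

```r
p <- 0.5  # probability of either side of a fair coin

# One row per outcome, one column per toss
probs <- matrix(p, nrow = 4, ncol = 2,
                dimnames = list(NULL, c("Toss1", "Toss2")))

# Product rule: multiply the probabilities within each row
probability <- apply(probs, 1, prod)
probability       # 0.25 for every outcome
sum(probability)  # the four outcomes together have probability 1
```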

Discrete probabilities

Some numbers of heads, however, occur more often than others. Throwing 0 heads occurs in only one outcome and hence has a probability of .25. But throwing 1 head can occur in two outcomes, so for this case we add up the probabilities: .25 + .25 = .5.

  Toss1 Toss2 frequency probability
1     0     0         0        0.25
2     1     0         1        0.25
3     0     1         1        0.25
4     1     1         2        0.25
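Adding up the probabilities per number of heads gives the discrete probability distribution. A small sketch, assuming the same 0/1 coding as above; `p.heads` is an illustrative name:

```r
# All outcomes of 2 tosses, with the number of heads per outcome
outcomes <- expand.grid(Toss1 = 0:1, Toss2 = 0:1)
outcomes$frequency   <- rowSums(outcomes)
outcomes$probability <- 0.5^2

# Sum the outcome probabilities within each number of heads
p.heads <- tapply(outcomes$probability, outcomes$frequency, sum)
p.heads  # 0 heads: .25, 1 head: .5, 2 heads: .25
```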

Frequency and probability distribution

10 tosses

Toss1 Toss2 Toss3 Toss4 Toss5 Toss6 Toss7 Toss8 Toss9 Toss10
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5

Toss1 Toss2 Toss3 Toss4 Toss5 Toss6 Toss7 Toss8 Toss9 Toss10 probability
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5   0.0009766
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5   0.0009766
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5   0.0009766
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5   0.0009766
  0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5   0.5    0.5   0.0009766

#Heads frequencies Probabilities
     0           1     0.0009766
     1          10     0.0097656
     2          45     0.0439453
     3         120     0.1171875
     4         210     0.2050781
     5         252     0.2460938
     6         210     0.2050781
     7         120     0.1171875
     8          45     0.0439453
     9          10     0.0097656
    10           1     0.0009766
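The frequencies for 10 tosses can be reproduced by brute force: enumerate all 2^10 outcomes and count the heads in each. This is a sketch under the same 0/1 coding; the names `heads`, `frequencies` and `probabilities` are illustrative.

```r
n <- 10

# All 2^10 = 1024 equally likely outcomes, one row per outcome
outcomes <- expand.grid(rep(list(0:1), n))

# Number of heads per outcome, and how often each number occurs
heads         <- rowSums(outcomes)
frequencies   <- table(heads)
probabilities <- frequencies / nrow(outcomes)

frequencies
round(probabilities, 7)
```

Each outcome has probability 0.5^10, so dividing the frequencies by 1024 is the same as adding up the probabilities of the outcomes with that number of heads.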

Binomial distribution

Calculate binomial probabilities

\[ {n\choose k}p^k(1-p)^{n-k}, \small {n\choose k} = \frac{n!}{k!(n-k)!} \]

n = 10   # Sample size
k = 0:10 # Discrete probability space
p = .5   # Probability of head

 n   k    p       n!       k!   (n-k)!  (n over k)        p^k  (1-p)^(n-k)  Binom Prob
10   0  0.5  3628800        1  3628800           1  1.0000000    0.0009766   0.0009766
10   1  0.5  3628800        1   362880          10  0.5000000    0.0019531   0.0097656
10   2  0.5  3628800        2    40320          45  0.2500000    0.0039063   0.0439453
10   3  0.5  3628800        6     5040         120  0.1250000    0.0078125   0.1171875
10   4  0.5  3628800       24      720         210  0.0625000    0.0156250   0.2050781
10   5  0.5  3628800      120      120         252  0.0312500    0.0312500   0.2460938
10   6  0.5  3628800      720       24         210  0.0156250    0.0625000   0.2050781
10   7  0.5  3628800     5040        6         120  0.0078125    0.1250000   0.1171875
10   8  0.5  3628800    40320        2          45  0.0039063    0.2500000   0.0439453
10   9  0.5  3628800   362880        1          10  0.0019531    0.5000000   0.0097656
10  10  0.5  3628800  3628800        1           1  0.0009766    1.0000000   0.0009766
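The columns of this table follow directly from the formula, and the result can be checked against R's built-in `dbinom()`. A minimal sketch; `by.hand` and `built.in` are illustrative names:

```r
n <- 10    # Sample size
k <- 0:10  # Discrete probability space
p <- .5    # Probability of heads

# Binomial formula, term by term
n.over.k <- factorial(n) / (factorial(k) * factorial(n - k))
by.hand  <- n.over.k * p^k * (1 - p)^(n - k)

# Same probabilities from the built-in density function
built.in <- dbinom(k, size = n, prob = p)

all.equal(by.hand, built.in)
sum(by.hand)  # probabilities over the whole space add up to 1
```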

Bootstrapping

Sampling from your sample to approximate the sampling distribution.

My Coin tosses

my.tosses = c(0,1,0,1,0,0,0,0,0,0)

Sample from the sample

Sampling with replacement

sample(my.tosses, replace = TRUE)
 [1] 0 0 1 1 0 1 0 0 0 0
sample(my.tosses, replace = TRUE)
 [1] 0 1 0 0 0 1 0 0 0 0
sample(my.tosses, replace = TRUE)
 [1] 0 0 0 0 0 0 1 0 0 0
sample(my.tosses, replace = TRUE)
 [1] 1 1 0 0 0 0 0 0 0 0

Sampling from the sample

n.samples = 1000
n.heads = vector()

for (i in 1:n.samples) {
  my.sample <- sample(my.tosses, replace = TRUE)
  
  n.heads[i] <- sum(my.sample) 
}
4 3 1 1 4 2 3 0 3 1 4 2 2 1 3 0 0 2 1 1 3 3 5 1 3 3 3 1 2 3 1 2 1 2 2 0 1 2 2 2
3 1 3 2 2 4 1 2 2 1 0 0 1 2 3 4 1 2 1 0 4 2 2 2 3 2 2 1 3 2 2 1 1 2 2 1 2 0 2 3
2 2 4 1 4 3 2 3 1 1 3 4 2 0 2 0 1 2 2 2 0 2 3 3 3 2 2 2 3 3 2 1 2 2 1 0 3 1 2 3
2 4 2 4 1 3 3 3 2 5 2 2 2 1 2 0 1 0 1 2 0 3 0 4 1 2 2 2 2 0 4 2 1 1 1 1 2 2 2 0
0 2 3 1 0 2 2 1 1 1 2 3 0 1 3 2 2 1 1 1 1 2 2 1 1 3 2 1 2 1 2 3 1 1 3 2 1 3 3 0
3 1 3 1 3 1 1 2 1 5 4 4 3 1 4 4 2 3 3 2 1 2 3 3 1 3 1 1 1 1 4 2 2 2 3 3 3 3 0 0
1 0 3 3 1 2 5 3 2 3 1 3 1 2 3 0 2 1 3 0 1 2 1 3 2 4 1 2 5 2 1 2 2 1 1 2 2 5 3 2
3 3 1 4 1 2 4 2 1 2 1 3 5 3 2 2 5 1 2 1 0 5 2 3 3 2 2 2 3 2 2 3 2 1 1 4 2 3 4 2
0 4 1 1 1 2 0 2 1 1 3 3 2 3 1 1 1 2 3 2 1 2 4 0 1 2 4 3 1 2 2 1 2 1 0 2 1 3 2 3
1 1 5 0 2 2 0 2 3 3 1 2 2 5 3 1 1 2 3 1 3 3 2 1 2 3 1 2 1 3 2 2 1 0 0 0 2 3 1 2
1 1 2 1 3 3 0 3 1 2 0 2 4 1 3 3 4 3 0 2 2 2 2 2 1 2 2 2 1 3 0 0 2 3 5 2 0 2 1 2
0 2 0 2 0 2 3 4 1 2 2 1 0 2 2 3 3 0 3 2 3 3 1 3 2 3 1 3 2 3 1 2 2 1 3 4 3 2 3 1
0 1 3 4 3 3 3 3 2 2 2 1 1 2 0 0 2 3 2 2 1 4 1 3 3 2 2 2 3 0 2 1 4 0 5 3 4 0 3 1
1 2 1 1 2 1 2 2 3 0 3 4 2 5 2 2 1 3 1 1 1 2 3 1 3 3 3 1 3 1 1 2 1 0 1 0 1 3 1 1
3 0 0 2 1 1 0 1 2 3 2 0 3 0 4 1 1 3 4 2 4 4 1 1 2 4 4 1 3 1 1 0 4 3 3 3 2 2 2 2
3 1 1 3 3 3 2 2 5 3 2 4 5 5 3 3 2 4 3 1 1 1 2 2 2 3 2 1 1 1 0 4 1 4 2 1 5 2 2 2
1 1 1 1 2 1 2 2 2 0 4 1 3 0 2 2 0 2 0 1 4 4 3 1 3 1 1 2 3 1 2 2 2 3 3 1 3 0 1 5
1 2 1 1 2 1 3 1 3 2 1 2 3 3 4 0 2 0 2 0 2 2 2 0 3 3 0 4 2 2 2 3 1 3 0 0 3 1 1 2
1 2 2 3 1 3 2 2 2 2 1 5 1 2 3 3 3 1 1 4 1 0 3 2 2 2 2 6 5 1 4 1 0 3 1 3 1 2 3 2
1 4 4 1 3 3 2 6 0 1 1 3 3 1 1 4 3 1 1 3 3 3 1 3 3 0 2 2 2 2 2 0 3 1 3 0 2 1 3 1
0 2 1 0 2 1 2 2 2 3 5 2 2 1 1 2 3 2 0 1 0 1 3 2 2 2 4 2 1 3 1 1 4 5 0 2 3 3 1 3
0 3 1 4 2 2 1 2 2 2 0 1 3 2 6 3 1 4 2 2 1 2 1 2 1 2 2 2 2 2 4 1 1 2 1 1 5 5 3 3
4 4 1 0 3 2 2 3 3 2 2 0 1 1 1 1 3 1 1 1 1 1 4 3 2 1 2 1 2 3 5 1 1 1 4 2 1 1 2 1
4 5 2 2 3 4 2 2 0 1 4 3 1 0 2 4 3 2 5 3 0 0 2 3 2 0 3 2 0 4 1 0 1 3 2 2 4 4 1 4
2 0 2 3 3 3 2 3 0 3 2 0 1 2 2 2 7 3 2 2 1 3 1 2 1 0 2 3 2 0 5 2 6 2 1 3 1 2 4 0

Frequencies

Frequencies for number of heads per sample.

       0   1   2   3  4  5 6 7 8 9 10
Freq 104 263 310 215 74 29 4 1 0 0  0
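A compact version of the resampling and counting steps is sketched below. The seed is an assumption added only for reproducibility, so the counts will differ from the table above. Comparing against a binomial with p equal to the proportion of heads in the sample is an added illustration, not part of the original slides.

```r
set.seed(1)  # assumption: any seed works, it only fixes the random draws

my.tosses <- c(0,1,0,1,0,0,0,0,0,0)
n.samples <- 1000

# Resample with replacement and count the heads, n.samples times
n.heads <- replicate(n.samples, sum(sample(my.tosses, replace = TRUE)))

# Frequencies, keeping the values 0..10 that never occurred
freq <- table(factor(n.heads, levels = 0:10))
freq

# Expected frequencies if heads ~ Binomial(10, proportion of heads in the sample)
expected <- n.samples * dbinom(0:10, size = length(my.tosses), prob = mean(my.tosses))
round(expected)
```

Wrapping `n.heads` in `factor(..., levels = 0:10)` keeps the zero counts in the table, which plain `table(n.heads)` would drop.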

Bootstrapped sampling distribution

Theoretical Approximations

Continuous probability distributions

For all continuous probability distributions:

  • Total area is always 1
  • The probability of one specific test statistic is 0
  • x-axis represents the test statistic
  • y-axis represents the probability density
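The first two properties can be checked numerically. The sketch below uses the standard normal density `dnorm()` as an example; that choice is an assumption, and any continuous density would do.

```r
# Total area under the density curve
area <- integrate(dnorm, lower = -Inf, upper = Inf)
area$value  # approximately 1

# Probability of an interval around 0 for ever smaller interval widths
widths     <- c(1, 0.1, 0.01, 0.001)
p.interval <- pnorm(widths / 2) - pnorm(-widths / 2)
cbind(widths, p.interval)  # shrinks towards 0 as the interval narrows to a point
```

Probabilities therefore belong to intervals of the test statistic, while the height of the curve at a single point is a density, not a probability.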

T-distribution

Gosset

William Sealy Gosset (aka Student) in 1908 (age 32)

In probability and statistics, Student’s t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown.

In the English-language literature it takes its name from William Sealy Gosset’s 1908 paper in Biometrika under the pseudonym “Student”. Gosset worked at the Guinness Brewery in Dublin, Ireland, and was interested in the problems of small samples, for example the chemical properties of barley where sample sizes might be as low as 3 (Wikipedia, 2024).

Population distribution

layout(matrix(c(2:6,1,1,7:8,1,1,9:13), 4, 4))

n  = 56    # Sample size
df = n - 1 # Degrees of freedom

mu    = 120
sigma = 15

IQ = seq(mu-45, mu+45, 1)

par(mar=c(4,2,2,0))  
plot(IQ, dnorm(IQ, mean = mu, sd = sigma), type='l', col="red", main = "Population Distribution")

n.samples = 12

for(i in 1:n.samples) {
  
  par(mar=c(2,2,2,0))  
  hist(rnorm(n, mu, sigma), main="Sample Distribution", cex.axis=.5, col="beige", cex.main = .75)
  
}

Population distribution

One sample

Let’s take one sample of size n = 56 from our normal population.

x = rnorm(n, mu, sigma); x
 [1] 107.23685 127.41511 114.49546 117.95965 114.28153 103.07176 111.25836
 [8] 113.32382 119.02879 106.59907 126.02548 148.09688 100.41463 115.35363
[15] 110.83483 119.10494 145.80933 112.99751 124.73989  96.95211 133.51865
[22] 115.11211 112.44197 133.23333 127.44063 117.51603 110.32868  97.04325
[29] 107.35497 144.51702 112.78471 111.58268 128.19846 103.32059 116.82780
[36] 105.43609 117.68510  99.12848 113.08409 120.31098  84.90971 128.78364
[43] 141.29826 113.82271  88.68238 132.32723 116.03553 124.13257 107.46536
[50] 122.29764 131.74950 115.54641  92.64111  95.68900 107.86958 102.73153

More samples

Let’s take more samples.

n.samples     = 1000
mean.x.values = vector()
sd.x.values   = vector()
se.x.values   = vector()

for(i in 1:n.samples) {
  x = rnorm(n, mu, sigma)
  mean.x.values[i] = mean(x)
  se.x.values[i]   = (sd(x) / sqrt(n))
  sd.x.values[i]   = sd(x)
}

Mean and SE for all samples

mean.x.values se.x.values
119.7035 2.155462
123.3844 1.795654
118.4070 2.087566
119.0329 2.302962
119.3704 2.487849
121.0642 2.106344
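The estimated standard errors scatter around the theoretical value sigma / sqrt(n). A vectorised sketch of the same simulation is shown below, assuming the population values used above; the seed and the matrix `samples` are additions for illustration.

```r
set.seed(1)  # assumption: fixed seed for reproducibility only

n     <- 56
mu    <- 120
sigma <- 15

# One column per sample: a 56 x 1000 matrix
samples <- replicate(1000, rnorm(n, mu, sigma))

mean.x.values <- colMeans(samples)
se.x.values   <- apply(samples, 2, sd) / sqrt(n)

sd(mean.x.values)  # empirical spread of the sample means
mean(se.x.values)  # average estimated standard error
sigma / sqrt(n)    # theoretical standard error, about 2.00
```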

Sampling distribution

of the mean

T-statistic

\[T_{n-1} = \frac{\bar{x}-\mu}{SE_x} = \frac{\bar{x}-\mu}{s_x / \sqrt{n}}\]

So the t-statistic represents the deviation of the sample mean \(\bar{x}\) from the population mean \(\mu\), expressed in units of the standard error. The sample size enters through the standard error and through the degrees of freedom \(df = n - 1\).

T-value

\[T_{n-1} = \frac{\bar{x}-\mu}{SE_x} = \frac{\bar{x}-\mu}{s_x / \sqrt{n}}\]

t = (mean(x) - mu) / (sd(x) / sqrt(n))
t
[1] 0.3717045
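The hand calculation can be checked against R's built-in `t.test()`, which reports the same statistic for a one-sample test against `mu`. The sketch draws its own sample with an assumed seed, so the value will differ from the one printed above.

```r
set.seed(1)  # assumption: fixed seed for reproducibility only

n     <- 56
mu    <- 120
sigma <- 15
x     <- rnorm(n, mu, sigma)

# t-value by hand and from the built-in one-sample t-test
t.by.hand  <- (mean(x) - mu) / (sd(x) / sqrt(n))
t.built.in <- unname(t.test(x, mu = mu)$statistic)

all.equal(t.by.hand, t.built.in)
```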

Calculate t-values

\[T_{n-1} = \frac{\bar{x}-\mu}{SE_x} = \frac{\bar{x}-\mu}{s_x / \sqrt{n}}\]

t.values = (mean.x.values - mu) / se.x.values
        mean.x.values  mu se.x.values    t.values
 [995,]      123.4837 120    2.404987  1.44853662
 [996,]      117.4054 120    1.873479 -1.38491533
 [997,]      118.9022 120    1.772061 -0.61949536
 [998,]      119.8276 120    2.200470 -0.07835800
 [999,]      120.0845 120    1.629206  0.05185194
[1000,]      120.7386 120    1.986962  0.37170451
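If the simulated t-values follow a t-distribution with n - 1 degrees of freedom, their empirical quantiles should be close to the theoretical ones from `qt()`. A sketch, again with an assumed seed and the population values from above:

```r
set.seed(1)  # assumption: fixed seed for reproducibility only

n  <- 56
mu <- 120
sigma <- 15
df <- n - 1

# 1000 simulated t-values
t.values <- replicate(1000, {
  x <- rnorm(n, mu, sigma)
  (mean(x) - mu) / (sd(x) / sqrt(n))
})

# Empirical versus theoretical quantiles
quantile(t.values, c(.025, .5, .975))
qt(c(.025, .5, .975), df)

# Proportion of t-values beyond the two-sided 5% critical value
mean(abs(t.values) > qt(.975, df))  # should be close to .05
```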

Sampling distribution t-values

T-distribution

So if the population is normally distributed (assumption of normality), the t-distribution describes how the standardized deviations of sample means from the population mean (\(\mu\)) are distributed, given a certain sample size (\(df = n - 1\)).

The t-distribution is therefore different for different sample sizes and converges to a standard normal distribution as the sample size grows.
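The convergence can be seen by comparing critical values. This sketch uses the two-sided 5% level and a handful of degrees of freedom, both chosen for illustration:

```r
df     <- c(2, 5, 10, 30, 55, 1000)
crit.t <- qt(0.975, df)  # critical t-values per degrees of freedom
crit.z <- qnorm(0.975)   # critical value of the standard normal

# The difference shrinks towards 0 as df grows
round(cbind(df, crit.t, difference = crit.t - crit.z), 3)
```

For small samples the t-distribution has heavier tails, so its critical values are larger than those of the standard normal.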

The t-distribution is defined by the probability density function (PDF):

\[\textstyle\frac{\Gamma \left(\frac{\nu+1}{2} \right)} {\sqrt{\nu\pi}\,\Gamma \left(\frac{\nu}{2} \right)} \left(1+\frac{x^2}{\nu} \right)^{-\frac{\nu+1}{2}}\!\]

where \(\nu\) is the number of degrees of freedom and \(\Gamma\) is the gamma function (Wikipedia, 2024).

Warning

Formula not exam material
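For the curious, the density formula can be typed in directly and compared with R's built-in `dt()`. The function name `t.pdf` is hypothetical, and the sketch is only meant for moderate \(\nu\), because `gamma()` overflows for large arguments.

```r
# Density of the t-distribution, written out from the formula
t.pdf <- function(x, nu) {
  gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2)) *
    (1 + x^2 / nu)^(-(nu + 1) / 2)
}

x  <- seq(-4, 4, 0.5)
nu <- 55  # degrees of freedom for n = 56

all.equal(t.pdf(x, nu), dt(x, df = nu))
```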

One or two sided

Two sided

  • \(H_A: \bar{x} \neq \mu\)

One sided

  • \(H_A: \bar{x} > \mu\)
  • \(H_A: \bar{x} < \mu\)
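The choice between a one-sided and a two-sided alternative determines which tail area of the t-distribution counts as the p-value. A sketch using the t-value computed earlier (0.3717045 with df = 55); the variable names are illustrative.

```r
t  <- 0.3717045  # t-value of the single sample above
df <- 55         # n - 1

# Two sided: both tails beyond |t|
p.two.sided <- 2 * pt(abs(t), df, lower.tail = FALSE)

# One sided: only the upper tail, or only the lower tail
p.greater <- pt(t, df, lower.tail = FALSE)
p.less    <- pt(t, df)

c(two.sided = p.two.sided, greater = p.greater, less = p.less)
```

The two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of the two.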

CC BY-NC-SA 4.0

References

Wikipedia. (2024). Student’s t-distribution. Wikipedia, the free encyclopedia. http://en.wikipedia.org/w/index.php?title=Student's%20t-distribution&oldid=1202978121