Day 3

MATH 313: Survey Design and Sampling

Bastola

Review of Key Statistical Concepts

Core Parameters and Statistics
- Expected Value
- Variance
- Standard Deviation
Concept of Sampling
- Using sample statistics to approximate population parameters.

Understanding Sampling Distribution

Accuracy of Approximation
- How accurately do sample statistics estimate population parameters?
Need for Further Exploration
- Introduction to the concept of sampling distribution to assess the reliability of statistical approximations.

Sampling Distribution

Definition

Sampling Distribution: The sampling distribution of a sample statistic \(\hat{\theta}\) calculated from a sample of \(n\) measurements is the probability distribution of that statistic \(\hat{\theta}\).

Estimator

Definition

An estimator is a function that uses sample data to approximate a population parameter. Commonly, the sample mean (\(\bar{y}\)) serves as an estimator for the population mean (\(\mu\)).

Example: Estimating Average Height

Consider estimating the average height (\(\mu\)) of students. A sample of 5 students yielded these heights in inches: 60, 62, 65, 58, 66.

Calculation

The sample mean (\(\bar{y}\)) is calculated as:

\[ \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{60 + 62 + 65 + 58 + 66}{5} = 62.2 \text{ inches} \]

This computed \(\bar{y} = 62.2\) inches serves as our estimate for the average student height (\(\mu\)).
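The same calculation can be checked in a few lines of Python. This is only a minimal sketch using the standard library; the variable names `heights` and `y_bar` are chosen here for illustration.

```python
from statistics import mean

# Heights in inches of the 5 sampled students (from the example above)
heights = [60, 62, 65, 58, 66]

# Sample mean: sum of the observations divided by the sample size n
y_bar = mean(heights)
print(f"n = {len(heights)}, sample mean = {y_bar} inches")
```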

Properties of the Sampling Distribution of \(\bar{y}\)

The mean of the sampling distribution of \(\bar{y}\) equals the mean of the sampled population. That is, \(E(\bar{y})=\mu\).
The variance of the sampling distribution of \(\bar{y}\) is given by

\[V(\bar{y})=\frac{\sigma^2}{n}\]

The standard deviation of the sampling distribution is \(\sigma_{\bar{y}}=SD(\bar{y})=\sqrt{\frac{\sigma^2}{n}}=\frac{\sigma}{\sqrt{n}}\). It is often referred to as the standard error.
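These two properties can be illustrated by simulation. The sketch below assumes a hypothetical normal population with \(\mu=10\) and \(\sigma=2\) and samples of size \(n=25\); these values, the seed, and the number of repetitions are illustrative choices, not part of the course material. It draws many samples, records each sample mean, and compares the mean and variance of those sample means with \(\mu\) and \(\sigma^2/n\).

```python
import random
from statistics import mean, pvariance

random.seed(313)  # fixed seed so the run is reproducible

# Hypothetical population and sample size (assumptions for illustration)
mu, sigma, n = 10.0, 2.0, 25
reps = 5000  # number of repeated samples

def sample_mean():
    """Draw one random sample of size n and return its mean."""
    return sum(random.gauss(mu, sigma) for _ in range(n)) / n

sample_means = [sample_mean() for _ in range(reps)]

mean_of_means = mean(sample_means)     # should be close to mu
var_of_means = pvariance(sample_means) # should be close to sigma^2 / n

print(f"mean of sample means:     {mean_of_means:.3f} (theory: {mu})")
print(f"variance of sample means: {var_of_means:.4f} (theory: {sigma**2 / n})")
```

With more repetitions the simulated values should settle closer to the theoretical ones, although any single run will differ slightly.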

Central Limit Theorem

Theorem: Consider a random sample of \(n\) observations selected from a population (any population/probability distribution) with mean \(\mu\) and variance \(\sigma^2\). Then, when \(n\) is sufficiently large, the sampling distribution of \(\bar{y}\) is approximately a normal distribution with mean \(E(\bar{y})=\mu\) and standard deviation \(SD(\bar{y})=\frac{\sigma}{\sqrt{n}}\).

The larger \(n\) is, the more closely the sampling distribution will approximate a normal shape.
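One way to see this effect is to sample from a clearly non-normal population and watch the shape of the sampling distribution change with \(n\). The sketch below assumes an exponential population with rate 1 (a strongly right-skewed distribution, chosen here only for illustration) and uses the skewness of the simulated sample means as a rough measure of departure from a symmetric normal shape. The helper names `skewness` and `sample_mean_exponential` are hypothetical.

```python
import random
from statistics import mean, pstdev

random.seed(313)  # fixed seed so the run is reproducible

def skewness(values):
    """Moment-based skewness: average cubed standardized deviation."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_mean_exponential(n):
    """Mean of one random sample of size n from an Exponential(1) population."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

reps = 4000  # number of repeated samples per sample size
skew_by_n = {}
for n in (2, 20, 200):
    means = [sample_mean_exponential(n) for _ in range(reps)]
    skew_by_n[n] = skewness(means)
    print(f"n = {n:>3}: skewness of sample means = {skew_by_n[n]:.3f}")
```

A normal distribution has skewness 0, so the skewness shrinking toward 0 as \(n\) grows is consistent with the theorem. It is only one summary of shape, not a formal test of normality.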

Recall the simulation (Slide 4)

Suppose we have selected a random sample of \(n=50\) observations from a population with mean \(\mu=4\) and variance \(\sigma^2=8\).

a. What are the mean \(E(\bar{y})\) and the standard deviation (standard error) \(\sigma_{\bar{y}}=\sqrt{V(\bar{y})}\) of the sampling distribution of the sample mean \(\bar{y}\)?

b. Using the sample data of size 50, compute the mean and standard deviation of \(\bar{y}\). Compare the results with part a.

c. What is the probability that the sample mean is greater than 5?
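The first and last questions above can be worked through numerically. This sketch assumes the Central Limit Theorem approximation is adequate for \(n=50\), so that \(\bar{y}\) is treated as approximately normal with mean \(\mu\) and standard deviation \(\sigma/\sqrt{n}\). The upper-tail normal probability is computed with the complementary error function from the standard library; the variable names are illustrative.

```python
import math

# Values given in the problem
mu, sigma2, n = 4.0, 8.0, 50

# Mean and standard error of the sampling distribution of the sample mean
expected_y_bar = mu                  # E(y_bar) = mu
std_error = math.sqrt(sigma2 / n)    # sqrt(sigma^2 / n)

# Standardize the cutoff 5, then take the upper-tail area under the
# standard normal curve: P(Z > z) = 0.5 * erfc(z / sqrt(2))
z = (5 - expected_y_bar) / std_error
prob_greater_than_5 = 0.5 * math.erfc(z / math.sqrt(2))

print(f"E(y_bar) = {expected_y_bar}")
print(f"standard error = {std_error:.3f}")
print(f"z = {z:.2f}")
print(f"P(y_bar > 5) is approximately {prob_greater_than_5:.4f}")
```

Under these assumptions the standard error is \(\sqrt{8/50}=0.4\), the cutoff 5 lies \(z=2.5\) standard errors above the mean, and the resulting probability is about 0.006.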