Day 10

MATH 313: Survey Design and Sampling

Bastola

Introduction to Comparing Estimators

  • Overview: Comparing estimates like means or proportions helps us understand differences between groups or changes over time.
  • Applications: From socio-economic studies to political analyses, comparing estimators is pivotal in interpreting data.

Comparing Means of Two Populations

  • Context: Compare the mean incomes of two ethnic groups using samples from each group.
  • Calculation: \(\bar{X}\) and \(\bar{Y}\) represent the sample means, where comparisons are made by analyzing the difference \(\mu_y - \mu_x\).

Statistical Methodology

  • Formula: The difference in sample means is used as an estimator for the difference in population means.
  • Variance Consideration: Compute sample variances \(s_X^2\) and \(s_Y^2\) to assess variability within each sample.

Example: Political Preference Analysis

  • Scenario: Evaluate whether Republicans are gaining on Democrats by comparing the proportions of voters in two different polls.
  • Comparison Metric: Look at the difference between the proportions voting for each party.

Understanding Differences

  • Insights: These comparisons help identify significant trends or disparities, guiding strategic decisions and policy formulations.
  • Practical Application: Enables researchers and analysts to make informed conclusions based on empirical data.

Estimation of Difference Between Means

  • Estimator of Difference:
    • \(\hat{\Delta} = \bar{y} - \bar{x}\), where \(\hat{\Delta}\) estimates \(\mu_y - \mu_x\).
  • Estimated Variance of the Difference:
    • \(\hat{V}(\hat{\Delta}) = \frac{s_y^2}{n_1} + \frac{s_x^2}{n_2}\), assuming samples are independent.
  • Error Bound on Estimation (\(1-\alpha\) confidence level):
    • \(B = t_{1-\frac{\alpha}{2}, df} \cdot \sqrt{\hat{V}(\hat{\Delta})}\)
    • Degrees of freedom (df): \(df = \frac{\left(\frac{s_y^2}{n_1} + \frac{s_x^2}{n_2}\right)^2}{\frac{(s_y^2/n_1)^2}{n_1-1} + \frac{(s_x^2/n_2)^2}{n_2-1}}\)

Example 1: A drug company has developed a new drug that is designed to reduce high blood pressure. To test the drug, a sample of 15 patients is recruited to take drug. Their systolic blood pressures are reduced by an average of 28.3 millimeters, with a standard deviation of 12.0 millimeters. In addition, another sample of 20 patients takes a standard drug. The blood pressures in this group are reduced by an average of 17.1 millimeters with a standard deviation of 9.0 millimeters. Assume that blood pressure reductions are approximately normally distributed, answer the following questions.

  1. Estimate the difference between the population mean reduction for the new drug and that of the standard drug.



  1. Compute the \(95 \%\) bounds on the error of estimation.

  1. Compute the \(95 \%\) confidence interval for the estimation.
  1. Based on the result of c , can you draw a conclusion that there is a significant difference in mean

Estimation of \(p_1-p_2\) for Independent Samples

  • Estimator for Difference Between Two Proportions:
    • \(\hat{p}_1-\hat{p}_2\), estimates the difference between proportions from two independent samples.
  • Estimated Variance of the Difference:
    • \(\hat{V}(\hat{p}_1-\hat{p}_2) = \frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}\)
    • Ensures the variance captures the sampling variability of each proportion.
  • Bound on the Error of Estimation:
    • \(B = Z_{1-\frac{\alpha}{2}} \cdot \sqrt{\hat{V}(\hat{p}_1-\hat{p}_2)}\)
    • This bound helps gauge the confidence interval for the difference between the two proportions, facilitating decision making.

Estimation of \(p_1-p_2\) for Dependent Samples

  • Estimator for Difference Between Two Proportions:
    • \(\hat{p}_1-\hat{p}_2\), which estimates the difference between two correlated proportions.
  • Estimated Variance of the Difference:
    • \(\hat{V}(\hat{p}_1-\hat{p}_2) = \frac{\hat{p}_1(1-\hat{p}_1)}{n} + \frac{\hat{p}_2(1-\hat{p}_2)}{n} + 2 \frac{\hat{p}_1 \hat{p}_2}{n}\)
    • This variance calculation includes a term for the covariance between \(\hat{p}_1\) and \(\hat{p}_2\), reflecting their dependence.
  • Bound on the Error of Estimation:
    • \(B = Z_{1-\frac{\alpha}{2}} \cdot \sqrt{\hat{V}(\hat{p}_1-\hat{p}_2)}\)
    • This bound provides a measure of uncertainty around the estimated difference, aiding in robust decision making under uncertainty.

Example 2: The notion of banning smoking from the workplace has been around for a long time.A Time/Yankelovich poll of 800 adult Americans carried out on April 6, 1994 (see Time, April 18, 1994) asked:

Should smoking be banned from workplaces, should there be special smoking areas, or should there be no restrictions?


The results are given in the following table:

Nonsmokers Smokers
Banned \(44 \%\) \(8 \%\)
Special areas \(52 \%\) \(80 \%\)
No restrictions \(3 \%\) \(11 \%\)

Based on a sample of approximately 600 non-smokers and 200 smokers, estimate

  1. the true difference between the proportions choosing “banned”
  1. the true difference between the proportions of nonsmokers choosing “banned” and “special areas”.