Day 33

MATH 313: Survey Design and Sampling

Introduction to Two-Stage Cluster Sampling

  • A method to handle large populations efficiently.
  • Involves two phases of random sampling:
    1. Selecting clusters from a population.
    2. Sampling within these selected clusters.

Case Study Setup

Sampling Strategy

  • Population: \(N = 12,000\) students (Total population elements).
  • Clusters: Students grouped by their classroom during a prime class hour.
  • Sampled Units: \(m = 4\) randomly selected classes (Clusters in the sample).

Sampling Objective

  • Estimate the average monthly entertainment expenditure per student.
  • Use class as a natural cluster for sampling.

How to Conduct Two-Stage Cluster Sampling

Steps Involved

  1. Selecting Clusters: Randomly select \(m\) clusters (e.g., classes).
  2. Sampling Within Clusters: Randomly select individuals within these clusters.

Practical Example

  • First Stage: Choose 4 classes from a schedule.
  • Second Stage: Interview about 10% of students in each selected class.

Analyzing Data from Two-Stage Cluster Sampling

  • Estimator: Calculate sample means within each cluster.
  • Aggregation: Combine these means to estimate the population mean.

\[ \hat{\mu} = \left(\frac{M}{\bar{N}}\right) \frac{\sum_{i=1}^{m} n_i \bar{y}_i}{m} \]

  • \(\hat{\mu}\): Estimated population mean.
  • \(M\): Total clusters in the population.
  • \(n_i\): Elements in cluster \(i\).
  • \(\bar{y}_i\): Sample mean from cluster \(i\).
  • \(m\): Number of sampled clusters.
  • \(\bar{N} = \frac{N}{M}\): Mean cluster size in the population.

Introduction to Population Estimation


Understanding population size is crucial in fields like ecology, event management, and quality control. This chapter explores various methods for estimating population sizes across different scenarios.

Direct Sampling Method

Direct sampling involves tagging and recapturing individuals from a wildlife population to estimate the total population size using the formula: \[ \hat{N} = \frac{t}{\hat{p}} \] where \(t\) is the number of tagged individuals and \(\hat{p}\) is the proportion of tagged individuals observed in the second sample. This method is foundational in ecological studies where precise population counts are needed.

Direct Sampling Method

Inverse Sampling Method

Inverse sampling is similar to direct sampling but does not predetermine the size of the second sample. Instead, sampling continues until a preset number of tagged individuals is recaptured. This method provides an alternative and often more flexible approach to estimating population size: \[ \hat{N} = \frac{t}{\hat{p}} \]

Inverse Sampling Method

Density Estimation from Quadrat Samples

When individuals are not mobile or are evenly distributed, density estimation via quadrat samples becomes effective. By randomly selecting plots and counting individuals within them, density per unit area is estimated and scaled up to the entire area: \[ \hat{\lambda} = \frac{\sum_{i=1}^{m} n_i}{m \cdot a}, \quad \hat{N} = \hat{\lambda} \times A \] where \(\hat{\lambda}\) is the estimated density, \(m\) is the number of sampled plots, \(n_i\) is the count in plot \(i\), and \(A\) is the total area.

Density Estimation from Quadrat Samples

Density Estimation from Stocked Quadrats

This method adapts the traditional density estimation by focusing on the presence or absence of the species rather than exact counts. Useful in scenarios where counting exact numbers is impractical: \[ \hat{\lambda} = -\frac{1}{a} \ln \left(\frac{y}{m}\right), \quad \hat{N} = \hat{\lambda} \times A \] where \(y\) is the number of unstocked quadrats, and \(a\) is the area of each quadrat.

Density Estimation from Stocked Quadrats

Adaptive Sampling

Adaptive sampling involves adjusting the sampling strategy based on initial findings. This technique is especially beneficial in heterogeneous environments where certain areas may exhibit significantly higher density than others: \[ \hat{N} = \hat{\mu} \times M, \quad \hat{\mu} = \frac{1}{m} \sum_{i=1}^{m} \frac{y_i}{m_i} \] where \(m_i\) is the number of additional cells sampled around the initial cell if a species is found, enhancing the estimate’s accuracy.

Adaptive Sampling

Example 1: Wildlife Conservation

To estimate the population of a bird species, a conservation team uses direct sampling by tagging and recapturing birds. The initial sample sizes and recapture rates provide the data needed to estimate the total population.

Wildlife Conservation: Using direct sampling, the team estimates a bird species’ population at 1200 based on recapture rates of tagged birds.

Example 2: Event Attendance

An event organizer estimates attendee numbers using adaptive sampling techniques by initially counting in a small area and adjusting counts based on crowd density observed in various venue areas.

Event Attendance: By employing adaptive sampling at a music festival, organizers estimate 25,000 attendees, adjusting their counts based on varying crowd densities across different sections of the venue.