Day 36

MATH 313: Survey Design and Sampling

Direct vs. Inverse Sampling Techniques

Key Differences

  • Direct Sampling: Tag \(t\) animals and release them, then draw a random second sample of fixed size \(n\) and count the tagged animals recaptured. The number of recaptures is not fixed in advance and depends on the sample size.
  • Inverse Sampling: Continue sampling until a fixed number of previously tagged animals has been recaptured, so the second sample size \(n\) is not fixed in advance. This method can yield more precise information than direct sampling, provided the \(n\) required is small relative to the total population \(N\).
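
The two schemes can be compared with a minimal Python sketch. It uses the usual capture-recapture estimate \(\hat{N} = nt/s\), where \(s\) is the number of tagged animals in the second sample, together with the standard textbook variance approximations for each scheme. These formulas and the symbol \(s\) are assumptions brought in for illustration (they are not stated in these notes), the function name is hypothetical, and the counts are made up.

```python
def capture_recapture_estimate(t, n, s, inverse=False):
    """Estimate population size from a tag-recapture study.

    t: animals tagged in the first sample
    n: size of the second sample
    s: tagged animals found in the second sample
    inverse: True if sampling continued until s recaptures were reached
    """
    if not 0 < s <= min(t, n):
        raise ValueError("need 0 < s <= min(t, n)")
    n_hat = n * t / s
    # Textbook variance approximations (assumed here, not given in the notes):
    # direct sampling divides by s^3, inverse sampling by s^2 * (s + 1).
    denom = s**2 * (s + 1) if inverse else s**3
    var_hat = t**2 * n * (n - s) / denom
    return n_hat, var_hat


# Hypothetical counts, for illustration only.
direct = capture_recapture_estimate(t=100, n=50, s=10)
inverse = capture_recapture_estimate(t=100, n=50, s=10, inverse=True)
print("direct :", direct)
print("inverse:", inverse)
```

For identical counts the inverse-sampling variance estimate is slightly smaller, because \(s + 1\) replaces one factor of \(s\) in the denominator.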

Considerations

  • If nothing is known about \(N\), the initial sample size \(t\) may be poorly chosen, and inverse sampling can then require a disproportionately large second sample \(n\) to reach the target number of recaptures.

Mathematical Model

\[ p_1=\frac{t}{N} \quad \text{and} \quad p_2=\frac{n}{N} \]

  • \(p_1\) (First Sampling Fraction): Represents the proportion of the total population \(N\) sampled initially.
  • \(p_2\) (Second Sampling Fraction): The proportion of the total population sampled during the recapture phase. Together with \(p_1\), it determines the precision of the population estimate.

\[ \frac{V(\hat{N})}{N} \approx \frac{1-p_1}{p_1 p_2} \]

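  • This formula shows that the variance of the estimated population size \(\hat{N}\), relative to the population size \(N\), decreases as either \(p_1\) or \(p_2\) increases: it is inversely proportional to \(p_2\) and falls with \(p_1\) through the factor \((1-p_1)/p_1\).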

Optimizing Sampling Fractions

Example Scenario

  • Context: Estimating a deer population believed to be similar in size to the previous year's (800 to 1000 deer).
  • Objective: Minimize the total sample size while keeping the bound on the error of estimation within 200 deer.

Strategy

  • Choose \(p_1\) and \(p_2\) to minimize total effort for a given precision.
  • Graphical tools help visualize and select optimal sampling fractions.

Optimal Fractions

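  • Example 10.3a: \(p_1 = 31.5\%\) and \(p_2 = 21.7\%\) give a standard deviation of about 100 for \(\hat{N}\) when \(N\) is near 1000, so a two-standard-deviation bound is about 200.
  • Example 10.3b: A similar approach that targets a fixed number of recaptures leads to a nearly identical \(p_2\).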
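
The fractions in Example 10.3a can be checked numerically. The sketch below minimizes \(p_1 + p_2\) subject to \((1-p_1)/(p_1 p_2) = c\), where \(c = V(\hat{N})/N\) is the target relative variance. Solving the constraint for \(p_2\) and setting the derivative of \(p_1 + p_2\) to zero gives \(p_1 = 1/\sqrt{c}\). Two assumptions are made here that the notes do not state explicitly: \(N = 1000\) (the upper end of the range) and a bound equal to two standard deviations. The function names are hypothetical.

```python
import math


def relative_variance(p1, p2):
    """V(N_hat) / N from the approximation (1 - p1) / (p1 * p2)."""
    return (1 - p1) / (p1 * p2)


def optimal_fractions(c):
    """Minimize p1 + p2 subject to (1 - p1) / (p1 * p2) = c.

    Substituting p2 = (1 - p1) / (c * p1) and differentiating
    p1 + p2 with respect to p1 gives p1 = 1 / sqrt(c).
    """
    if c <= 1:
        raise ValueError("c must exceed 1 for fractions below 100%")
    p1 = 1 / math.sqrt(c)
    p2 = (1 - p1) / (c * p1)
    return p1, p2


N = 1000                  # assumed: upper end of the 800-1000 range
bound = 200               # from the stated objective
sd_target = bound / 2     # assumed: bound = 2 standard deviations
c = sd_target**2 / N      # target V(N_hat) / N

p1, p2 = optimal_fractions(c)
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, total = {p1 + p2:.3f}")

# Standard deviation implied by the fractions quoted in Example 10.3a.
sd_quoted = math.sqrt(relative_variance(0.315, 0.217) * N)
print(f"sd at quoted fractions = {sd_quoted:.1f}")
```

Under these assumptions the closed form lands within about 0.1 percentage points of the quoted 31.5% and 21.7%, a gap small enough to be plausibly explained by reading the values off a graph.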

Advanced Techniques and Multistage Sampling

Extending Basic Techniques

  • Beyond two stages: Tag additional untagged animals in subsequent samples.
  • Multistage approaches improve estimator accuracy for \(N\) and are useful in continuous studies.

Considerations

  • Adjust for demographic changes and varying capture probabilities.
  • See Seber (1982, 1986) for advanced methods.

Visual Tool

  • Our interactive Shiny app carries out these calculations and lets you adjust \(p_1\) and \(p_2\) dynamically to see how they interact.

Introduction to Quadrat Sampling

Quadrat sampling is a method used to estimate the number of elements within a fixed region. It estimates the density per unit area from a sample of plots and extrapolates this to the whole region to estimate the total population size.

  • Setup: A total area \(A\) is divided into \(M\) plots (quadrats), each of area \(a\), so \(A = M \cdot a\). A random sample of \(m\) of these plots is selected.
  • Data Collection: Count the number of individuals in each randomly selected plot or quadrat.

Estimating Density and Population Size

  • Density Estimation: \[ \hat{\lambda} = \frac{\bar{n}}{a} \] where \(\bar{n}\) is the average count per plot from the sampled quadrats.
  • Population Size Estimation: \[ \hat{N} = \hat{\lambda} \cdot A \]

Variance and Confidence Interval

  • Variance of the Density Estimator: \[ \hat{V}(\hat{\lambda})=\frac{1}{a^2} \cdot \frac{S_n^2}{m} \] where \(S_n^2=\frac{\sum_{i=1}^m(n_i-\bar{n})^2}{m-1}\)

  • Population Size Variance: \[ \hat{V}(\hat{N})=A^2 \cdot \hat{V}(\hat{\lambda}) \]

  • Confidence Interval: \[ \hat{N} \pm B \] where \(B = t_{\frac{\alpha}{2}, m-1} \cdot \sqrt{\hat{V}(\hat{N})}\) is the bound on the error of estimation.
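
The formulas above can be collected into one small function. This is a sketch rather than a reference implementation: the function name, the `crit` argument, and the example counts are hypothetical. The critical value \(t_{\alpha/2, m-1}\) is passed in by the caller because the Python standard library has no t quantile function; a t-table supplies it (for example 2.776 for \(m = 5\) at the 95% level).

```python
import math
import statistics


def quadrat_estimate(counts, a, A, crit):
    """Density and population estimates from m sampled quadrats.

    counts: number of individuals in each sampled quadrat
    a:      area of one quadrat
    A:      total area of the region (same units as a)
    crit:   critical value t_{alpha/2, m-1}, supplied by the caller
    """
    m = len(counts)
    n_bar = statistics.mean(counts)
    s2 = statistics.variance(counts)      # sample variance, divisor m - 1
    lam_hat = n_bar / a                   # density estimate
    var_lam = s2 / (a**2 * m)             # estimated variance of density
    N_hat = lam_hat * A                   # population size estimate
    var_N = A**2 * var_lam                # estimated variance of N_hat
    B = crit * math.sqrt(var_N)           # bound on the error of estimation
    return {"lam_hat": lam_hat, "var_lam": var_lam,
            "N_hat": N_hat, "var_N": var_N, "bound": B}


# Hypothetical data: 5 quadrats of 0.5 units each in a region of 100 units.
result = quadrat_estimate([2, 0, 3, 1, 4], a=0.5, A=100, crit=2.776)
print(result)
```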

Example 1: Estimating Fire Ant Density

In Florida, a study to estimate the density of fire ant hills used 50 quadrats, each 16 \(\text{m}^2\) in area. The number of ant hills counted in each quadrat is summarized in the frequency table below, from which the density of ant hills per unit area can be estimated.

Number of hills    Frequency
0                  13
1                  8
2                  12
3                  10
4                  5
5                  2
Total              50
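
The frequency table can be turned into a density estimate using the formulas from the Variance and Confidence Interval section. The sketch below is one way to do the arithmetic. The multiplier 2 for the bound is an assumption that approximates \(t_{0.025, 49}\) (roughly 2.01), and the variable names are illustrative.

```python
import math

# Frequency table from the example: {hills per quadrat: number of quadrats}
freq = {0: 13, 1: 8, 2: 12, 3: 10, 4: 5, 5: 2}
a = 16.0                                   # quadrat area in square metres

m = sum(freq.values())                     # number of quadrats sampled
total_hills = sum(k * f for k, f in freq.items())
n_bar = total_hills / m                    # mean hills per quadrat

# Sample variance of the counts, divisor m - 1
ss = sum(f * (k - n_bar) ** 2 for k, f in freq.items())
s2 = ss / (m - 1)

lam_hat = n_bar / a                        # hills per square metre
var_lam = s2 / (a**2 * m)
bound = 2 * math.sqrt(var_lam)             # assumed multiplier of 2

print(f"m = {m}, mean = {n_bar:.2f}, S^2 = {s2:.3f}")
print(f"density = {lam_hat:.3f} per m^2, bound = {bound:.3f}")
```

With these counts the mean is 92/50 = 1.84 hills per quadrat, so the estimated density is 1.84/16 = 0.115 hills per \(\text{m}^2\).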

Example 2: Estimating Disease Prevalence in Trees

The density of trees having fusiform rust on a southern-pine plantation of 200 acres is to be estimated from a sample of \(m=10\) quadrats of 0.5 acre each. The ten sampled plots had an average of 2.8 infected trees per quadrat.

  1. Estimate the density of infected trees and place a \(95 \%\) bound on the error of estimation.
  2. Estimate the total number of infected trees and place a \(95 \%\) bound on the error of estimation.
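
As a starting point, the point estimates follow directly from the density and population size formulas, with \(\bar{n} = 2.8\), \(a = 0.5\) acre, and \(A = 200\) acres:

\[ \hat{\lambda} = \frac{\bar{n}}{a} = \frac{2.8}{0.5} = 5.6 \text{ infected trees per acre}, \qquad \hat{N} = \hat{\lambda} \cdot A = 5.6 \times 200 = 1120 \text{ infected trees} \]

The bounds additionally require the sample variance \(S_n^2\) of the ten quadrat counts, which is not reported in the problem statement above.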