Sampling & Central Limit Theorem | CFA Level I Quantitative Methods
Importance of Sampling and Estimation
While we often apply descriptive statistics to understand an entire population, studying the whole population may not always be practical or possible. In such cases, we select a sample from the population and compute sample statistics, which serve as estimates of the true population parameters.
Simple Random Sampling
One way to fairly sample from a population is through simple random sampling. The goal is to ensure that each element has an equal probability of being selected. For example, an analyst could use a random number generator to select accounts from a large wealth management firm’s database.
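The idea can be sketched in a few lines of Python. The account IDs below are illustrative, not real data; `random.sample` draws without replacement, so every account has an equal probability of being selected.

```python
import random

# Hypothetical account IDs for a wealth management firm (illustrative data)
accounts = list(range(1, 101))  # accounts numbered 1..100

random.seed(42)  # fixed seed so the sketch is reproducible
# random.sample draws without replacement; every account is equally likely
sample = random.sample(accounts, k=10)
print(sample)
```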
Systematic Sampling
In some cases, a more systematic approach may be desired. One method is systematic sampling, which involves selecting every nth element from the population. For instance, an analyst could select 10% of the accounts by choosing accounts with the same last digit of the account number.
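A minimal sketch of systematic sampling, assuming a randomly chosen starting point followed by every kth element (here k = 10, giving roughly 10% of the population):

```python
import random

def systematic_sample(population, k):
    """Select every k-th element after a random starting point."""
    start = random.randrange(k)
    return population[start::k]

accounts = list(range(1, 101))  # illustrative account IDs
random.seed(0)
sample = systematic_sample(accounts, k=10)  # roughly 10% of the population
print(sample)
```

The random start matters: always beginning at the first element would make some elements impossible to select.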
Stratified Random Sampling
When simple random sampling isn’t ideal, stratified random sampling might be a better option. The population is divided into subgroups (strata) based on one or more distinguishing characteristics, and a random sample is taken from each subgroup in proportion to that subgroup’s share of the population.
A firm has 100 accounts spread across 3 investment styles: 20% low risk, 50% moderate risk, and 30% high risk. The analyst wants to draw a sample of 10 accounts that reflects the distribution of investment styles. How can they do this?
Using stratified random sampling, the analyst would select 2 accounts from the low-risk group, 5 from the moderate-risk group, and 3 from the high-risk group. This ensures the sample reflects the distribution of investment styles in the entire population.
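The stratified draw above can be sketched as follows. The account labels are hypothetical placeholders; each stratum contributes a number of accounts proportional to its share of the population.

```python
import random

# Hypothetical strata matching the example: 20 low-, 50 moderate-, 30 high-risk accounts
strata = {
    "low": [f"L{i}" for i in range(20)],
    "moderate": [f"M{i}" for i in range(50)],
    "high": [f"H{i}" for i in range(30)],
}

total = sum(len(accounts) for accounts in strata.values())
sample_size = 10

random.seed(1)
sample = []
for name, accounts in strata.items():
    # draw from each stratum in proportion to its share of the population
    n = round(sample_size * len(accounts) / total)
    sample.extend(random.sample(accounts, n))

print(sample)  # 2 low-, 5 moderate-, 3 high-risk accounts
```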
Cluster Sampling
Cluster sampling is based on subsets of a population, assuming each cluster is representative of the overall population. Clusters could be accounts managed by sub-branches or grouped by geographic location. There are two types of cluster sampling:
- One-stage cluster sampling: Selecting a number of clusters and using all observations in those clusters as the sample.
- Two-stage cluster sampling: Randomly selecting a subset of observations from the chosen clusters.
While cluster sampling can be more time- and cost-efficient, it may yield lower accuracy compared to other sampling methods.
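The two-stage variant can be sketched like this. The branch groupings are hypothetical; stage one randomly picks clusters, and stage two randomly picks observations within each chosen cluster.

```python
import random

# Hypothetical clusters: accounts grouped by branch (illustrative data)
clusters = {
    "branch_a": list(range(0, 30)),
    "branch_b": list(range(30, 60)),
    "branch_c": list(range(60, 90)),
    "branch_d": list(range(90, 120)),
}

random.seed(2)
# Stage 1: randomly choose a subset of clusters
chosen = random.sample(list(clusters), k=2)
# Stage 2: randomly choose observations within each chosen cluster
sample = [obs for c in chosen for obs in random.sample(clusters[c], k=5)]
# One-stage cluster sampling would instead keep every observation in the chosen clusters
print(sample)
```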
Probability vs. Non-Probability Sampling
Probability sampling gives each population member a known, nonzero chance of being selected, which tends to produce a representative sample. In contrast, non-probability sampling relies on factors other than probability, such as cost and ease of access, or the researcher’s subjective judgment. Non-probability sampling methods include convenience sampling and judgmental sampling.
Convenience sampling involves selecting samples based on the data’s accessibility. While data can be collected quickly at a low cost, the sample may not be representative, limiting sampling accuracy.
Judgmental sampling involves handpicking samples based on a researcher’s knowledge and professional judgment. This method allows researchers to target specific populations but may be affected by researcher bias, leading to skewed results.
When auditing financial statements, seasoned auditors use judgmental sampling to select accounts or transactions that provide sufficient audit coverage. Why might they choose this method?
Judgmental sampling is suitable in time-sensitive situations or when the researcher’s expertise is crucial. In this case, auditors can apply their judgment to efficiently examine accounts and transactions that are most relevant to the audit.
Remember: Although non-probability sampling methods can be more cost-effective and efficient, probability sampling typically yields more accurate and reliable results. Understanding the trade-offs between these methods is essential for effective statistical analysis in the world of finance.
Sampling Error and Sampling Distribution
When we sample from a population, we attempt to estimate the true population mean (μ) and standard deviation (σ) by calculating the sample mean (x̄) and standard deviation (s). The difference between the sample statistic and population parameter is the sampling error.
Central Limit Theorem
The Central Limit Theorem states that for simple random samples of size n from a population with mean μ and variance σ², the sampling distribution of the sample mean approaches a normal distribution with mean μ and variance σ²/n. As we increase n, the variance of the sampling distribution gets smaller, making the distribution narrower and our estimates more accurate.
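A quick simulation makes the theorem concrete. This sketch draws many samples of size n = 30 from a decidedly non-normal population (uniform on [0, 1], which has mean 0.5 and standard deviation √(1/12)) and checks that the sample means cluster around μ with spread close to σ/√n:

```python
import random
import statistics

# Simulate the CLT with a non-normal (uniform) population
random.seed(3)
population_mean = 0.5
population_sd = (1 / 12) ** 0.5  # standard deviation of uniform(0, 1)
n = 30

# Draw many samples of size n and record each sample mean
sample_means = [
    statistics.mean(random.random() for _ in range(n)) for _ in range(10_000)
]

# The sample means cluster around mu, with spread close to sigma / sqrt(n)
print(statistics.mean(sample_means))   # close to 0.5
print(statistics.stdev(sample_means))  # close to sigma / sqrt(30), about 0.053
```

Even though the underlying population is flat rather than bell-shaped, a histogram of `sample_means` would look approximately normal, which is exactly the CLT’s claim.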
Applying the Central Limit Theorem
A stock has an average daily return of 0.18% and a standard deviation of returns of 0.95%. An analyst takes a sample of 30 random observations. What are the mean and standard deviation of the sampling distribution?
Using the Central Limit Theorem, the mean of the sampling distribution equals the population mean, 0.18%, and its standard deviation is σ/√n = 0.95% / √30 ≈ 0.17%.
An analyst takes 100 random samples of daily returns for another stock, finding a sample mean of 0.23% and a sample standard deviation of 1.19%. Calculate and interpret the mean and standard deviation of the sampling distribution.
Mean of the sampling distribution: 0.23%
Standard deviation of the sampling distribution: 1.19% / √100 = 0.12%
This means that if we took all possible samples of size 100, the mean of those sample means would be 0.23%, and their standard deviation would be 0.12%.
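The arithmetic above reduces to the standard error formula, s/√n. A quick check with the figures from the example:

```python
import math

# Standard error of the sample mean: s / sqrt(n), using the example's figures
sample_mean = 0.23  # % per day
sample_sd = 1.19    # % per day
n = 100

standard_error = sample_sd / math.sqrt(n)
print(sample_mean)               # mean of the sampling distribution: 0.23%
print(round(standard_error, 2))  # 0.12%
```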
Remember these three key points regarding CLT:
- The sampling distribution will be approximately normal when the sample size is at least 30, regardless of the shape of the underlying population.
- The mean of the sampling distribution is equal to the mean of the population.
- The variance of the sampling distribution is equal to the population variance divided by the sample size.
That’s it for this lesson! In the next lesson, we’ll tackle the estimation problem. See you there!