set.seed(1)
norm = rnorm(10000000, 10, 50) # "population" which is unknown
norm1 = sample(norm, 1000, replace = FALSE) # Random sample
norm2 = replicate(10000, {
  x = sample(norm1, 1000, replace = TRUE)
  sd(x)
})
mean(norm2)
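This returns a mean SD of 49.91. I thought the bootstrap would return an estimate closer to the population value than the sample does. Am I doing something wrong?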
You are calculating the sd of the sample (i.e. 50), not the bootstrap sd.
set.seed(1)
norm <- rnorm(10000000, 10, 50) # population which is unknown
# This line is not needed
# norm1 = sample(norm, 1000, replace = FALSE) # Random sample
norm2 <- replicate(10000, {
  x <- sample(norm, 1000, replace = TRUE) # Take a sample of norm (not of norm1)
  x # Don't calculate the sd here
})
# Calculate the mean of each sample
sample_means <- apply(norm2, 2, mean)
# Calculate the sd of the sample means
sd(sample_means)
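To make the distinction concrete, here is a minimal sketch (not the original poster's code) that resamples the observed sample itself and compares three quantities: the sd of the sample, the mean of the bootstrap sds, and the sd of the bootstrap means. The smaller population size, the 2000 replicates, and the names `population`, `observed`, `boot_stats`, `sample_sd`, `mean_boot_sd`, `boot_se_mean` and `analytic_se` are assumptions made here for speed and illustration.

```r
set.seed(1)

# Assumption: a smaller simulated population than in the question, for speed
population <- rnorm(100000, mean = 10, sd = 50)
observed <- sample(population, 1000, replace = FALSE)

# Resample the observed sample (not the population) with replacement.
# Each replicate returns the mean and the sd of one bootstrap resample,
# so boot_stats is a 2 x 2000 matrix (row 1 = means, row 2 = sds).
boot_stats <- replicate(2000, {
  x <- sample(observed, length(observed), replace = TRUE)
  c(mean(x), sd(x))
})

sample_sd    <- sd(observed)          # the sample's own estimate of the population sd
mean_boot_sd <- mean(boot_stats[2, ]) # centres on sample_sd, not on the population sd
boot_se_mean <- sd(boot_stats[1, ])   # bootstrap standard error of the mean
analytic_se  <- sample_sd / sqrt(length(observed))

print(c(sample_sd = sample_sd, mean_boot_sd = mean_boot_sd,
        boot_se_mean = boot_se_mean, analytic_se = analytic_se))
```

In this sketch the mean of the bootstrap sds tracks `sample_sd`, because the resamples can only reflect the data they were drawn from. The spread of the bootstrap means is a different quantity: it should land near `sample_sd / sqrt(n)`, which is about 50 / sqrt(1000) ≈ 1.58 for a sample sd near 50.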