Before discussing bootstrap samples, read about sampling with replacement and sampling without replacement.
A bootstrap sample is a random sample drawn with replacement. Bootstrapping is a resampling technique that repeats this draw many times: it generates N samples, and each sample is the same size as the population.
Let’s say the population is {1,3,6,2,9}. To build one bootstrap sample, we draw five values from it at random, one at a time, putting each value back before the next draw. Because values are put back, the same value can be picked more than once and other values may not be picked at all. One possible result is {1,9,9,1,1}: only 1 and 9 were drawn, and their repeats fill the sample up to the population size of five. This is bootstrap sampling.
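The draw described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `bootstrap_sample` and the seed value are assumptions made for the example, and the values printed depend on the seed.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) values from data, with replacement."""
    # random.Random.choices samples with replacement, so repeats are possible.
    return rng.choices(data, k=len(data))


population = [1, 3, 6, 2, 9]
rng = random.Random(0)  # arbitrary seed, used only to make the run repeatable
sample = bootstrap_sample(population, rng)
print(sample)  # five values taken from the population, possibly with repeats
```

Each call returns a new sample of the same size as the population, so calling the function N times gives N bootstrap samples.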
Bootstrap Percentile Method.
The percentile method estimates a confidence interval for a statistic, such as the mean, from the values of that statistic across many bootstrap samples.
Below are four bootstrap samples from the population {1,3,6,2,9} with their corresponding means. The population mean is 4.2.
- {1,9,9,1,1} = 4.2
- {3,6,3,3,3} = 3.6
- {2,1,2,2,1} = 1.6
- {2,6,6,6,2} = 4.4
Now arrange the means from lowest to highest: {1.6, 3.6, 4.2, 4.4}. For a 90% confidence interval, 100 - 90 = 10% of the values are excluded, so we take the value at the 5th percentile on the lower side and the value at the 95th percentile on the upper side. In the same way, for a 95% confidence interval, 100 - 95 = 5%, so we take the values at the 2.5th and 97.5th percentiles. With only four means these percentiles cannot be located precisely. As a rough illustration, dropping the lowest and the highest mean leaves an interval of 3.6 to 4.2. In practice a large number of bootstrap samples, typically thousands, is generated so that the percentiles are well defined.
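The percentile method can be sketched as follows. This is an illustrative example under stated assumptions: the function names `percentile` and `bootstrap_percentile_ci`, the 10,000 resamples, the seed, and the linear-interpolation rule for percentiles are all choices made here, not part of the method's definition. The exact bounds vary with the seed and the interpolation rule.

```python
import random


def percentile(sorted_values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def bootstrap_percentile_ci(data, confidence=90, n_resamples=10_000, seed=0):
    """Percentile confidence interval for the mean of data."""
    rng = random.Random(seed)  # arbitrary seed for repeatable runs
    n = len(data)
    # Mean of each bootstrap sample, sorted from lowest to highest.
    means = sorted(sum(rng.choices(data, k=n)) / n for _ in range(n_resamples))
    tail = (100 - confidence) / 2  # e.g. 5% on each side for a 90% interval
    return percentile(means, tail), percentile(means, 100 - tail)


population = [1, 3, 6, 2, 9]
low, high = bootstrap_percentile_ci(population, confidence=90)
print(low, high)
```

Passing `confidence=95` cuts 2.5% from each side instead of 5%.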
Note:
Typically about 1/3 of the original data does not end up in a given bootstrap sample (for large data sets the fraction approaches 1/e, about 36.8%). Each bootstrap sample therefore contains only about two thirds of the distinct original observations, and the remaining positions are filled by repeated values.
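The fraction in the note can be checked directly. The probability that a given observation is never picked in n draws with replacement is (1 - 1/n)^n, which approaches 1/e (about 0.368) as n grows. The sketch below compares that formula with a small simulation; the function names, the number of resamples, and the seed are assumptions made for illustration.

```python
import random


def expected_left_out_fraction(n):
    """Probability that one observation is never drawn in n draws with replacement."""
    return (1 - 1 / n) ** n


def simulated_left_out_fraction(n, n_resamples=2_000, seed=0):
    """Average share of observations missing from a bootstrap sample of size n."""
    rng = random.Random(seed)  # arbitrary seed for repeatable runs
    indices = list(range(n))
    missing = 0
    for _ in range(n_resamples):
        drawn = set(rng.choices(indices, k=n))  # distinct observations picked
        missing += n - len(drawn)
    return missing / (n * n_resamples)


for n in (5, 100, 1000):
    print(n, round(expected_left_out_fraction(n), 3), round(simulated_left_out_fraction(n), 3))
```

For the five-value population used in this article the expected fraction is 0.8^5, roughly 0.33, which is where the "about one third" rule of thumb comes from.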