Simple random sampling
The fairest way to pick a sample, explained with a worked example, the formulas, and a free draw tool you can use right now.
Quick answer
Simple random sampling is a method where every member of a population has an equal, independent chance of being chosen, and every possible sample of size n is equally likely. You assign each unit a number, then use a random number generator or table to pick the sample. It removes selection bias and lets you generalise to the whole population.
- Each unit's chance of selection equals n / N (sample size divided by population size).
- Best when you have a complete list of the population (a sampling frame).
- Almost always done without replacement so no unit is picked twice.
If you have ever drawn names from a hat, you already understand the core idea. Every name has the same shot at being pulled, and what comes out is a fair slice of what went in. That is the whole promise of simple random sampling, and it is why it sits underneath almost every other sampling method you will meet in a methods course or a journal article.
I learned to respect it the hard way. On my first field season studying sloth bear sign, I picked survey plots that "looked promising" and ended up with a dataset that flattered my hypothesis and convinced no reviewer. The fix was boring and powerful: number every plot, let a random draw choose them, and stop arguing with my own bias. This guide walks through what the method is, when to use it, the exact formulas, two fully worked examples, and a small tool that draws a reproducible sample for you.
What is simple random sampling?
Simple random sampling is a probability sampling technique in which every unit in a population has an equal and independent chance of being selected, so each possible sample of a given size is equally likely. It is the baseline against which other designs are measured.
Two conditions have to hold for a draw to count as a true simple random sample. First, every unit must have the same selection probability. Second, the selections must not depend on one another beyond the no-repeat rule. When both hold, your sample is, on average, a miniature of the population, and the maths of confidence intervals and significance tests applies cleanly. The unbiased estimators behind that maths trace back to classic work on selection without replacement from a finite universe, and the same logic underpins almost every survey design used today.
The opposite of this is convenience sampling, where you grab whoever or whatever is easiest. That is fast and almost always biased. The statistical definition of simple random sampling is built precisely to rule that bias out.
When to use simple random sampling (and when not to)
Use simple random sampling when you have a complete list of your population and no strong reason to treat any subgroup differently. It is the right default for homogeneous populations and small to medium studies.
It starts to creak in three situations. If the population is spread over a huge area, a random draw scatters your sites and travel eats your budget. If an important subgroup is rare, a pure random draw might miss it entirely. And if there is no list to draw from, you cannot do it at all. Knowing when to use simple random sampling means knowing these limits. Here is how it compares to the common alternatives.
| Method | How it works | Best when | Main weakness |
|---|---|---|---|
| Simple random | Number every unit, draw at random | You have a full list and a fairly uniform population | Can miss small subgroups; scatters field sites |
| Systematic | Pick every k-th unit after a random start | Units are in a list or a line (transects, files) | Bias if the list has a hidden cycle |
| Stratified | Split into strata, sample within each | Clear subgroups you want represented | Needs strata info up front |
| Cluster | Randomly pick whole groups, then sample inside | Population is geographically clumped | Less precise per unit sampled |
| Convenience | Take whatever is easiest | Pilots and quick checks only | Biased; cannot generalise |
When you compare simple random sampling vs systematic sampling, the deciding question is usually whether your list has a repeating pattern. If it does, systematic sampling can lock onto that cycle and mislead you. When subgroups matter, jump to stratified sampling instead. National statistics offices document these designs in their methodology guides, including Statistics Canada, the Australian Bureau of Statistics, and the UK's Office for National Statistics.
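The hidden-cycle problem is easier to see with numbers. The sketch below uses a made-up frame of 120 plots in which every tenth plot scores high (the values, the interval, and the function names are all assumptions for illustration). It compares systematic samples taken at an interval that matches the cycle with one seeded simple random sample.

```python
import random

# Hypothetical frame: 120 plots, every 10th one scores 9, the rest score 1.
values = [9 if (i % 10) == 9 else 1 for i in range(120)]
true_mean = sum(values) / len(values)

def systematic_sample(items, k, start):
    """Every k-th item, beginning at index `start` (0 <= start < k)."""
    return items[start::k]

def simple_random_sample(items, n, seed):
    """n items drawn without replacement, seeded so the draw can be repeated."""
    return random.Random(seed).sample(items, n)

# Interval k = 10 matches the cycle, so each start sees only one kind of plot.
systematic_means = [
    sum(s) / len(s)
    for s in (systematic_sample(values, 10, start) for start in range(10))
]

srs = simple_random_sample(values, 12, seed=2026)
srs_mean = sum(srs) / len(srs)

print("true mean:", true_mean)
print("systematic means by start:", systematic_means)
print("one simple random sample mean:", srs_mean)
```

In this toy frame nine of the ten possible systematic samples contain only low plots and the tenth contains only high plots, so no single systematic sample lands near the true mean of 1.8. A simple random sample of the same size can mix both kinds of plot.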
How to do simple random sampling: step by step
To select a simple random sample, build a complete list, number it, generate random numbers, and pick the matching units without repeats. Five steps cover it.
1. Define the population. Write down exactly who or what is eligible. A vague population gives a vague result.
2. Build the sampling frame. Make a complete list of every unit. This list is the part people skip, and it is where most studies go wrong.
3. Number every unit. Assign 1 to N, with no gaps and no duplicates.
4. Decide the sample size n. Use a power or precision calculation, not a guess; standard sample-size formulas or your stats software handle this in seconds.
5. Draw the numbers at random. Use a random number generator, a random number table, or the tool further down this page. Discard any number outside 1 to N and any repeat, until you have n unique units.
That is the entire simple random sampling method. The discipline is in steps 1 and 2; the randomness in step 5 is the easy part once a computer does it.
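The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the browser tool: the function name, the plot labels, and the seed are assumptions, and the draw uses Python's standard `random` module with the same "discard repeats" rule described in the last step.

```python
import random

def draw_simple_random_sample(frame, n, seed=None):
    """Number the frame 1..N, then draw IDs at random until n unique ones remain."""
    if len(set(frame)) != len(frame):
        raise ValueError("sampling frame contains duplicates")
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")

    # Number every unit from 1 to N, no gaps and no duplicates.
    numbered = {i + 1: unit for i, unit in enumerate(frame)}
    rng = random.Random(seed)

    chosen = []
    while len(chosen) < n:
        candidate = rng.randint(1, N)   # inclusive at both ends
        if candidate not in chosen:     # discard any repeat
            chosen.append(candidate)
    return [numbered[i] for i in sorted(chosen)]

# Hypothetical frame of 20 plots.
frame = [f"plot-{i:02d}" for i in range(1, 21)]
sample = draw_simple_random_sample(frame, n=5, seed=2026)
print(sample)
```

In day-to-day work `random.sample(frame, n)` does the same job in one call. The explicit loop is here only to mirror the by-hand procedure.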
The formula and method, in plain English
The core simple random sampling formula is the selection probability: every unit is chosen with probability n / N. From that single fact, the rest follows.
Selection probability and sampling fraction
P(unit is selected) = f = n / N
Where: n = sample size, N = population size, f = the sampling fraction (the share of the population you measured). Inclusion probability for a simple random sample without replacement is the same n / N for every unit.
The number of distinct samples you could possibly draw without replacement is the binomial coefficient C(N, n) = N! / (n!(N − n)!), and a true simple random sample gives each of them an equal chance.
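For a tiny population you can check both facts by brute force. The sketch below lists every possible sample for N = 6 and n = 3 (sizes chosen only to keep the list short) and counts how often each unit appears. If every sample is equally likely, a unit's inclusion probability is the share of samples that contain it.

```python
from itertools import combinations
from math import comb

N, n = 6, 3
units = list(range(1, N + 1))

# Every distinct sample of size n, drawn without replacement.
all_samples = list(combinations(units, n))
n_samples = len(all_samples)

# Share of samples containing each unit = that unit's inclusion probability.
inclusion = {
    u: sum(u in s for s in all_samples) / n_samples for u in units
}

print("number of samples:", n_samples, "(formula gives", comb(N, n), ")")
print("inclusion probabilities:", inclusion)
print("samples possible for N = 20, n = 5:", comb(20, 5))
```

Every unit appears in half of the 20 samples, which matches n / N = 3 / 6. The count grows fast: even the 20-plot example later in this guide has 15,504 possible samples.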
Estimating a mean, and the finite population correction
The sample mean is an unbiased estimate of the population mean. Its standard error shrinks as n grows, and shrinks further when your sample is a large slice of a small population. That second effect is the finite population correction (FPC), and it is the piece most quick guides leave out.
SE(x̄) = (s / √n) × √(1 − n/N)
Where: x̄ = sample mean, s = sample standard deviation, n = sample size, N = population size, and the term √(1 − n/N) is the finite population correction. Drop the FPC when the sampling fraction is below about 5%, since it barely moves the answer.
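To see why 5% is a sensible cut-off, it helps to print the correction factor for a few sampling fractions. This is a small sketch with hypothetical helper names (`fpc`, `standard_error`) and an arbitrary N of 1,000.

```python
from math import sqrt

def fpc(n, N):
    """Finite population correction factor, sqrt(1 - n/N)."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return sqrt(1 - n / N)

def standard_error(s, n, N):
    """Standard error of the sample mean with the FPC applied."""
    return (s / sqrt(n)) * fpc(n, N)

N = 1000  # arbitrary population size for illustration
for n in (10, 50, 200, 500):
    factor = fpc(n, N)
    shrink = 100 * (1 - factor)
    print(f"f = {n / N:.2f}  FPC = {factor:.3f}  SE shrinks by {shrink:.1f}%")
```

At a 5% sampling fraction the factor is about 0.975, so the standard error moves by roughly 2.5%. At 20% it is about 0.894, which is a difference worth reporting.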
Working out the sample size
For estimating a proportion, the classic Cochran-style formula gives a starting size, which you then shrink for a finite population.
n₀ = z² × p × (1 − p) / e²
n = n₀ / (1 + (n₀ − 1) / N)
Where: z = 1.96 for 95% confidence, p = expected proportion (use 0.5 for the safest, largest estimate), e = margin of error, N = population size. For p = 0.5, e = 0.05, you get n₀ ≈ 385 (384.16 before rounding up). For a population of N = 2,000 that drops to about 322. Good practical guidelines for effective sample size determination and a simplified guide to choosing a sampling technique and sample size both walk through this trade-off in more depth.
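The arithmetic is quick to check in code. The sketch below assumes the standard Cochran-style finite population adjustment, n₀ / (1 + (n₀ − 1) / N); the function names are hypothetical.

```python
from math import ceil

def cochran_n0(z=1.96, p=0.5, e=0.05):
    """Starting sample size for a proportion, ignoring population size."""
    return z ** 2 * p * (1 - p) / e ** 2

def adjust_for_finite_population(n0, N):
    """Shrink n0 for a population of size N (assumed Cochran adjustment)."""
    return n0 / (1 + (n0 - 1) / N)

n0 = cochran_n0()
n_adjusted = adjust_for_finite_population(n0, N=2000)

print(f"n0 = {n0:.2f}, rounded up to {ceil(n0)}")
print(f"adjusted for N = 2000: {n_adjusted:.1f}")
```

The adjusted figure comes out a little above 322. If you prefer to round up to stay on the safe side of your margin of error, that becomes 323.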
Worked examples you can follow by hand
Two examples, both fully solved. The first draws a sample; the second uses one to estimate a number with a confidence interval.
Example 1: a simple random sampling example with a random number table
I have 20 vegetation plots numbered 01 to 20 and want a sample of n = 5. The sampling fraction is f = 5 / 20 = 0.25, so every plot has a 25% chance of selection.
Reading two-digit numbers off a random number stream, I get: 14, 02, 88, 14, 09, 20, 88, 11. I keep numbers from 01 to 20, skip anything larger, and skip any repeat.
- 14 → keep
- 02 → keep
- 88 → skip (above 20)
- 14 → skip (already chosen)
- 09 → keep
- 20 → keep
- 88 → skip (above 20)
- 11 → keep
Selected plots: 02, 09, 11, 14, 20. Five unique units, each with inclusion probability 0.25. That is a complete simple random sample, drawn without replacement.
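The same keep-or-skip rule can be replayed in code. The sketch below feeds in the exact number stream from this example and applies the two filters, out of range and already chosen.

```python
stream = [14, 2, 88, 14, 9, 20, 88, 11]  # the two-digit numbers read above
N, n = 20, 5

selected = []
for number in stream:
    if len(selected) == n:
        break                      # stop once n units are in hand
    if not 1 <= number <= N:
        continue                   # skip anything outside 01 to 20
    if number in selected:
        continue                   # skip repeats (without replacement)
    selected.append(number)

print(sorted(selected))            # [2, 9, 11, 14, 20]
print("inclusion probability:", n / N)
```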
Example 2: estimating a mean with the finite population correction
A clinic has N = 1,200 patients on file. I draw a simple random sample of n = 60 and record waiting time. The sample mean is x̄ = 38 minutes with a standard deviation of s = 12 minutes.
The sampling fraction is 60 / 1,200 = 0.05, right at the edge where the FPC matters, so I will keep it.
Standard error: SE = (12 / √60) × √(1 − 0.05) = 1.549 × 0.9747 = 1.51 minutes.
95% confidence interval: 38 ± (1.96 × 1.51) = 38 ± 2.96 = (35.0, 41.0) minutes.
So from 60 patients I can say, with 95% confidence, that the average wait across all 1,200 sits between roughly 35 and 41 minutes. You can reproduce an interval like this with the standard t or z formula in any statistics package.
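Here is the same calculation as a short script, using only the figures given in this example. It follows the z = 1.96 interval used above; a t-based interval for n = 60 would be very slightly wider.

```python
from math import sqrt

N, n = 1200, 60          # population and sample size
mean, s = 38.0, 12.0     # sample mean and standard deviation, in minutes
z = 1.96                 # 95% confidence, normal approximation

se = (s / sqrt(n)) * sqrt(1 - n / N)   # standard error with the FPC
margin = z * se
low, high = mean - margin, mean + margin

print(f"SE = {se:.2f} minutes")
print(f"95% CI = ({low:.1f}, {high:.1f}) minutes")
```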
Planning tools and data collection
Drawing the sample by hand is fine for a teaching example and painful for real work. In practice I generate the draw in R with sample() or in Python with random.sample(), both of which do simple random sampling without replacement by default. The tool below does the same thing in your browser, using a seeded pseudo-random number generator in the same family as the widely used Mersenne Twister algorithm, so the draw is reproducible.
Free simple random sample calculator
How to use this tool: set your sample size, paste your population (or type a single number to use IDs 1…N), then press Draw random sample. Nothing runs until you click.
Reproducibility matters more than people expect. By recording the seed, you (or a reviewer) can regenerate the exact same sample months later. That single habit has saved me from more than one awkward email.
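A seeded draw looks like this in Python. This is a sketch rather than the code behind the browser tool, so the same seed will not necessarily return the same units as the tool does; the point is only that one seed always gives one sample. The helper name `seeded_draw` is made up.

```python
import random

def seeded_draw(population, n, seed):
    """Draw n units without replacement from a private, seeded generator."""
    return random.Random(seed).sample(population, n)

population = list(range(1, 41))          # IDs 1..40

first = seeded_draw(population, 8, seed=2026)
second = seeded_draw(population, 8, seed=2026)
other = seeded_draw(population, 8, seed=2027)

print("seed 2026:      ", sorted(first))
print("seed 2026 again:", sorted(second))
print("seed 2027:      ", sorted(other))
```

Using a separate `random.Random(seed)` object, rather than seeding the global generator, keeps the draw from being disturbed by other code that happens to use random numbers.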
Sample results: what the output looks like
Run the tool with the defaults (n = 8, seed = 2026, the 40-unit population) and you get a small reproducible draw. A typical result looks like this:
| Quantity | Value |
|---|---|
| Population size (N) | 40 |
| Sample size (n) | 8 |
| Sampling fraction (f = n/N) | 0.20 (20%) |
| Inclusion probability per unit | 0.20 |
| Draw method | Without replacement |
| Selected units (seed 2026) | regenerated identically every run |
The key point: the same seed always returns the same eight units, while a blank seed gives you a fresh draw each time. Twenty percent is a heavy sampling fraction, so in a real study the FPC would meaningfully tighten your standard errors.
How to report simple random sampling in a paper
Reviewers want three things in your methods section: the population and frame, the sample size with its justification, and the exact draw mechanism including the seed. State them plainly. Here are templates you can adapt.
Methods section (APA-style)
[…] sample(), seed = 2026). Each patient therefore had a selection probability of 0.05."
Thesis or dissertation phrasing
Plain-language summary
Always report the seed or the random number source. It is the difference between a method a reader can reproduce and a claim they have to take on faith.
Common mistakes and how to avoid them
Most failures in simple random sampling happen before any number is drawn. Watch for these six.
- An incomplete sampling frame. If units are missing from your list, they have zero chance of selection, and your "random" sample is quietly biased. Fix the frame first — design-based inference frameworks for ecological data spell out exactly why this assumption matters.
- Confusing random with haphazard. Picking units that feel arbitrary is not random. Only a generator or a random number table delivers true randomness.
- Sampling with replacement by accident. Real studies almost always want without replacement. Check the default in whatever tool you use.
- Ignoring the finite population correction. When the sampling fraction tops 5%, leaving out the FPC overstates your uncertainty. Newer finite population variance estimators continue to refine the classic FPC for special cases.
- Ignoring non-response. If part of your sample doesn't respond, a textbook simple random sample can quietly turn into a biased one. Adjusted estimators exist for exactly this case, but only if you flag the problem first.
- Not recording the seed. Without it, you cannot reproduce the draw, and reproducibility is the whole point of a probability method. In wildlife surveys, where designs are often updated from one survey to the next, a logged seed is what lets the next field season pick up where the last one left off.
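The with-replacement slip is the easiest of these to make in code, because similar-looking functions behave differently. In Python's standard library `random.sample` never repeats a unit while `random.choices` can, and NumPy's `choice` defaults to `replace=True`. A cheap guard is to check every draw for duplicates before using it; the helper name below is hypothetical.

```python
import random

def has_duplicates(draw):
    """True if any unit appears more than once in the draw."""
    return len(set(draw)) < len(draw)

rng = random.Random(2026)
population = list(range(1, 11))

with_replacement = rng.choices(population, k=8)   # units may repeat
without_replacement = rng.sample(population, 8)   # units never repeat

print("choices():", with_replacement, "duplicates:", has_duplicates(with_replacement))
print("sample(): ", without_replacement, "duplicates:", has_duplicates(without_replacement))
```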
Conclusion: why this method still earns its place
Simple random sampling is old, plain, and quietly powerful. It is the method that makes "we can generalise to the population" an honest statement rather than a hopeful one. Every fancier design (stratified, cluster, multistage) is built on top of the same equal-chance logic you have just seen.
The advantages and disadvantages of simple random sampling come down to a trade. You gain freedom from selection bias and a clean path to confidence intervals and tests. You give up some efficiency when populations are spread out or when rare subgroups need protecting. For most well-listed populations, that trade is a clear win.
Here is what I would hold onto:
- The selection probability n / N is the engine of the whole method, so know your N and your n exactly.
- Spend your effort on a complete sampling frame; the random draw itself is trivial once a tool does it.
- Use a seed, report it, and apply the finite population correction whenever your sampling fraction climbs above 5%.
- If subgroups or geography dominate your problem, reach for stratified or cluster sampling instead.
Draw a practice sample in the tool above, then summarise your real data with a descriptive tool like the median calculator, the mode calculator, or the mean absolute deviation calculator. Once you have a feel for the method, the same equal-chance logic carries straight over to stratified, cluster, and multistage designs.
Frequently asked questions
Q1. What is simple random sampling in simple terms?
It is picking a sample so that every member of the population has the same chance of being chosen, like drawing names from a hat. Because nothing about a unit makes it more or less likely to be picked, the sample fairly represents the whole population and you can generalise your results to it.
Q2. How do you select a simple random sample step by step?
Step 1: list every unit in the population. Step 2: number them 1 to N. Step 3: choose your sample size n. Step 4: use a random number generator or table to pick n unique numbers, skipping anything out of range or already chosen. Step 5: match the numbers to units. The browser tool on this page does steps 2 to 4 for you.
Q3. What is the formula for simple random sampling?
The core formula is the selection probability, n / N, where n is the sample size and N is the population size. The sampling fraction f equals n / N as well. For estimating a mean, the standard error is (s / √n) × √(1 − n/N), where the last term is the finite population correction.
Q4. What is the difference between simple random sampling with and without replacement?
Without replacement, a unit can be chosen only once, which is the standard for real studies. With replacement, the same unit can be selected more than once, since each draw is independent. Without replacement gives a smaller standard error because of the finite population correction; with replacement removes that correction.
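You can confirm the difference exactly on a toy population by listing every possible sample under each scheme. The six values below are invented for illustration. Note that when you work from the true population variance σ², the correction takes the form (N − n) / (N − 1); the √(1 − n/N) version used elsewhere in this guide is the matching form for the sample standard deviation s.

```python
from itertools import combinations, product
from statistics import pvariance

population = [2, 4, 6, 8, 10, 12]   # made-up values
N, n = len(population), 3

# Without replacement: every unordered set of n distinct units.
means_wor = [sum(s) / n for s in combinations(population, n)]
# With replacement: every ordered sequence of n independent draws.
means_wr = [sum(s) / n for s in product(population, repeat=n)]

var_wor = pvariance(means_wor)      # exact variance of the sample mean
var_wr = pvariance(means_wr)
sigma2 = pvariance(population)

print("with replacement:   ", var_wr, "theory:", sigma2 / n)
print("without replacement:", var_wor, "theory:", sigma2 / n * (N - n) / (N - 1))
```

The without-replacement variance is smaller by exactly the factor (N − n) / (N − 1), which is 0.6 for this toy case.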
Q5. What is the difference between simple random and systematic sampling?
Simple random sampling draws each unit independently using random numbers, so there is no pattern. Systematic sampling picks every k-th unit after a random start, which is quicker to administer but can be biased if the list contains a repeating cycle that lines up with the interval k. With a well-shuffled list, the two give similar results.
Q6. When should you not use simple random sampling?
Avoid it when you lack a complete list of the population, when units are spread across a large area and travel is costly, or when an important subgroup is rare and a random draw might miss it. In those cases stratified, cluster, or multistage sampling usually performs better.
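The rare-subgroup risk can be put into numbers. If K of the N units belong to the subgroup, the chance that a simple random sample of size n contains none of them is C(N − K, n) / C(N, n). The figures below (200 units, 5 of them rare, a sample of 20) are hypothetical.

```python
from math import comb

def prob_subgroup_missed(N, K, n):
    """Chance that a simple random sample of size n contains none of the K rare units."""
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    return comb(N - K, n) / comb(N, n)

p_miss = prob_subgroup_missed(N=200, K=5, n=20)
print(f"chance of missing the subgroup entirely: {p_miss:.3f}")
```

With these numbers the subgroup is missed in well over half of all possible draws, which is the situation stratified sampling is designed to prevent.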
Q7. What sample size do I need for a simple random sample?
It depends on your margin of error, confidence level, and population size. For a proportion at 95% confidence and a 5% margin, a Cochran-style formula gives about 385, which shrinks for smaller populations. A 2,000-person population needs roughly 322. Use a sample size calculator to plug in your own numbers.
Q8. How do I do simple random sampling in Excel or R?
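In Excel: put your list in a column, add a helper column with =RAND(), sort by it, and take the top n rows. RAND() recalculates whenever the sheet changes, so paste the helper column as values if you need a stable record of the draw. In R: use sample(population, n) for a draw without replacement. In Python: use random.sample(population, n). Set a seed first (set.seed() in R, random.seed() in Python) so the draw is reproducible.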
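The Excel recipe of attaching a random key, sorting, and taking the top n rows translates directly to Python, which can help if you want to see what the spreadsheet is doing. This is a sketch with a hypothetical function name and made-up respondent IDs.

```python
import random

def sample_by_random_sort(population, n, seed=None):
    """Give each unit a random key, sort by the key, and keep the first n units."""
    if not 0 <= n <= len(population):
        raise ValueError("need 0 <= n <= len(population)")
    rng = random.Random(seed)
    keyed = [(rng.random(), unit) for unit in population]  # like a =RAND() column
    keyed.sort(key=lambda pair: pair[0])                   # like sorting by that column
    return [unit for _, unit in keyed[:n]]

respondents = [f"R{i:03d}" for i in range(1, 51)]   # made-up IDs
picked = sample_by_random_sort(respondents, n=10, seed=2026)
print(picked)
```

Because every ordering of the keys is equally likely, the first n rows form a simple random sample without replacement.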
Q9. What are the advantages and disadvantages of simple random sampling?
Advantages: it removes selection bias, is easy to explain, and produces unbiased estimates with clean confidence intervals and tests. Disadvantages: it needs a complete list of the population, can scatter field sites across a wide area, and may under-represent small subgroups. When those drawbacks bite, stratified or cluster sampling helps.
Q10. Is simple random sampling still used in research today?
Yes. It remains the default probability method in surveys, clinical research, ecology, and quality control, and it is the foundation for the maths behind most statistical tests. Larger national surveys often combine it with stratification or clustering, but the equal-chance principle of simple random sampling sits at the centre of those designs.
Q11. What is the difference between simple random sampling and stratified sampling?
Simple random sampling draws straight from the whole population with no groups. Stratified sampling first splits the population into groups (strata) that share a key trait, such as age band or habitat type, then draws a simple random sample within each group. Stratified sampling usually gives more precise estimates when those groups differ a lot from each other, at the cost of needing more information about the population upfront.
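A stratified draw is just several simple random samples run side by side. The sketch below assumes proportional allocation, where each stratum gets a share of n equal to its share of the population. The habitat names, sizes, and function name are invented for illustration.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Proportional allocation, then a simple random sample within each stratum.

    `strata` maps a stratum name to its list of units. Rounding the
    allocations can make the total differ slightly from n.
    """
    rng = random.Random(seed)
    N = sum(len(units) for units in strata.values())
    result = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / N)       # this stratum's share of n
        result[name] = rng.sample(units, n_h)
    return result

# Hypothetical frame: 100 plots across three habitat types.
strata = {
    "forest": [f"F{i}" for i in range(60)],
    "grassland": [f"G{i}" for i in range(30)],
    "wetland": [f"W{i}" for i in range(10)],
}
picked = stratified_sample(strata, n=20, seed=2026)
print({name: len(units) for name, units in picked.items()})
```

Here the wetland stratum is guaranteed two plots, whereas a single simple random sample of 20 from all 100 plots could easily contain none.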
Q12. How is simple random sampling used in research methodology?
In a methods section, simple random sampling is the step that turns a list of potential participants, sites, or records (the sampling frame) into the actual sample you study. Researchers describe the frame, the sample size and how it was justified, and the exact draw method, including the seed if a computer generator was used. This lets reviewers and other researchers judge how well the sample represents the population and reproduce the selection if needed.