Simple Random Sampling: Formula, Example & Free Tool

Simple random sampling

The fairest way to pick a sample, explained with a worked example, the formulas, and a free draw tool you can use right now.

Quick answer

Simple random sampling is a method where every member of a population has an equal, independent chance of being chosen, and every possible sample of size n is equally likely. You assign each unit a number, then use a random number generator or table to pick the sample. It removes selection bias and lets you generalise to the whole population.

  • Each unit's chance of selection equals n / N (sample size divided by population size).
  • Best when you have a complete list of the population (a sampling frame).
  • Almost always done without replacement so no unit is picked twice.

If you have ever drawn names from a hat, you already understand the core idea. Every name has the same shot at being pulled, and what comes out is a fair slice of what went in. That is the whole promise of simple random sampling, and it is why it sits underneath almost every other sampling method you will meet in a methods course or a journal article.

I learned to respect it the hard way. On my first field season studying sloth bear sign, I picked survey plots that "looked promising" and ended up with a dataset that flattered my hypothesis and convinced no reviewer. The fix was boring and powerful: number every plot, let a random draw choose them, and stop arguing with my own bias. This guide walks through what the method is, when to use it, the exact formulas, two fully worked examples, and a small tool that draws a reproducible sample for you.

What is simple random sampling?

Simple random sampling is a probability sampling technique in which every unit in a population has an equal and independent chance of being selected, so each possible sample of a given size is equally likely. It is the baseline against which other designs are measured.

Two conditions have to hold for a draw to count as a true simple random sample. First, every unit must have the same selection probability. Second, the selections must not depend on one another beyond the no-repeat rule. When both hold, your sample is, on average, a miniature of the population, and the maths of confidence intervals and significance tests applies cleanly. The unbiased estimators behind that maths trace back to classic work on selection without replacement from a finite universe, and the same logic underpins almost every survey design used today.

The opposite of this is convenience sampling, where you grab whoever or whatever is easiest. That is fast and almost always biased. The simple random sampling definition in statistics is built precisely to rule that bias out.

[Figure: a population frame of 16 numbered units, a random draw, and the resulting sample (units 3, 9, 14, 16). Each unit is chosen with probability n / N.]
Every numbered unit in the population frame has the same chance of being drawn into the sample.

When to use simple random sampling (and when not to)

Use simple random sampling when you have a complete list of your population and no strong reason to treat any subgroup differently. It is the right default for homogeneous populations and small to medium studies.

It starts to creak in three situations. If the population is spread over a huge area, a random draw scatters your sites and travel eats your budget. If an important subgroup is rare, a pure random draw might miss it entirely. And if there is no list to draw from, you cannot do it at all. Knowing when to use simple random sampling means knowing these limits. Here is how it compares to the common alternatives.

Method | How it works | Best when | Main weakness
Simple random | Number every unit, draw at random | You have a full list and a fairly uniform population | Can miss small subgroups; scatters field sites
Systematic | Pick every k-th unit after a random start | Units are in a list or a line (transects, files) | Bias if the list has a hidden cycle
Stratified | Split into strata, sample within each | Clear subgroups you want represented | Needs strata info up front
Cluster | Randomly pick whole groups, then sample inside | Population is geographically clumped | Less precise per unit sampled
Convenience | Take whatever is easiest | Pilots and quick checks only | Biased; cannot generalise

When you compare simple random sampling vs systematic sampling, the deciding question is usually whether your list has a repeating pattern. If it does, systematic sampling can lock onto that cycle and mislead you. When subgroups matter, jump to stratified sampling instead. National statistics offices document these designs in their methodology guides, including Statistics Canada, the Australian Bureau of Statistics, and the UK's Office for National Statistics.

[Figure: four ways to choose a sample (green = selected). Simple random: picks scattered across the whole list. Systematic: every k-th unit after a random start. Stratified: sample within each subgroup. Cluster: choose whole groups at random.]
Simple random sampling spreads picks across the entire list; the other designs add structure for efficiency or coverage.
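
To make the hidden-cycle risk concrete, here is a minimal sketch. The data are invented for illustration: a hypothetical list of 70 values where every 7th one is a spike, sampled systematically with an interval k = 7 that happens to match the cycle.

```python
# Hypothetical list with a hidden cycle: every 7th value is a spike of 100,
# everything else is 10. These numbers are made up for illustration.
values = [100 if i % 7 == 0 else 10 for i in range(70)]
true_mean = sum(values) / len(values)

k = 7  # systematic interval that happens to match the cycle length

# Mean of the systematic sample for every possible random start 0..k-1.
means_by_start = {}
for start in range(k):
    picked = values[start::k]          # every k-th unit after the start
    means_by_start[start] = sum(picked) / len(picked)

print(round(true_mean, 2))   # 22.86
print(means_by_start)        # start 0 gives 100.0, every other start gives 10.0
```

No starting point lands anywhere near the true mean, because the interval locks onto the cycle. A simple random draw has no fixed interval, so it cannot line up with a pattern like this.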

How to do simple random sampling: step by step

To select a simple random sample, build a complete list, number it, generate random numbers, and pick the matching units without repeats. Five steps cover it.

  1. Define the population. Write down exactly who or what is eligible. A vague population gives a vague result.
  2. Build the sampling frame. Make a complete list of every unit. This list is the part people skip, and it is where most studies go wrong.
  3. Number every unit. Assign 1 to N, with no gaps and no duplicates.
  4. Decide the sample size n. Use a power or precision calculation, not a guess — standard sample-size formulas or your stats software handle this in seconds.
  5. Draw the numbers at random. Use a random number generator, a random number table, or the tool further down this page. Discard any number outside 1 to N and any repeat, until you have n unique units.

[Figure: from population to sample in five steps: (1) define the population, (2) build the sampling frame, (3) number units 1 to N, (4) set sample size n, (5) draw at random.]
The hard work is steps 1 and 2 (a complete list); the random draw in step 5 is the easy part once a tool does it.

That is the entire simple random sampling method. The discipline is in steps 1 and 2; the randomness in step 5 is the easy part once a computer does it.
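
As a rough sketch of steps 3 to 5 in Python: the function name `draw_simple_random_sample`, the plot labels, and the seed value are all assumptions made for this illustration, not part of any particular package. It checks the frame for duplicates, checks that n fits inside N, and then draws without replacement.

```python
import random

def draw_simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement."""
    units = list(frame)
    # Step 3 sanity check: a frame with duplicates gives some units extra chances.
    if len(set(units)) != len(units):
        raise ValueError("sampling frame contains duplicate units")
    # Step 4 sanity check: the sample cannot be larger than the population.
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and N")
    rng = random.Random(seed)      # private generator so the seed is explicit
    return rng.sample(units, n)    # step 5: draw without replacement

# Hypothetical frame of 20 plots labelled plot-01 ... plot-20.
frame = [f"plot-{i:02d}" for i in range(1, 21)]
sample = draw_simple_random_sample(frame, n=5, seed=2026)
print(sample)
```

Steps 1 and 2 stay outside the code on purpose: no function can tell you whether the list you passed in is actually complete.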

The formula and method, in plain English

The core simple random sampling formula is the selection probability: every unit is chosen with probability n / N. From that single fact, the rest follows.

Selection probability and sampling fraction

P(unit i is selected) = n / N     and     f = n / N

Where: n = sample size, N = population size, f = the sampling fraction (the share of the population you measured). Inclusion probability for a simple random sample without replacement is the same n / N for every unit.

The number of distinct samples you could possibly draw without replacement is the binomial coefficient C(N, n) = N! / (n!(N − n)!), and a true simple random sample gives each of them an equal chance.
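
A quick way to check both facts is to count samples directly. This sketch uses N = 20 and n = 5 as example values: it counts all possible samples, counts the ones that contain one fixed unit, and confirms that the ratio is n / N.

```python
from fractions import Fraction
from math import comb

N, n = 20, 5  # example population and sample size

total_samples = comb(N, n)               # all distinct samples of size n
samples_with_unit = comb(N - 1, n - 1)   # samples that include one fixed unit

# Each sample is equally likely, so the inclusion probability is a simple ratio.
inclusion_probability = Fraction(samples_with_unit, total_samples)

print(total_samples)           # 15504
print(inclusion_probability)   # 1/4, which is n / N
```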

[Figure: sampling from units 1 to 6. Without replacement (the standard for real studies), drawn units are removed from the pool, so each unit can be chosen once (example sample: 3, 5). With replacement, units return to the pool after each draw and nothing is removed, so a unit can be chosen again (example sample: 3, 5, 3).]
Real studies almost always sample without replacement, which is also what earns the finite population correction below.

Estimating a mean, and the finite population correction

The sample mean is an unbiased estimate of the population mean. Its standard error shrinks as n grows, and shrinks further when your sample is a large slice of a small population. That second effect is the finite population correction (FPC), and it is the piece most quick guides leave out.

x̄ = (1/n) Σ xᵢ      SE(x̄) = (s / √n) × √(1 − n/N)

Where: x̄ = sample mean, s = sample standard deviation, and the term √(1 − n/N) is the finite population correction. Drop the FPC when the sampling fraction is below about 5%, since it barely moves the answer.

[Figure: finite population correction = √(1 − n/N). Large population, small sample: n = 60 of N = 10,000, f = 0.6%, FPC ≈ 1.00; safe to drop, it barely changes the SE. Small population, large sample: n = 60 of N = 120, f = 50%, FPC = 0.71; keep it, it cuts the SE by about 29%.]
Rule of thumb: apply the finite population correction once the sampling fraction f = n/N climbs above roughly 5%.
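
The two cases above are easy to verify. This is a small sketch with helper names of my own choosing (`fpc`, `standard_error_mean`); it applies the formula SE = (s / √n) × √(1 − n/N) and skips the correction when no population size is given.

```python
from math import sqrt

def fpc(n, N):
    """Finite population correction for sampling without replacement."""
    return sqrt(1 - n / N)

def standard_error_mean(s, n, N=None):
    """Standard error of the sample mean; applies the FPC when N is given."""
    se = s / sqrt(n)
    if N is not None:
        se *= fpc(n, N)
    return se

print(round(fpc(60, 10_000), 3))   # 0.997 -> safe to drop
print(round(fpc(60, 120), 3))      # 0.707 -> cuts the SE by about 29%
```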

Working out the sample size

For estimating a proportion, the classic Cochran-style formula gives a starting size, which you then shrink for a finite population.

n₀ = z² · p(1 − p) / e²      n = n₀ / (1 + (n₀ − 1)/N)

Where: z = 1.96 for 95% confidence, p = expected proportion (use 0.5 for the safest, largest estimate), e = margin of error. For p = 0.5, e = 0.05, you get n₀ ≈ 385. For a population of N = 2,000 that drops to about 322. Good practical guidelines for effective sample size determination and a simplified guide to choosing a sampling technique and sample size both walk through this trade-off in more depth.
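
Here is a minimal sketch of the two formulas, using the same inputs as the text (z = 1.96, p = 0.5, e = 0.05, N = 2,000). The function names are my own. The unrounded adjusted size comes out near 322.4, which is where "about 322" comes from; if you prefer to round sample sizes up, you would report 323.

```python
from math import ceil

def cochran_n0(z, p, e):
    """Starting sample size for a proportion, ignoring population size."""
    return z**2 * p * (1 - p) / e**2

def finite_population_adjust(n0, N):
    """Shrink the starting size n0 for a finite population of size N."""
    return n0 / (1 + (n0 - 1) / N)

n0 = cochran_n0(z=1.96, p=0.5, e=0.05)
n_adj = finite_population_adjust(n0, N=2000)

print(round(n0, 2), ceil(n0))   # 384.16 385
print(round(n_adj, 1))          # 322.4
```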

Worked examples you can follow by hand

Two examples, both fully solved. The first draws a sample; the second uses one to estimate a number with a confidence interval.

Example 1: a simple random sampling example with a random number table

I have 20 vegetation plots numbered 01 to 20 and want a sample of n = 5. The sampling fraction is f = 5 / 20 = 0.25, so every plot has a 25% chance of selection.

Reading two-digit numbers off a random number stream, I get: 14, 02, 88, 14, 09, 20, 88, 11. I keep numbers from 01 to 20, skip anything larger, and skip any repeat.

  • 14 → keep
  • 02 → keep
  • 88 → skip (above 20)
  • 14 → skip (already chosen)
  • 09 → keep
  • 20 → keep
  • 88 → skip (above 20)
  • 11 → keep

Selected plots: 02, 09, 11, 14, 20. Five unique units, each with inclusion probability 0.25. That is a complete simple random sample, drawn without replacement.

[Figure: population 01–20, pick 5; keep in-range values, skip out-of-range values and repeats. Stream: 14 keep, 02 keep, 88 skip (>20), 14 skip (repeat), 09 keep, 20 keep, 88 skip (>20), 11 keep. Sample: 02, 09, 11, 14, 20.]
The same logic the tool below automates: walk the random stream, drop anything out of range or already chosen, stop at five.
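
The keep-or-skip walk from Example 1 can be written in a few lines. This is a sketch of that logic only, with a function name I made up (`select_from_stream`); it takes the stream of numbers as given rather than generating them.

```python
def select_from_stream(stream, N, n):
    """Walk a stream of numbers, keeping in-range, not-yet-chosen values."""
    chosen = []
    for value in stream:
        in_range = 1 <= value <= N
        if in_range and value not in chosen:
            chosen.append(value)
        if len(chosen) == n:   # stop as soon as n unique units are chosen
            break
    return chosen

stream = [14, 2, 88, 14, 9, 20, 88, 11]   # the stream from Example 1
picked = select_from_stream(stream, N=20, n=5)

print(picked)           # [14, 2, 9, 20, 11]
print(sorted(picked))   # [2, 9, 11, 14, 20]
```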

Example 2: estimating a mean with the finite population correction

A clinic has N = 1,200 patients on file. I draw a simple random sample of n = 60 and record waiting time. The sample mean is x̄ = 38 minutes with a standard deviation of s = 12 minutes.

The sampling fraction is 60 / 1,200 = 0.05, right at the edge where the FPC matters, so I will keep it.

Standard error: SE = (12 / √60) × √(1 − 0.05) = 1.549 × 0.9747 = 1.51 minutes.

95% confidence interval: 38 ± (1.96 × 1.51) = 38 ± 2.96 = (35.0, 41.0) minutes.

So from 60 patients I can say, with 95% confidence, that the average wait across all 1,200 sits between roughly 35 and 41 minutes. You can reproduce an interval like this with the standard t or z formula in any statistics package.
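
The arithmetic in Example 2 can be checked with a short script. This sketch uses the z value of 1.96 from the example; with n = 60 a t multiplier would be slightly larger, so treat it as a check of the worked numbers rather than a recommendation.

```python
from math import sqrt

N, n = 1200, 60        # population and sample size from Example 2
mean, s = 38.0, 12.0   # sample mean and standard deviation, in minutes
z = 1.96               # multiplier for 95% confidence

se = (s / sqrt(n)) * sqrt(1 - n / N)   # standard error with the FPC
margin = z * se
lower, upper = mean - margin, mean + margin

print(round(se, 2))                       # 1.51
print(round(margin, 2))                   # 2.96
print(round(lower, 1), round(upper, 1))   # 35.0 41.0
```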

Planning tools and data collection

Drawing the sample by hand is fine for a teaching example and painful for real work. In practice I generate the draw in R with sample() or in Python with random.sample(), both of which do simple random sampling without replacement by default. The tool below does the same thing in your browser, using a seeded pseudo-random number generator in the same family as the widely used Mersenne Twister algorithm, so the draw is reproducible.

Free simple random sample calculator

How to use this tool: set your sample size, paste your population (or type a single number to use IDs 1…N), then press Draw random sample. Nothing runs until you click.


Reproducibility matters more than people expect. By recording the seed, you (or a reviewer) can regenerate the exact same sample months later. That single habit has saved me from more than one awkward email.

Sample results: what the output looks like

Run the tool with the defaults (n = 8, seed = 2026, the 40-unit population) and you get a small reproducible draw. A typical result looks like this:

Quantity | Value
Population size (N) | 40
Sample size (n) | 8
Sampling fraction (f = n/N) | 0.20 (20%)
Inclusion probability per unit | 0.20
Draw method | Without replacement
Selected units (seed 2026) | regenerated identically every run

The key point: the same seed always returns the same eight units, while a blank seed gives you a fresh draw each time. Twenty percent is a heavy sampling fraction, so in a real study the FPC would meaningfully tighten your standard errors.
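
The same behaviour is easy to see in Python, whose `random` module uses the Mersenne Twister generator. This sketch reuses N = 40, n = 8 and seed 2026, but it is a separate generator from the browser tool, so do not expect it to return the same eight units as the tool does.

```python
import random

N, n, seed = 40, 8, 2026
population = list(range(1, N + 1))   # unit IDs 1..N

# Two generators started from the same seed produce the same draw.
first = random.Random(seed).sample(population, n)
second = random.Random(seed).sample(population, n)

# A generator with no seed is initialised from system entropy: a fresh draw.
fresh = random.Random().sample(population, n)

print(first == second)   # True
print(n / N)             # 0.2, the sampling fraction
print(sorted(first))
print(sorted(fresh))
```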

How to report simple random sampling in a paper

Reviewers want three things in your methods section: the population and frame, the sample size with its justification, and the exact draw mechanism including the seed. State them plainly. Here are templates you can adapt.

Methods section (APA-style)

"Participants were selected by simple random sampling from a sampling frame of N = 1,200 registered patients. A sample of n = 60 (sampling fraction 5%) was drawn without replacement using R 4.4 (sample(), seed = 2026). Each patient therefore had a selection probability of 0.05."

Thesis or dissertation phrasing

"To minimise selection bias, a simple random sample was drawn. Every unit in the sampling frame received a unique identifier from 1 to N, and n identifiers were generated using a seeded random number generator, ensuring the draw is fully reproducible."

Plain-language summary

"We picked our sample completely at random from a full list, so the people we studied fairly represent everyone on that list."

Always report the seed or the random number source. It is the difference between a method a reader can reproduce and a claim they have to take on faith.

Common mistakes and how to avoid them

Most failures in simple random sampling happen before any number is drawn. Watch for these six.

  • An incomplete sampling frame. If units are missing from your list, they have zero chance of selection, and your "random" sample is quietly biased. Fix the frame first — design-based inference frameworks for ecological data spell out exactly why this assumption matters.
  • Confusing random with haphazard. Picking units that feel arbitrary is not random. Only a documented chance mechanism, such as a random number generator or a random number table, gives every unit a known, equal selection probability.
  • Sampling with replacement by accident. Real studies almost always want without replacement. Check the default in whatever tool you use.
  • Ignoring the finite population correction. When the sampling fraction tops 5%, leaving out the FPC overstates your uncertainty. Newer finite population variance estimators continue to refine the classic FPC for special cases.
  • Ignoring non-response. If part of your sample doesn't respond, a textbook simple random sample can quietly turn into a biased one. Adjusted estimators exist for exactly this case, but only if you flag the problem first.
  • Not recording the seed. Without it, you cannot reproduce the draw, and reproducibility is the whole point of a probability method. In wildlife surveys, where designs are often updated from one survey to the next, a logged seed is what lets the next field season pick up where the last one left off.
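
On the replacement mistake in particular, it helps to know which call does which. In Python's standard library, `random.sample` draws without replacement and `random.choices` draws with replacement. The sketch below uses a made-up population of ten IDs and an arbitrary seed, and simply counts distinct units in each draw.

```python
import random

rng = random.Random(7)             # arbitrary seed for this illustration
population = list(range(1, 11))    # hypothetical unit IDs 1..10

without = rng.sample(population, k=10)      # without replacement
with_repl = rng.choices(population, k=10)   # with replacement

# Without replacement, every drawn unit is distinct.
print(len(set(without)) == len(without))    # True

# With replacement, repeats are allowed, so fewer distinct units is normal.
print(len(set(with_repl)), "distinct units out of", len(with_repl))
```

A check like `len(set(draw)) == len(draw)` is a cheap guard to run on any sample before fieldwork starts.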

Conclusion: why this method still earns its place

Simple random sampling is old, plain, and quietly powerful. It is the method that makes "we can generalise to the population" an honest statement rather than a hopeful one. Every fancier design (stratified, cluster, multistage) is built on top of the same equal-chance logic you have just seen.

The advantages and disadvantages of simple random sampling come down to a trade. You gain freedom from selection bias and a clean path to confidence intervals and tests. You give up some efficiency when populations are spread out or when rare subgroups need protecting. For most well-listed populations, that trade is a clear win.

Here is what I would hold onto:

  • The selection probability n / N is the engine of the whole method, so know your N and your n exactly.
  • Spend your effort on a complete sampling frame; the random draw itself is trivial once a tool does it.
  • Use a seed, report it, and apply the finite population correction whenever your sampling fraction climbs above 5%.
  • If subgroups or geography dominate your problem, reach for stratified or cluster sampling instead.

Draw a practice sample in the tool above, then summarise your real data with a descriptive tool like the median calculator, the mode calculator, or the mean absolute deviation calculator. Once you have a feel for the method, the same equal-chance logic carries straight over to stratified, cluster, and multistage designs.

Frequently asked questions

Q1. What is simple random sampling in simple terms?

It is picking a sample so that every member of the population has the same chance of being chosen, like drawing names from a hat. Because nothing about a unit makes it more or less likely to be picked, the sample fairly represents the whole population and you can generalise your results to it.

Q2. How do you select a simple random sample step by step?

Step 1: list every unit in the population. Step 2: number them 1 to N. Step 3: choose your sample size n. Step 4: use a random number generator or table to pick n unique numbers, skipping anything out of range or already chosen. Step 5: match the numbers to units. The browser tool on this page does steps 2 to 4 for you.

Q3. What is the formula for simple random sampling?

The core formula is the selection probability, n / N, where n is the sample size and N is the population size. The sampling fraction f equals n / N as well. For estimating a mean, the standard error is (s / √n) × √(1 − n/N), where the last term is the finite population correction.

Q4. What is the difference between simple random sampling with and without replacement?

Without replacement, a unit can be chosen only once, which is the standard for real studies. With replacement, the same unit can be selected more than once, since each draw is independent. Without replacement gives a smaller standard error because of the finite population correction; with replacement removes that correction.

Q5. What is the difference between simple random and systematic sampling?

Simple random sampling draws each unit independently using random numbers, so there is no pattern. Systematic sampling picks every k-th unit after a random start, which is quicker to administer but can be biased if the list contains a repeating cycle that lines up with the interval k. With a well-shuffled list, the two give similar results.

Q6. When should you not use simple random sampling?

Avoid it when you lack a complete list of the population, when units are spread across a large area and travel is costly, or when an important subgroup is rare and a random draw might miss it. In those cases stratified, cluster, or multistage sampling usually performs better.

Q7. What sample size do I need for a simple random sample?

It depends on your margin of error, confidence level, and population size. For a proportion at 95% confidence and a 5% margin, a Cochran-style formula gives about 385, which shrinks for smaller populations. A 2,000-person population needs roughly 322. Use a sample size calculator to plug in your own numbers.

Q8. How do I do simple random sampling in Excel or R?

In Excel: put your list in a column, add a helper column with =RAND(), sort by it, and take the top n rows. In R: use sample(population, n) for a draw without replacement. In Python: use random.sample(population, n). Set a seed first (set.seed() in R, random.seed() in Python) so the draw is reproducible.
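
The Excel recipe (random helper column, sort, take the top n) translates directly into code, which makes it easier to see why it works. This is a sketch with a hypothetical 20-unit frame and an arbitrary seed; in everyday use `random.sample` does the same job in one call.

```python
import random

rng = random.Random(2026)   # arbitrary seed, recorded for reproducibility
population = [f"unit-{i:02d}" for i in range(1, 21)]   # hypothetical frame
n = 5

# Pair every unit with a uniform random key, like a helper column of =RAND().
keyed = [(rng.random(), unit) for unit in population]

keyed.sort()                               # sort the rows by their random key
sample = [unit for _, unit in keyed[:n]]   # take the top n rows

print(sample)
```

Because every ordering of the keys is equally likely, the first n rows after sorting form a simple random sample without replacement.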

Q9. What are the advantages and disadvantages of simple random sampling?

Advantages: it removes selection bias, is easy to explain, and produces unbiased estimates with clean confidence intervals and tests. Disadvantages: it needs a complete list of the population, can scatter field sites across a wide area, and may under-represent small subgroups. When those drawbacks bite, stratified or cluster sampling helps.

Q10. Is simple random sampling still used in research today?

Yes. It remains the default probability method in surveys, clinical research, ecology, and quality control, and it is the foundation for the maths behind most statistical tests. Larger national surveys often combine it with stratification or clustering, but the equal-chance principle of simple random sampling sits at the centre of those designs.

Q11. What is the difference between simple random sampling and stratified sampling?

Simple random sampling draws straight from the whole population with no groups. Stratified sampling first splits the population into groups (strata) that share a key trait, such as age band or habitat type, then draws a simple random sample within each group. Stratified sampling usually gives more precise estimates when those groups differ a lot from each other, at the cost of needing more information about the population upfront.

Q12. How is simple random sampling used in research methodology?

In a methods section, simple random sampling is the step that turns a list of potential participants, sites, or records (the sampling frame) into the actual sample you study. Researchers describe the frame, the sample size and how it was justified, and the exact draw method, including the seed if a computer generator was used. This lets reviewers and other researchers judge how well the sample represents the population and reproduce the selection if needed.

References

The following peer-reviewed papers support the methods covered in this guide, spanning the theory of probability sampling, the finite population correction, sample size determination, and applications of simple random sampling in survey and ecological research.

[1] Horvitz, D. G., & Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47(260), 663–685. https://doi.org/10.1080/01621459.1952.10483446
[2] Yates, F., & Grundy, P. M. (1953). Selection without replacement from within strata with probability proportional to size. Journal of the Royal Statistical Society, Series B, 15(2), 253–261. https://doi.org/10.1111/j.2517-6161.1953.tb00140.x
[3] Matsumoto, M., & Nishimura, T. (1998). Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation, 8(1), 3–30. https://doi.org/10.1145/272991.272995
[4] Lenth, R. V. (2001). Some practical guidelines for effective sample size determination. The American Statistician, 55(3), 187–193. https://doi.org/10.1198/000313001317098149
[5] Underwood, F. M. (2012). A framework for adapting survey design through time for wildlife population assessment. Environmental and Ecological Statistics, 19, 413–436. https://doi.org/10.1007/s10651-012-0193-4
[6] Qian, J. (2020). Variance estimation with complex data and finite population correction—a paradigm for comparing jackknife and formula-based methods. ETS Research Report Series, 2020(1). https://doi.org/10.1002/ets2.12294
[7] Wu, C., & Thompson, M. E. (2020). Sampling Theory and Practice. Springer. https://doi.org/10.1007/978-3-030-44246-0
[8] Williams, B. K., & Brown, E. D. (2019). Sampling and analysis frameworks for inference in ecology. Methods in Ecology and Evolution, 10(11), 1832–1842. https://doi.org/10.1111/2041-210X.13279
[9] Borchers, D. L., Stevenson, B. C., Kidney, D., Thomas, L., & Marques, T. A. (2015). A unifying model for capture–recapture and distance sampling surveys of wildlife populations. Journal of the American Statistical Association, 110(509), 195–209. https://doi.org/10.1080/01621459.2014.893884
[10] Ahmad, S., Adichwal, N. K., Aamir, M., Shabbir, J., Alsadat, N., Elgarhy, M., & Ahmad, H. (2023). An enhanced estimator of finite population variance using two auxiliary variables under simple random sampling. Scientific Reports, 13, 21318. https://doi.org/10.1038/s41598-023-44169-5
[11] Hussain, M., Zaman, Q., Khan, L., Metawa, A. E., Awwad, F. A., Ismail, E. A. A., Wasim, D., & Ahmad, H. (2024). Improved exponential type mean estimators for non-response case using two concomitant variables in simple random sampling. Heliyon, 10(6), e27535. https://doi.org/10.1016/j.heliyon.2024.e27535
[12] Shabbir, J., & Movaheedi, Z. (2024). Use of generalized randomized response model for enhancement of finite population variance: A simulation approach. PLOS ONE, 19(12), e0315658. https://doi.org/10.1371/journal.pone.0315658
[13] Ahmed, S. K. (2024). How to choose a sampling technique and determine sample size for research: A simplified guide for researchers. Oral Oncology Reports, 12, 100662. https://doi.org/10.1016/j.oor.2024.100662
[14] Amin, H. A. (2024). The sample size determination strategy in the simple random sampling design. Tikrit Journal of Administrative and Economic Sciences, 20(68, part 2), 493–505. https://doi.org/10.25130/tjaes.20.68.2.26
[15] Shakoor, F., Atif, M., Ali, H., Alshammari, A. O., Himmat, B., & Kefi, K. (2025). Log-ratio type estimation for the finite population mean under simple random sampling without replacement with theory, simulation and application. Scientific Reports, 15. https://doi.org/10.1038/s41598-025-29127-7


© 2026 STATS UNLOCK. statsunlock.com – All Rights Reserved.