What is the formula for the two-proportion z-test?

z = (p̂1 − p̂2) / √[p̂(1−p̂)(1/n1 + 1/n2)], where p̂ = (x1 + x2)/(n1 + n2) is the pooled proportion under H0. The two-proportion z-test uses the pooled standard error under H0 and an unpooled SE for the confidence interval on the difference.

How do I interpret Cohen's h for two proportions?

Cohen's h = 2·arcsin(√p̂1) − 2·arcsin(√p̂2). It quantifies the effect size on the arcsine scale. Conventional thresholds: |h| < 0.2 = negligible, 0.2–0.5 = small, 0.5–0.8 = medium, ≥ 0.8 = large.

Can I use a two-proportion z-test when one group has very few successes?

No. If any of n1·p̂1, n1·(1−p̂1), n2·p̂2, or n2·(1−p̂2) falls below 10 (some sources use 5), the normal approximation becomes unreliable. Use Fisher's exact test instead. A likelihood-ratio chi-square test is not a remedy here, because it is also a large-sample approximation.

What is the relationship between a two-proportion z-test and a 2x2 chi-square test?

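For a two-sided test the two are mathematically equivalent: when the z statistic uses the pooled SE and neither test applies a continuity correction, z² = χ² with 1 degree of freedom. The two-proportion z-test additionally allows one-sided alternatives, whereas the chi-square test is inherently two-sided.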

How do I report two-proportion z-test results in APA style?

Report as: 'A two-proportion z-test showed that the success rate in Group 1 (p̂1 = .XX, n1 = ?) was significantly higher than in Group 2 (p̂2 = .XX, n2 = ?), z = X.XX, p = .XXX, 95% CI for difference [LL, UL], Cohen's h = X.XX.' Include both sample sizes and the CI on the difference.

What sample size do I need for a two-proportion z-test?

For 80% power at α = .05 (two-sided), the normal-approximation formula n = 2·[(z_{1−α/2} + z_{1−β}) / h]² gives about n ≈ 63 per group for a medium effect (h ≈ 0.5) and about n ≈ 393 per group for a small effect (h ≈ 0.2). The Power Curve in this tool shows the exact relationship for your data.

Is a two-proportion z-test the same as a difference-in-proportions test?

Yes — both refer to the same test that compares two independent proportions. It is also called the two-sample proportion test, two-sample z-test for proportions, or simply the z-test for difference in proportions.

Two-Proportion z-Test Calculator (Free Online Tool) | StatsUnlock

Two-Proportion z-Test Calculator

Compare two independent proportions, get the test statistic, p-value, confidence interval on the difference, Cohen's h, statistical power, and ready-to-paste APA reporting templates — all in one tool.

Parametric · Two-Sample · Hypothesis Test · Proportions · z-Test · Pooled SE

📥 Data Input

📚 Load Sample Dataset:

Enter Group 1 and Group 2 as comma-separated binary values (1 = success, 0 = failure) or numeric measurements (the threshold below converts them to binary). Each pair of clusters runs its own two-proportion z-test.

Upload CSV or Excel file:

Supports .csv, .txt, .xlsx, .xls — headers auto-detected. ✨ Multi-column mode: click any number of columns to load each as a separate cluster. Columns are paired in order (1+2, 3+4, …) to form Group 1 vs Group 2 comparisons.
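The pairing rule described above (columns 1+2, 3+4, …) can be sketched in a few lines. This is only an illustration: the function name `pair_columns` and the column names are hypothetical, and ignoring an unpaired trailing column is an assumption, since the page does not say how the tool treats an odd number of columns.

```python
def pair_columns(column_names):
    """Pair columns in order: (1st, 2nd), (3rd, 4th), ..."""
    # Assumption: a trailing column without a partner is ignored.
    return [
        (column_names[i], column_names[i + 1])
        for i in range(0, len(column_names) - 1, 2)
    ]


# Hypothetical column names; each pair would be one Group 1 vs Group 2 comparison.
pairs = pair_columns(["drug", "placebo", "site_a", "site_b", "extra"])
print(pairs)
```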

Enter summary counts directly. Suitable when you have successes and sample size from each group already.

Group 1: Group 1 Name, Successes (x₁), Sample size (n₁)

Group 2: Group 2 Name, Successes (x₂), Sample size (n₂)

⚙️ Test Configuration

Alternative Hypothesis:

Significance level (α):

Threshold (numeric → binary):

If your data is not 0/1, enter a cutoff: values > threshold count as success.
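As a minimal sketch of the cutoff rule stated above (values strictly greater than the threshold count as success), the conversion could look like this. The function name `binarize` and the example values are hypothetical.

```python
def binarize(values, threshold):
    """Map each numeric value to 1 (success) if value > threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


ages = [16, 18, 21, 35, 17]              # hypothetical numeric measurements
binary = binarize(ages, threshold=18)    # 18 itself is not a success (strict >)
x, n = sum(binary), len(binary)          # successes and sample size for the test
print(binary, x, n)
```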

Standard Error Method:

Test statistic uses pooled SE by convention; CI is reported using unpooled SE.

Continuity correction:

Yates correction is more conservative; recommended for small samples.

📊 Visualizations

Four publication-ready, colorful plots auto-generated from your data — hover any chart for tooltips with full statistics.

🔔 Standard Normal Curve with Rejection Region

⚡ Power Curve — Effect of Sample Size

🎯 Effect Size (Cohen's h) Gauge

🌈 Sampling Distribution of (p̂₁ − p̂₂) under H₀

✍️ How to Write Your Results in Research

Five ready-to-use reporting templates auto-filled with your computed values — click 📋 Copy on any card to paste it directly into your manuscript, thesis, report, abstract, or pre-registration document.

📐 Technical Notes & Formulas

A. Test Statistic (Pooled)

z = (p̂₁ − p̂₂) / √[ p̂(1 − p̂) · (1/n₁ + 1/n₂) ]

where the pooled proportion under H₀ is:

p̂ = (x₁ + x₂) / (n₁ + n₂)

B. p-value

Two-tailed: p = 2 · [1 − Φ(|z|)]
Right-tailed: p = 1 − Φ(z)
Left-tailed: p = Φ(z)
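Formulas A and B can be checked with a short standard-library sketch. The function name `two_prop_ztest`, the `alternative` labels, and the example counts are illustrative choices, not the calculator's own code.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_ztest(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled two-proportion z-test; returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                      # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    phi = NormalDist().cdf                              # standard normal CDF
    if alternative == "two-sided":
        p_value = 2 * (1 - phi(abs(z)))
    elif alternative == "greater":                      # right-tailed, H1: p1 > p2
        p_value = 1 - phi(z)
    elif alternative == "less":                         # left-tailed, H1: p1 < p2
        p_value = phi(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p_value


# Hypothetical data: 45/100 successes vs 30/100 successes.
z, p = two_prop_ztest(45, 100, 30, 100)
print(round(z, 3), round(p, 4))
```

With these counts the pooled proportion is 0.375 and z² works out to exactly 4.8, so z is about 2.19.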

C. Confidence Interval on (p̂₁ − p̂₂)

CI = (p̂₁ − p̂₂) ± z_{1−α/2} · √[ p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂ ]

Unpooled SE is used for the CI because under H₁ we no longer assume p₁ = p₂.
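The interval in formula C can be sketched as follows, using only the standard library. The function name `diff_ci` and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(x1, n1, x2, n2, alpha=0.05):
    """Wald confidence interval for p1 - p2 using the unpooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unpooled SE
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)         # about 1.96 for alpha = .05
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical data: 45/100 vs 30/100 successes.
lower, upper = diff_ci(45, 100, 30, 100)
print(round(lower, 4), round(upper, 4))
```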

D. Effect Size (Cohen's h)

h = 2·arcsin(√p̂₁) − 2·arcsin(√p̂₂)

Interpretation: |h| < 0.2 = negligible; 0.2–0.5 = small; 0.5–0.8 = medium; ≥ 0.8 = large (Cohen, 1988).
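A small sketch of the effect-size calculation and the labels listed above. The function names `cohens_h` and `h_label` are hypothetical, and the bin edges simply mirror the interpretation line.

```python
from math import asin, sqrt


def cohens_h(p1, p2):
    """Cohen's h: difference of arcsine-square-root transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


def h_label(h):
    """Label |h| using the bins quoted on this page."""
    size = abs(h)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


h = cohens_h(0.45, 0.30)      # hypothetical sample proportions
print(round(h, 3), h_label(h))
```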

E. Continuity Correction (Yates)

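When applied, the absolute difference in the numerator is reduced by ½·(1/n₁ + 1/n₂):

z_yates = (|p̂₁ − p̂₂| − ½(1/n₁ + 1/n₂)) / SE_pooled

Because of the absolute value, z_yates is a magnitude; for a one-sided test, give it the sign of (p̂₁ − p̂₂). If the correction exceeds |p̂₁ − p̂₂|, the numerator is usually set to 0.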
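The continuity-corrected statistic can be sketched as below. Two details are assumptions rather than something this page states: the numerator is clamped at 0 when the correction exceeds the observed difference, and the sign of (p̂₁ − p̂₂) is restored so the result can be used for one-sided tests. The function name `z_yates` follows the formula's label.

```python
from math import copysign, sqrt


def z_yates(x1, n1, x2, n2):
    """Continuity-corrected pooled z statistic for two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    diff = p1 - p2
    # Assumption: never let the corrected numerator go below zero.
    corrected = max(abs(diff) - 0.5 * (1 / n1 + 1 / n2), 0.0)
    # Assumption: restore the sign of the observed difference.
    return copysign(corrected, diff) / se_pooled


# Hypothetical small samples: 12/30 vs 6/30 successes.
z_corr = z_yates(12, 30, 6, 30)
print(round(z_corr, 3))
```

For these counts the uncorrected z is about 1.69, so the correction pulls the statistic noticeably toward 0.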

F. Assumptions

Two independent random samples
Each observation is binary (0/1)
Normal approximation valid: n₁·p̂₁, n₁·(1−p̂₁), n₂·p̂₂, n₂·(1−p̂₂) all ≥ 10
If counts < 10 in any cell, use Fisher's exact test
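The large-sample condition above is easy to check in code, because n·p̂ is just the success count and n·(1−p̂) the failure count. The function name `normal_approx_ok` and the `min_count` parameter are hypothetical.

```python
def normal_approx_ok(x1, n1, x2, n2, min_count=10):
    """Return (ok, counts); ok is True when all four counts reach min_count."""
    counts = {
        "n1*p1": x1,             # n1 * p-hat1 equals the success count x1
        "n1*(1-p1)": n1 - x1,    # failures in group 1
        "n2*p2": x2,
        "n2*(1-p2)": n2 - x2,
    }
    ok = all(count >= min_count for count in counts.values())
    return ok, counts


# Hypothetical data: group 2 has only 4 successes, so the check fails.
ok, counts = normal_approx_ok(45, 100, 4, 60)
print(ok, counts)
```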

✅ When to Use the Two-Proportion z-Test

Use this test when you have:

Two independent groups (no overlapping subjects between groups)
A binary outcome in each group (success/failure, present/absent, yes/no, treated/control)
Reasonably large samples in both groups (n·p̂ and n·(1−p̂) ≥ 10)
A specific research question about whether the two underlying population proportions differ

Typical examples

Clinical trial: recovery rate in drug arm vs placebo arm
A/B testing: conversion rate in design A vs design B
Education: pass rate under new vs old curriculum
Ecology: species occurrence rate at site 1 vs site 2
Policy: voter approval rate before vs after campaign (independent samples)

Decision guide

If samples are paired (same subjects measured twice), use McNemar's test instead
If any expected cell count < 10, use Fisher's exact test
If comparing 3+ proportions, use a chi-square test of homogeneity or pairwise z-tests with Bonferroni correction
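For the last case in the guide, pairwise z-tests with a Bonferroni correction might look like this sketch. It assumes two-sided pooled z-tests for every pair and multiplies each p-value by the number of comparisons (capped at 1). The function name `pairwise_bonferroni` and the group data are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def pairwise_bonferroni(groups):
    """groups: dict name -> (successes, n). Returns {(a, b): adjusted p}."""
    pairs = list(combinations(groups, 2))
    m = len(pairs)                                   # number of comparisons
    adjusted = {}
    for a, b in pairs:
        (x1, n1), (x2, n2) = groups[a], groups[b]
        pool = (x1 + x2) / (n1 + n2)
        se = sqrt(pool * (1 - pool) * (1 / n1 + 1 / n2))
        z = (x1 / n1 - x2 / n2) / se
        p = 2 * (1 - NormalDist().cdf(abs(z)))       # two-sided p-value
        adjusted[(a, b)] = min(1.0, p * m)           # Bonferroni: multiply by m, cap at 1
    return adjusted


adjusted = pairwise_bonferroni({"A": (45, 100), "B": (30, 100), "C": (32, 100)})
for pair, p_adj in adjusted.items():
    print(pair, round(p_adj, 4))
```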

📘 How to Use This Calculator (10 Steps)

1. Load sample data from the dropdown OR paste your own binary values (0/1) into the Group 1 and Group 2 text areas.
2. Edit group names by clicking the editable name field on each cluster card.
3. Alternative input methods: use Upload CSV/Excel for spreadsheet data, or Manual Entry to type x and n directly.
4. Pick the alternative hypothesis: two-tailed unless you have a strong directional prediction.
5. Choose α (significance level), most commonly 0.05.
6. Set a threshold if your data is numeric (e.g., age > 18) instead of binary.
7. Choose continuity correction: apply Yates if either n₁ or n₂ < 50.
8. Click "Run Two-Proportion z-Test" to compute.
9. Read the interpretation, conclusion, and reporting examples auto-filled with your computed values.
10. Export via Download Doc (.txt) or Download PDF for inclusion in your paper.

❓ Frequently Asked Questions

Q1. What is a two-proportion z-test?

A two-proportion z-test compares two independent proportions (e.g., success rates from two groups) to determine whether they differ significantly. It uses the standard normal distribution to test the null hypothesis that p₁ = p₂.

Q2. When should I use a two-proportion z-test vs a chi-square test?

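For a 2×2 contingency table with a two-sided alternative, they are mathematically equivalent: the square of the pooled z statistic equals the Pearson chi-square statistic (z² = χ², 1 degree of freedom), provided both are computed without the continuity correction or both with it. Use the z-test when you want a one-sided alternative or a directional p-value, and the chi-square when you want the more general framework or to extend to larger tables.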
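The equivalence can be verified numerically. The sketch below computes the pooled z statistic and the Pearson chi-square statistic (both without continuity correction) for the same hypothetical 2×2 table. The function names are illustrative.

```python
from math import isclose, sqrt


def pooled_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic, no continuity correction."""
    pool = (x1 + x2) / (n1 + n2)
    return (x1 / n1 - x2 / n2) / sqrt(pool * (1 - pool) * (1 / n1 + 1 / n2))


def pearson_chi2(x1, n1, x2, n2):
    """Pearson chi-square for the 2x2 table, no continuity correction."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]        # rows: groups; cols: success, failure
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


z = pooled_z(45, 100, 30, 100)
chi2 = pearson_chi2(45, 100, 30, 100)
print(round(z ** 2, 6), round(chi2, 6), isclose(z ** 2, chi2))
```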

Q3. What is the difference between pooled and unpooled standard error?

Pooled SE assumes p₁ = p₂ under H₀ and uses the combined proportion p̂ = (x₁+x₂)/(n₁+n₂). It is appropriate for the hypothesis test itself. Unpooled SE uses each group's own proportion and is appropriate for the confidence interval on the difference (p̂₁ − p̂₂), since under H₁ we no longer assume the proportions are equal.

Q4. How do I interpret Cohen's h?

Cohen's h transforms each proportion via the arcsine-square-root function, which approximately stabilizes the variance of a sample proportion, so a given h is about equally detectable whatever the base rates are. Thresholds (Cohen 1988): |h| < 0.2 = negligible; 0.2–0.5 = small; 0.5–0.8 = medium; ≥ 0.8 = large. Unlike raw proportion differences, Cohen's h is comparable across studies with different base rates.

Q5. Can I use this test with very small samples?

No. If any of n₁·p̂₁, n₁·(1−p̂₁), n₂·p̂₂, or n₂·(1−p̂₂) falls below 10, the normal approximation fails and the z-test gives unreliable p-values. Switch to Fisher's exact test for small samples or sparse 2×2 tables.
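For readers who want to see what the fallback involves, here is a standard-library sketch of Fisher's exact test for a 2×2 table. It assumes one common two-sided convention: sum the hypergeometric probabilities of all tables that are no more likely than the observed one. The function name `fisher_exact_two_sided` and the small tolerance factor are illustrative choices.

```python
from math import comb


def fisher_exact_two_sided(x1, n1, x2, n2):
    """Two-sided Fisher exact p-value for successes x1/n1 vs x2/n2."""
    k = x1 + x2                          # total successes (fixed margin)
    denom = comb(n1 + n2, k)

    def prob(a):
        # Hypergeometric probability that group 1 has exactly a successes.
        return comb(n1, a) * comb(n2, k - a) / denom

    a_min, a_max = max(0, k - n2), min(n1, k)
    p_obs = prob(x1)
    # Sum tables at most as probable as the observed one (tiny tolerance for ties).
    p_value = sum(
        prob(a) for a in range(a_min, a_max + 1) if prob(a) <= p_obs * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical sparse table: 3/4 successes vs 1/4 successes.
p_exact = fisher_exact_two_sided(3, 4, 1, 4)
print(round(p_exact, 4))
```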

Q6. Should I always use a two-tailed test?

Two-tailed is the safer default. Only use one-tailed when you have a clear pre-registered directional hypothesis (e.g., "the new drug will have a higher recovery rate than placebo, and we will not interpret a lower rate"). One-tailed tests are sometimes considered controversial without pre-registration.

Q7. What does the Yates continuity correction do?

It subtracts ½·(1/n₁ + 1/n₂) from the absolute numerator to compensate for using a continuous (normal) approximation to a discrete (binomial) distribution. It shrinks the z-statistic toward 0 and increases the p-value — making the test more conservative. Useful when sample sizes are small (n < 50 per group).

Q8. How do I report results in APA style?

Use this template: "A two-proportion z-test indicated that the success rate in Group 1 (p̂₁ = .XX, n₁ = ?) differed significantly from Group 2 (p̂₂ = .XX, n₂ = ?), z = X.XX, p = .XXX, 95% CI for difference [LL, UL], Cohen's h = X.XX." Include both sample sizes, the CI on the difference, and the effect size.

Q9. How large a sample do I need?

For 80% power at α = .05 (two-sided), the normal-approximation formula n = 2·[(z_{1−α/2} + z_{1−β}) / h]² gives about n ≈ 63 per group for a medium effect (Cohen's h ≈ 0.5), about n ≈ 393 per group for a small effect (h ≈ 0.2), and about n ≈ 25 per group for a large effect (h ≈ 0.8). The Power Curve in this tool shows the exact relationship for your data.
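A per-group sample size for two equal-sized independent groups can be approximated from Cohen's h with the usual normal-approximation formula n = 2·[(z_{1−α/2} + z_{1−β}) / h]². The sketch below is an approximation under that formula, not the tool's Power Curve code, and the function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(h, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided two-proportion test, given Cohen's h."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)      # about 1.96 for alpha = .05
    z_beta = nd.inv_cdf(power)               # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) / h) ** 2)


for h in (0.2, 0.5, 0.8):
    print(h, n_per_group(h))
```

Rounding up is a design choice here, so the result is the smallest whole n that reaches the target power under the approximation.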

Q10. What if my two groups have very different sample sizes?

The test is still valid as long as the assumption conditions (n·p̂, n·(1−p̂) ≥ 10) are met in each group separately. Unequal n reduces power compared to balanced designs with the same total N — efficiency is maximized when n₁ = n₂.
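The power cost of unbalanced groups can be illustrated with a normal approximation based on Cohen's h, where the standardized difference is |h| / √(1/n₁ + 1/n₂). This is a rough sketch under that approximation, with a hypothetical function name and made-up design sizes.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(h, n1, n2, alpha=0.05):
    """Approximate two-sided power for effect size h with group sizes n1, n2."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = abs(h) / sqrt(1 / n1 + 1 / n2)   # expected z under the alternative
    # Probability of landing in either rejection region.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


balanced = approx_power(0.4, 100, 100)       # total N = 200, equal groups
unbalanced = approx_power(0.4, 150, 50)      # same total N, 3:1 split
print(round(balanced, 3), round(unbalanced, 3))
```

With the same total of 200 observations, the 3:1 split gives visibly lower power than the balanced design.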

📚 References

The following references support the statistical methods used in this two-proportion z-test calculator, covering pooled vs unpooled SE, p-value interpretation, effect size, and best practices in comparing proportions.

Agresti, A., & Caffo, B. (2000). Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. The American Statistician, 54(4), 280–288. https://doi.org/10.1080/00031305.2000.10474560
Newcombe, R. G. (1998). Interval estimation for the difference between independent proportions: Comparison of eleven methods. Statistics in Medicine, 17(8), 873–890. https://doi.org/10.1002/(SICI)1097-0258(19980430)17:8<873::AID-SIM779>3.0.CO;2-I
Fleiss, J. L., Levin, B., & Paik, M. C. (2003). Statistical methods for rates and proportions (3rd ed.). John Wiley & Sons. https://doi.org/10.1002/0471445428
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
Agresti, A. (2018). An introduction to categorical data analysis (3rd ed.). John Wiley & Sons.
Yates, F. (1934). Contingency tables involving small numbers and the χ² test. Supplement to the Journal of the Royal Statistical Society, 1(2), 217–235. https://doi.org/10.2307/2983604
Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society, 54(3), 426–482. https://doi.org/10.1090/S0002-9947-1943-0012401-3
Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). SAGE Publications.
Gravetter, F. J., & Wallnau, L. B. (2017). Statistics for the behavioral sciences (10th ed.). Cengage Learning.
American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). APA. https://doi.org/10.1037/0000165-000
Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129–133. https://doi.org/10.1080/00031305.2016.1154108
Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations. European Journal of Epidemiology, 31(4), 337–350. https://doi.org/10.1007/s10654-016-0149-3
R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
NIST/SEMATECH. (2013). e-Handbook of statistical methods: 2-Sample Z Test for proportions. National Institute of Standards and Technology. https://www.itl.nist.gov/div898/handbook/prc/section3/prc33.htm
STATS UNLOCK. (2026). Two-Proportion z-Test Calculator. https://statsunlock.com/two-proportion-z-test-calculator/

📝 Cite This Tool

If you used the Two-Proportion z-Test Calculator in your research, thesis, dissertation, manuscript, or report, please cite it using the format below. Click 📋 Copy to copy the citation directly to your clipboard.

APA 7th Edition

STATS UNLOCK. (2026). Two-Proportion z-Test Calculator. https://statsunlock.com/two-proportion-z-test-calculator/