
For learning and exploration only. Not a substitute for professional statistical review or clinical-trial analysis.

Reference

Code reference: compute a p-value from a test statistic in any language

The tool above converts a test statistic into a p-value in your browser. The reference below is what you need when computing the same p-values in code or pulling critical values from a table: tail formulas, language-by-language snippets for Z/T/chi-square/F, the standard critical values students still pull on paper, and the common pitfalls when a pasted number does not match what the textbook says.

Tail formulas

These are the calculations the tool runs. Same formulas across every library and language.

| Tail | Formula | Notes |
| --- | --- | --- |
| Left-tailed | p = CDF(t) | One-sided, predicted lower direction |
| Right-tailed | p = 1 - CDF(t) (or SF(t) if available) | One-sided, predicted upper direction |
| Two-tailed (symmetric: Z, T) | p = 2 * min(CDF(t), 1 - CDF(t)) | Equivalent to 2 * (1 - CDF(abs(t))) |
| Two-tailed (asymmetric: chi-square, F) | p = 1 - CDF(t) | Test is intrinsically upper-tail |

A common mistake is applying the symmetric two-tailed formula to chi-square or F. Both are bounded at zero, so the “extreme” side is the upper tail only. Calling them two-tailed in textbook language usually still means computing 1 - CDF(t).
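As a minimal sketch of the tail formulas above for the Z case, the following uses only the Python standard library. The names `normal_cdf`, `normal_sf`, and `z_p_value`, and the `tail` argument, are illustrative and not part of any library.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def normal_sf(z: float) -> float:
    """Standard normal survival function (1 - CDF), computed directly."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def z_p_value(z: float, tail: str = "two") -> float:
    """p-value for a Z statistic; tail is 'left', 'right', or 'two'."""
    if tail == "left":
        return normal_cdf(z)
    if tail == "right":
        return normal_sf(z)
    if tail == "two":
        # Symmetric distribution: double the smaller of the two tails.
        return 2 * min(normal_cdf(z), normal_sf(z))
    raise ValueError(f"unknown tail: {tail!r}")

print(z_p_value(1.96, "two"))    # close to 0.05
print(z_p_value(1.645, "right")) # close to 0.05
```

The same three branches apply to T once a T CDF is available. Chi-square and F would only ever take the "right" branch.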

Python (scipy.stats)

from scipy import stats

# Hypothetical example inputs; substitute your own statistic and df
z, t, x, F = 1.96, 2.10, 28.87, 4.35
df, df1, df2 = 18, 1, 20

# Z (standard normal)
p_two   = 2 * stats.norm.sf(abs(z))   # sf = 1 - CDF, more numerically stable
p_left  = stats.norm.cdf(z)
p_right = stats.norm.sf(z)

# T (Student's T)
p_two   = 2 * stats.t.sf(abs(t), df)
p_left  = stats.t.cdf(t, df)
p_right = stats.t.sf(t, df)

# Chi-square (always upper tail for goodness-of-fit, contingency)
p = stats.chi2.sf(x, df)

# F (always upper tail for ANOVA, variance ratio)
p = stats.f.sf(F, df1, df2)

stats.t.sf(t, df) is preferred over 1 - stats.t.cdf(t, df) when the statistic is far in the tail. The subtraction loses precision near 1.
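The cancellation is easy to see with the standard normal and nothing but the standard library. This is a small illustration under that assumption, not a scipy call; the T case loses precision in the same way.

```python
import math

z = 10.0  # an extreme Z statistic

# CDF(10) is so close to 1 that it rounds to exactly 1.0 in double precision.
cdf = 0.5 * math.erfc(-z / math.sqrt(2))
p_naive = 1 - cdf

# The survival function is evaluated directly, so the tiny tail survives.
p_direct = 0.5 * math.erfc(z / math.sqrt(2))

print(p_naive)   # 0.0
print(p_direct)  # roughly 7.6e-24
```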

R (base + stats)

| Distribution | Lower tail | Upper tail | Two-tailed (symmetric) |
| --- | --- | --- | --- |
| Z | pnorm(z) | pnorm(z, lower.tail = FALSE) | 2 * pnorm(-abs(z)) |
| T | pt(t, df) | pt(t, df, lower.tail = FALSE) | 2 * pt(-abs(t), df) |
| Chi-square | pchisq(x, df) | pchisq(x, df, lower.tail = FALSE) | (use upper) |
| F | pf(F, df1, df2) | pf(F, df1, df2, lower.tail = FALSE) | (use upper) |

The lower.tail = FALSE flag is R’s equivalent of sf() in scipy. It computes the right-tail probability directly and avoids precision loss for extreme statistics.

JavaScript (jStat)

import jStat from 'jstat';

// Z
const pZRight = 1 - jStat.normal.cdf(z, 0, 1);
const pZTwo   = 2 * (1 - jStat.normal.cdf(Math.abs(z), 0, 1));

// T
const pTTwo   = 2 * (1 - jStat.studentt.cdf(Math.abs(t), df));

// Chi-square
const pChi    = 1 - jStat.chisquare.cdf(x, df);

// F
const pF      = 1 - jStat.centralF.cdf(F, df1, df2);

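jStat does not expose a survival function, so subtracting from 1 is the standard idiom. For the symmetric Z and T cases, evaluating the CDF at the negated absolute value, as in 2 * jStat.normal.cdf(-Math.abs(z), 0, 1), avoids the subtraction. For very small p values, switch to stdlib.js, which has dedicated upper-tail routines.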

Spreadsheets

| Test | Excel / Google Sheets |
| --- | --- |
| Z, two-tailed | =2*(1-NORM.S.DIST(ABS(z), TRUE)) |
| Z, right-tailed | =1-NORM.S.DIST(z, TRUE) |
| T, two-tailed | =T.DIST.2T(ABS(t), df) |
| T, right-tailed | =T.DIST.RT(t, df) |
| Chi-square, upper | =CHISQ.DIST.RT(x, df) |
| F, upper | =F.DIST.RT(F, df1, df2) |

LibreOffice mirrors the Excel names. Older Excel (pre-2010) used legacy names like TDIST, CHIDIST, and FDIST. CHIDIST and FDIST return upper-tail probabilities, while TDIST takes a third argument for the number of tails (1 or 2). All three are still available as compatibility functions.

SQL and statistical platforms

| Platform | Right-tailed T-test p |
| --- | --- |
| Stata | display ttail(df, t) |
| SAS | p = 1 - probt(t, df); |
| SPSS syntax | COMPUTE p = 1 - CDF.T(t, df). |
| MATLAB | 1 - tcdf(t, df) |
| Julia | ccdf(TDist(df), t) (Distributions.jl) |
| PostgreSQL (with pl/R) | pt(t, df, lower.tail = FALSE) |

Pure SQL has no built-in statistical CDFs. Production setups call out to R (via pl/R), Python (via pl/Python), or a separate stats microservice.

Z critical values

For when a paper reports a z-score and you need the cutoff on hand.

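| Alpha | One-tailed z* | Two-tailed z* |
| --- | --- | --- |
| 0.10 | 1.282 | 1.645 |
| 0.05 | 1.645 | 1.960 |
| 0.025 | 1.960 | 2.241 |
| 0.01 | 2.326 | 2.576 |
| 0.005 | 2.576 | 2.807 |
| 0.001 | 3.090 | 3.291 |
| 5-sigma (particle physics, one-sided p ≈ 2.87e-7) | 5.000 | N/A |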
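The cutoffs in the table above can be regenerated with the inverse normal CDF. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the helper name `z_critical` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_critical(alpha: float, two_tailed: bool = False) -> float:
    """Cutoff z* such that the rejection region has total area alpha."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return std_normal.inv_cdf(1 - tail_area)

for alpha in (0.10, 0.05, 0.025, 0.01, 0.005, 0.001):
    one = z_critical(alpha)
    two = z_critical(alpha, two_tailed=True)
    print(f"{alpha:<6} {one:.3f} {two:.3f}")
```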

T critical values (two-tailed)

Each cell is the |t| at which 2 * (1 - pt(|t|, df)) equals the column alpha. For a one-tailed test, each column corresponds to half its labeled alpha: the α = 0.10 column gives the one-tailed α = 0.05 cutoff.

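| df | α = 0.10 | α = 0.05 | α = 0.01 | α = 0.001 |
| --- | --- | --- | --- | --- |
| 1 | 6.314 | 12.706 | 63.657 | 636.619 |
| 5 | 2.015 | 2.571 | 4.032 | 6.869 |
| 10 | 1.812 | 2.228 | 3.169 | 4.587 |
| 15 | 1.753 | 2.131 | 2.947 | 4.073 |
| 20 | 1.725 | 2.086 | 2.845 | 3.850 |
| 30 | 1.697 | 2.042 | 2.750 | 3.646 |
| 60 | 1.671 | 2.000 | 2.660 | 3.460 |
| 120 | 1.658 | 1.980 | 2.617 | 3.373 |
| ∞ (Z) | 1.645 | 1.960 | 2.576 | 3.291 |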

As df grows, the T critical values converge on the corresponding Z values. By df = 120, the gap at α = 0.05 is down to 0.02 (1.980 versus 1.960).
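The df = 1 row of the table can be checked without a stats library, because Student's T with one degree of freedom is the standard Cauchy distribution, whose two-tailed p has the closed form 1 - (2 / π) * atan(|t|). The sketch below inverts that expression; `t_critical_df1` is a hypothetical helper name, and the shortcut applies to df = 1 only.

```python
import math

def t_critical_df1(alpha: float) -> float:
    """Two-tailed critical |t| for df = 1 (standard Cauchy).

    Solves 1 - (2 / pi) * atan(t) = alpha for t.
    """
    return math.tan(math.pi / 2 * (1 - alpha))

for alpha in (0.10, 0.05, 0.01, 0.001):
    print(alpha, round(t_critical_df1(alpha), 3))
```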

Chi-square critical values (upper tail)

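| df | α = 0.10 | α = 0.05 | α = 0.01 | α = 0.001 |
| --- | --- | --- | --- | --- |
| 1 | 2.706 | 3.841 | 6.635 | 10.828 |
| 2 | 4.605 | 5.991 | 9.210 | 13.816 |
| 5 | 9.236 | 11.070 | 15.086 | 20.515 |
| 10 | 15.987 | 18.307 | 23.209 | 29.588 |
| 20 | 28.412 | 31.410 | 37.566 | 45.315 |
| 30 | 40.256 | 43.773 | 50.892 | 59.703 |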

A 2-by-2 contingency table has df = 1. A 3-by-3 has df = 4. A goodness-of-fit on k categories has df = k - 1, minus any parameters estimated from the data.
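The df rules above reduce to two lines of arithmetic. The sketch below also includes closed-form upper-tail probabilities for df = 1 and df = 2, which are enough to sanity-check the first two rows of the table without a stats library. All four function names are illustrative.

```python
import math

def contingency_df(rows: int, cols: int) -> int:
    """Degrees of freedom for an r-by-c contingency table."""
    return (rows - 1) * (cols - 1)

def goodness_of_fit_df(categories: int, estimated_params: int = 0) -> int:
    """df = k - 1, minus any parameters estimated from the data."""
    return categories - 1 - estimated_params

def chi2_sf_df1(x: float) -> float:
    """Upper-tail p for df = 1: chi-square(1) is a squared standard normal."""
    return math.erfc(math.sqrt(x / 2))

def chi2_sf_df2(x: float) -> float:
    """Upper-tail p for df = 2: an exponential distribution with mean 2."""
    return math.exp(-x / 2)

print(contingency_df(2, 2), contingency_df(3, 3))  # 1 4
print(chi2_sf_df1(3.841))  # close to 0.05
print(chi2_sf_df2(5.991))  # close to 0.05
```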

F critical values (upper tail, α = 0.05)

ANOVA’s most common cutoff. df1 across, df2 down.

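| df2 \ df1 | 1 | 2 | 3 | 5 | 10 | 20 |
| --- | --- | --- | --- | --- | --- | --- |
| 5 | 6.61 | 5.79 | 5.41 | 5.05 | 4.74 | 4.56 |
| 10 | 4.96 | 4.10 | 3.71 | 3.33 | 2.98 | 2.77 |
| 20 | 4.35 | 3.49 | 3.10 | 2.71 | 2.35 | 2.12 |
| 30 | 4.17 | 3.32 | 2.92 | 2.53 | 2.16 | 1.93 |
| 60 | 4.00 | 3.15 | 2.76 | 2.37 | 1.99 | 1.75 |
| ∞ | 3.84 | 3.00 | 2.60 | 2.21 | 1.83 | 1.57 |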

No single multiplier converts these to α = 0.01: the ratio runs from about 1.2 to 2.5 across this table, depending on both df1 and df2. Compute qf(0.99, df1, df2) in R for the exact value.

Common pitfalls

| Symptom | Cause | Fix |
| --- | --- | --- |
| Excel returns #NUM! for T.DIST.2T(-2, 10) | T.DIST.2T requires a non-negative argument | Wrap in ABS(): =T.DIST.2T(ABS(t), df) |
| Pasted p does not match scipy / R | Mixed one-tailed and two-tailed conventions | Re-check the tail; halve or double as needed |
| Chi-square test reported as “two-tailed” | Chi-square is intrinsically upper-tail | Compute 1 - CDF, ignore the two-tailed prompt |
| Welch's T p slightly different from a pooled-T calculator | Welch uses Welch-Satterthwaite df (often non-integer) | Use the non-integer df your stats package reported |
| Reported p = 0.0000 in a table | Rounded display, real value below the precision floor | Switch to < 0.0001 or report exp(...) from raw |
| p = 1.000 returned for a tiny statistic | Statistic in the wrong tail for the test you specified | Flip to the opposite one-tailed test, or use two-tailed |
| df = 0 returns NaN / inf | Distribution undefined at df = 0 | Recompute df; a one-sample T uses df = n - 1, so it needs n ≥ 2 |
| Bonferroni-corrected p > 1 | Manual p * k exceeded 1 | Cap at 1: min(p * k, 1) |
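For the Welch row above, the non-integer df comes from the Welch-Satterthwaite approximation. A minimal sketch, assuming you have each group's sample standard deviation and size; the function name and the sample numbers are hypothetical.

```python
def welch_df(s1: float, n1: int, s2: float, n2: int) -> float:
    """Welch-Satterthwaite degrees of freedom from sample SDs and sizes."""
    v1 = s1 ** 2 / n1  # squared standard error contributed by group 1
    v2 = s2 ** 2 / n2  # squared standard error contributed by group 2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Hypothetical samples with unequal spread and unequal size.
df = welch_df(s1=2.0, n1=10, s2=5.0, n2=15)
print(df)  # non-integer, below the pooled df of n1 + n2 - 2 = 23
```

Feeding this df into a T survival function, rather than the pooled n1 + n2 - 2, is what makes a hand calculation agree with a package's Welch output.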

Precision and reporting conventions

| Convention | Format | Where used |
| --- | --- | --- |
| Exact value, two or three decimal places | p = .031 | APA 7th edition, most journals |
| Below display floor | p < 0.0001 | When the real value is smaller than reportable precision |
| Star notation (avoid in modern reports) | *p < 0.05, **p < 0.01, ***p < 0.001 | Older social-science tables only |
| Cohen's d alongside p | t(18) = 2.05, p = 0.055, d = 0.96 | Required by most psychology journals |
| Effect size with confidence interval | r = 0.42, 95% CI [0.18, 0.61], p = 0.003 | Preferred over star notation in clinical research |

Many journals have moved away from star notation in favor of exact values plus effect sizes. APA 7 asks for exact p-values to two or three decimal places, with anything below .001 reported as p < .001.
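A small formatter can cover the "exact value" and "below display floor" rows of the table above. This is a sketch with a hypothetical helper name; the number of decimals is a parameter because journals differ, and the floor is simply the smallest value that many decimals can display.

```python
def format_p(p: float, decimals: int = 4) -> str:
    """Format a p-value, falling back to 'p < floor' below display precision."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    floor = 10 ** -decimals  # smallest value the chosen precision can show
    if p < floor:
        return f"p < {floor:.{decimals}f}"
    return f"p = {p:.{decimals}f}"

print(format_p(0.0312))              # p = 0.0312
print(format_p(3.2e-9))              # p < 0.0001
print(format_p(0.0312, decimals=3))  # p = 0.031
```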

Related concepts

  • CDF, SF, PPF: Cumulative distribution function, survival function (1 - CDF, computed directly for tail precision), and percent-point function (the inverse CDF that returns the critical value for a given alpha).
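  • Bonferroni correction: Divide alpha by the number of tests when running multiple comparisons; the reported p stays unchanged and only the threshold moves. The equivalent alternative is to keep alpha fixed and report adjusted p-values, multiplying each p by the number of tests and capping the result at 1.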
  • Holm-Bonferroni and Benjamini-Hochberg: Less conservative multiple-comparison corrections. Standard in genomics, neuroimaging, and any high-throughput testing pipeline.
  • TOST (two one-sided tests): Equivalence and non-inferiority testing reverses the null hypothesis. Run it in Python with statsmodels (statsmodels.stats.weightstats) or in R (equivalence package); a generic p-value calculator does not compute these directly.
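The three multiple-comparison corrections named in the list above can be sketched as adjusted p-values, which are compared against the original alpha instead of moving the threshold. This is an illustrative standard-library implementation with hypothetical function names and input values, not the code of any particular package.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply by the test count, cap at 1."""
    k = len(pvals)
    return [min(p * k, 1.0) for p in pvals]

def holm(pvals):
    """Holm-Bonferroni step-down adjusted p-values, in the input order."""
    k = len(pvals)
    order = sorted(range(k), key=lambda i: pvals[i])  # smallest p first
    adjusted = [0.0] * k
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply by the number of hypotheses not yet stepped past.
        candidate = min((k - rank) * pvals[i], 1.0)
        running_max = max(running_max, candidate)  # keep values monotone
        adjusted[i] = running_max
    return adjusted

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up adjusted p-values, in the input order."""
    k = len(pvals)
    order = sorted(range(k), key=lambda i: pvals[i], reverse=True)  # largest first
    adjusted = [0.0] * k
    running_min = 1.0
    for step, i in enumerate(order):
        rank = k - step  # 1-based rank of this p in ascending order
        running_min = min(running_min, pvals[i] * k / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
print(bonferroni(raw))
print(holm(raw))
print(benjamini_hochberg(raw))
```

On any input, the Holm values are never larger than the Bonferroni values, which is the sense in which Holm is less conservative.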