
For learning and exploration only. Not a substitute for professional statistical review or clinical-trial analysis.

Reference

Code reference: compute a p-value from a test statistic in any language

Why does a pasted p-value sometimes refuse to match the textbook? Usually it is a one-tailed versus two-tailed mismatch, or a symmetric formula misapplied to chi-square. What follows is the working material for getting it right in code: the tail formulas, language-by-language snippets for Z/T/chi-square/F, the critical values students still pull on paper, and the pitfalls that throw the number off.

Tail formulas

These are the calculations the tool runs. Same formulas across every library and language.

Tail	Formula	Notes
Left-tailed	p = CDF(t)	One-sided, predicted lower direction
Right-tailed	p = 1 - CDF(t) (or SF(t) if available)	One-sided, predicted upper direction
Two-tailed (symmetric: Z, T)	p = 2 * min(CDF(t), 1 - CDF(t))	Equivalent to 2 * (1 - CDF(abs(t)))
Two-tailed (asymmetric: chi-square, F)	p = 1 - CDF(t)	Test is intrinsically upper-tail

A common mistake is applying the symmetric two-tailed formula to chi-square or F. Both are bounded at zero, so the “extreme” side is the upper tail only. Calling them two-tailed in textbook language usually still means computing 1 - CDF(t).
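The table above can be sketched as one small dispatcher. This is a minimal illustration, not the tool's actual code: the function name p_value and its symmetric flag are hypothetical, and the standard-library NormalDist stands in for whichever CDF you have at hand.

```python
from statistics import NormalDist

def p_value(cdf, stat, tail, symmetric=True):
    """Apply the tail formulas to any CDF callable (hypothetical helper).

    cdf       -- function returning P(X <= stat)
    tail      -- "left", "right", or "two"
    symmetric -- True for Z/T; False for chi-square/F, where "two"
                 falls back to the upper tail
    """
    lower = cdf(stat)
    upper = 1.0 - lower
    if tail == "left":
        return lower
    if tail == "right":
        return upper
    if tail == "two":
        return 2.0 * min(lower, upper) if symmetric else upper
    raise ValueError(f"unknown tail: {tail!r}")

z_cdf = NormalDist().cdf  # standard normal CDF from the standard library
print(p_value(z_cdf, 1.96, "two"))    # close to 0.05
print(p_value(z_cdf, -1.96, "left"))  # close to 0.025
print(p_value(z_cdf, 1.96, "right"))  # close to 0.025
```

Passing the CDF in as a callable keeps the tail logic separate from the distribution, so the same function works with a T, chi-square, or F CDF from any library.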

Python (scipy.stats)

from scipy import stats

# Example inputs (illustrative only); substitute your own statistic and df
z, t, df = 1.96, 2.05, 18
x, F, df1, df2 = 3.84, 4.10, 2, 10

# Z (standard normal)
p_two   = 2 * stats.norm.sf(abs(z))   # sf avoids the 1 - CDF subtraction
p_left  = stats.norm.cdf(z)
p_right = stats.norm.sf(z)            # 1 - CDF, more numerically stable

# T (Student's T)
p_two   = 2 * stats.t.sf(abs(t), df)
p_left  = stats.t.cdf(t, df)
p_right = stats.t.sf(t, df)

# Chi-square (always upper tail for goodness-of-fit, contingency)
p = stats.chi2.sf(x, df)

# F (always upper tail for ANOVA, variance ratio)
p = stats.f.sf(F, df1, df2)

stats.t.sf(t, df) is preferred over 1 - stats.t.cdf(t, df) when the statistic is far in the tail. There the CDF sits within rounding distance of 1, so the subtraction cancels most or all of the significant digits.
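A quick way to see the precision loss, shown here with the standard normal because the effect is clear-cut at z = 10. The same reasoning applies to stats.t; the value of z is only an illustration.

```python
from scipy import stats

z = 10.0  # far in the upper tail

naive = 1 - stats.norm.cdf(z)   # CDF rounds to 1.0 in double precision
direct = stats.norm.sf(z)       # upper tail computed directly

print(naive)    # the tail probability has been lost to cancellation
print(direct)   # a tiny but non-zero probability
```

The naive form reports a probability of zero, which is indistinguishable from "impossible" in any downstream comparison or log transform.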

R (base + stats)

Distribution	Lower tail	Upper tail	Two-tailed (symmetric)
Z	pnorm(z)	pnorm(z, lower.tail = FALSE)	2 * pnorm(-abs(z))
T	pt(t, df)	pt(t, df, lower.tail = FALSE)	2 * pt(-abs(t), df)
Chi-square	pchisq(x, df)	pchisq(x, df, lower.tail = FALSE)	(use upper)
F	pf(F, df1, df2)	pf(F, df1, df2, lower.tail = FALSE)	(use upper)

The lower.tail = FALSE flag is R’s equivalent of sf() in scipy. It computes the right-tail probability directly and avoids precision loss for extreme statistics.

JavaScript (jStat)

import jStat from 'jstat';

// Z
const pZRight = 1 - jStat.normal.cdf(z, 0, 1);
const pZTwo   = 2 * (1 - jStat.normal.cdf(Math.abs(z), 0, 1));

// T
const pTTwo   = 2 * (1 - jStat.studentt.cdf(Math.abs(t), df));

// Chi-square
const pChi    = 1 - jStat.chisquare.cdf(x, df);

// F
const pF      = 1 - jStat.centralF.cdf(F, df1, df2);

jStat does not expose a survival function, so subtracting from 1 is the standard idiom. For very small p values, switch to stdlib.js, which has dedicated upper-tail routines.

Spreadsheets

Test	Excel / Google Sheets
Z, two-tailed	=2*(1-NORM.S.DIST(ABS(z), TRUE))
Z, right-tailed	=1-NORM.S.DIST(z, TRUE)
T, two-tailed	=T.DIST.2T(ABS(t), df)
T, right-tailed	=T.DIST.RT(t, df)
Chi-square, upper	=CHISQ.DIST.RT(x, df)
F, upper	=F.DIST.RT(F, df1, df2)

LibreOffice mirrors the Excel names. Older Excel (pre-2010) used legacy names like TDIST, CHIDIST, and FDIST. CHIDIST and FDIST return upper-tail probabilities, TDIST takes an explicit tails argument (1 or 2), and all three are still supported as compatibility functions.

SQL and statistical platforms

Platform	Right-tail T-test p
Stata	display ttail(df, t)
SAS	p = 1 - probt(t, df);
SPSS syntax	COMPUTE p = 1 - CDF.T(t, df).
MATLAB	tcdf(t, df, 'upper')
Julia	ccdf(TDist(df), t) (Distributions.jl)
PostgreSQL (with pl/R)	pt(t, df, lower.tail = FALSE)

Pure SQL has no built-in statistical CDFs. Production setups call out to R (via pl/R), Python (via pl/Python), or a separate stats microservice.
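Where no statistics library is reachable at all, the Z case can still be handled with the error function alone. The sketch below uses only Python's standard library; the function names are hypothetical. T, chi-square, and F need incomplete beta or gamma functions and are not covered here.

```python
import math

def z_upper_tail(z):
    """P(Z >= z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def z_two_tailed(z):
    """Two-tailed p-value for a z-score (symmetric distribution)."""
    return 2.0 * z_upper_tail(abs(z))

print(z_upper_tail(1.645))   # close to 0.05
print(z_two_tailed(-1.96))   # close to 0.05
```

Because erfc is evaluated directly in the tail, this form avoids the 1 - CDF subtraction for large z.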

Z critical values

For when a paper reports a z-score and you need the cutoff on hand.

Alpha	One-tailed z*	Two-tailed z*
0.10	1.282	1.645
0.05	1.645	1.960
0.025	1.960	2.241
0.01	2.326	2.576
0.005	2.576	2.807
0.001	3.090	3.291
5-sigma (particle physics, p ≈ 2.87 × 10⁻⁷)	5.000	N/A
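The numeric-alpha rows of this table can be regenerated with the inverse CDF. A minimal sketch using the standard library's NormalDist; the helper name z_critical is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_critical(alpha, two_tailed=False):
    """Cutoff z* whose tail area (one tail, or both tails combined) is alpha."""
    tail_area = alpha / 2 if two_tailed else alpha
    return std_normal.inv_cdf(1 - tail_area)

for alpha in (0.10, 0.05, 0.025, 0.01, 0.005, 0.001):
    one = round(z_critical(alpha), 3)
    two = round(z_critical(alpha, two_tailed=True), 3)
    print(alpha, one, two)
```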

T critical values (two-tailed)

Each cell is the |t| at which 2 * (1 - pt(|t|, df)) equals the column alpha. For a one-tailed test, each column corresponds to half its stated alpha: the α = 0.10 column gives the one-tailed cutoff at α = 0.05.

df	α = 0.10	α = 0.05	α = 0.01	α = 0.001
1	6.314	12.706	63.657	636.619
5	2.015	2.571	4.032	6.869
10	1.812	2.228	3.169	4.587
15	1.753	2.131	2.947	4.073
20	1.725	2.086	2.845	3.850
30	1.697	2.042	2.750	3.646
60	1.671	2.000	2.660	3.460
120	1.658	1.980	2.617	3.373
∞ (Z)	1.645	1.960	2.576	3.291

As df grows, the T critical values converge on the corresponding Z values. By df = 120, the gap has shrunk to the second decimal place (1.980 versus 1.960 at α = 0.05).
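For a df that falls between rows, the cutoff can be computed rather than interpolated. A sketch assuming scipy is installed; t_critical_two_tailed is a hypothetical helper name.

```python
from scipy import stats

def t_critical_two_tailed(alpha, df):
    """|t| cutoff that leaves alpha/2 in each tail."""
    return stats.t.ppf(1 - alpha / 2, df)

# Convergence toward the Z cutoff as df grows
for df in (10, 30, 120):
    print(df, round(t_critical_two_tailed(0.05, df), 3))
print("Z", round(stats.norm.ppf(0.975), 3))
```

Non-integer df, such as the Welch-Satterthwaite value, can be passed to the same call unchanged.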

Chi-square critical values (upper tail)

df	α = 0.10	α = 0.05	α = 0.01	α = 0.001
1	2.706	3.841	6.635	10.828
2	4.605	5.991	9.210	13.816
5	9.236	11.070	15.086	20.515
10	15.987	18.307	23.209	29.588
20	28.412	31.410	37.566	45.315
30	40.256	43.773	50.892	59.703

A 2-by-2 contingency table has df = 1 and a 3-by-3 has df = 4, following df = (rows - 1) * (columns - 1). A goodness-of-fit test on k categories has df = k - 1, minus any parameters estimated from the data.
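The df rules above are simple enough to encode directly, which helps avoid off-by-one slips before looking up a cutoff. The helper names are hypothetical.

```python
def contingency_df(rows, cols):
    """Degrees of freedom for a chi-square test on an r-by-c table."""
    return (rows - 1) * (cols - 1)

def goodness_of_fit_df(categories, estimated_params=0):
    """Degrees of freedom for goodness-of-fit on k categories."""
    return categories - 1 - estimated_params

print(contingency_df(2, 2))       # 1
print(contingency_df(3, 3))       # 4
print(goodness_of_fit_df(6))      # 5
print(goodness_of_fit_df(6, 1))   # 4, one parameter estimated from the data
```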

F critical values (upper tail, α = 0.05)

ANOVA’s most common cutoff. df1 across, df2 down.

df2 \ df1	1	2	3	5	10	20
5	6.61	5.79	5.41	5.05	4.74	4.56
10	4.96	4.10	3.71	3.33	2.98	2.77
20	4.35	3.49	3.10	2.71	2.35	2.12
30	4.17	3.32	2.92	2.53	2.16	1.93
60	4.00	3.15	2.76	2.37	1.99	1.75
∞	3.84	3.00	2.60	2.21	1.83	1.57

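No single multiplier converts these to α = 0.01 cutoffs: the ratio depends on both df1 and df2, ranging across this table from roughly 1.2 (df1 = 20, df2 = ∞) to roughly 2.5 (df1 = 1, df2 = 5). Compute qf(0.99, df1, df2) in R for the exact value.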
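The R call qf(0.99, df1, df2) has a direct scipy counterpart. A sketch assuming scipy is installed; f_critical is a hypothetical helper name.

```python
from scipy import stats

def f_critical(alpha, df1, df2):
    """Upper-tail F cutoff: the value with area alpha to its right."""
    return stats.f.ppf(1 - alpha, df1, df2)

c05 = f_critical(0.05, 1, 10)
c01 = f_critical(0.01, 1, 10)
print(round(c05, 2), round(c01, 2), round(c01 / c05, 2))
```

Printing the ratio alongside the two cutoffs shows how much the α = 0.01 value differs from the α = 0.05 value for a given pair of degrees of freedom.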

Common pitfalls

Symptom	Cause	Fix
Excel returns #NUM! for T.DIST.2T(-2, 10)	T.DIST.2T requires a non-negative argument	Wrap in ABS(): =T.DIST.2T(ABS(t), df)
Pasted p does not match scipy / R	Mixed one-tailed and two-tailed conventions	Re-check the tail; halve or double as needed
Chi-square test reported as “two-tailed”	Chi-square is intrinsically upper-tail	Compute 1 - CDF, ignore the two-tailed prompt
Welch's T p slightly different from a pooled-T calculator	Welch uses Welch-Satterthwaite df (often non-integer)	Use the non-integer df your stats package reported
Reported p = 0.0000 in a table	Rounded display, real value below the precision floor	Switch to < 0.0001 or report exp(...) from raw
p = 1.000 returned for a tiny statistic	Statistic in the wrong tail for the test you specified	Flip to the opposite one-tailed test, or use two-tailed
df = 0 returns NaN / inf	Distribution undefined at df = 0	Recompute df; one-sample T needs n - 1, not n
Bonferroni-corrected p > 1	Manual p * k exceeded 1	Cap at 1: min(p * k, 1)
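The Welch row is the one most often reproduced by hand. Below is a sketch of the Welch-Satterthwaite degrees of freedom from two sample variances and sample sizes; the function name and the example numbers are illustrative.

```python
def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite degrees of freedom for an unequal-variance T-test."""
    a = var1 / n1   # squared standard error contributed by sample 1
    b = var2 / n2   # squared standard error contributed by sample 2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

print(welch_df(4.0, 10, 4.0, 10))   # equal variances and sizes: n1 + n2 - 2
print(welch_df(1.0, 10, 9.0, 10))   # unequal variances: smaller, non-integer
```

Feeding the non-integer result into the T distribution, rather than rounding it or substituting the pooled n1 + n2 - 2, is what keeps a hand calculation in line with package output.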

Precision and reporting conventions

Convention	Format	Where used
Exact value, two or three decimal places	p = .031	APA 7th edition, most journals
Below display floor	p < 0.0001	When the real value is smaller than reportable precision
Star notation (avoid in modern reports)	*p < 0.05, **p < 0.01, ***p < 0.001	Older social-science tables only
Cohen's d alongside p	t(18) = 2.05, p = 0.055, d = 0.96	Required by most psychology journals
Effect size with confidence interval	r = 0.42, 95% CI [0.18, 0.61], p = 0.003	Preferred over star notation in clinical research

Many journals have moved away from star-only tables in favor of exact values plus effect sizes. APA 7 asks for exact p-values to two or three decimal places, with anything smaller than .001 reported as p < .001.
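A small formatter makes the "exact value, else floor" rule mechanical. This is a sketch of one possible convention, not a journal requirement: the name format_p and the choice to tie the floor to the number of decimals are assumptions, and leading-zero style is left to the venue.

```python
def format_p(p, decimals=3):
    """Format a p-value as an exact value, or as '< floor' when it would round away."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    floor = 10 ** -decimals          # smallest value the display can show
    if p < floor:
        return f"p < {floor:.{decimals}f}"
    return f"p = {p:.{decimals}f}"

print(format_p(0.0312))              # p = 0.031
print(format_p(1.2e-7))              # p < 0.001
print(format_p(3e-5, decimals=4))    # p < 0.0001
```

This also removes the "p = 0.0000" display problem from the pitfalls table, since values below the floor never reach the fixed-decimal branch.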

Related concepts

CDF, SF, PPF: Cumulative distribution function, survival function (1 - CDF, computed directly for tail precision), and percent-point function (the inverse CDF that returns the critical value for a given alpha).
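Bonferroni correction: Divide alpha by the number of tests when running multiple comparisons. In this form the reported p stays unchanged and only the threshold moves. The equivalent alternative is to multiply each p by the number of tests and compare against the original alpha, in which case cap the corrected p-values at 1.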
Holm-Bonferroni and Benjamini-Hochberg: Less conservative multiple-comparison corrections. Standard in genomics, neuroimaging, and any high-throughput testing pipeline.
TOST (two one-sided tests): Equivalence and non-inferiority testing reverses the null hypothesis. Run in Python with statsmodels (statsmodels.stats.weightstats, which is separate from scipy) or in R (equivalence package); a generic p-value calculator does not compute these directly.
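The three multiple-comparison corrections named above can be sketched as adjusted p-values in plain Python. This is an illustrative implementation with hypothetical function names and made-up input values; production pipelines would normally use a tested library routine instead.

```python
def bonferroni(pvals):
    """Multiply each p by the number of tests, capped at 1."""
    k = len(pvals)
    return [min(p * k, 1.0) for p in pvals]

def holm(pvals):
    """Holm step-down: multipliers k, k-1, ..., 1 on ascending p, kept monotone."""
    k = len(pvals)
    order = sorted(range(k), key=lambda i: pvals[i])
    adjusted = [0.0] * k
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min((k - rank) * pvals[i], 1.0)
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

def benjamini_hochberg(pvals):
    """BH step-up: p * k / rank on ascending p, kept monotone from the top."""
    k = len(pvals)
    order = sorted(range(k), key=lambda i: pvals[i])
    adjusted = [0.0] * k
    running_min = 1.0
    for rank in range(k - 1, -1, -1):
        i = order[rank]
        candidate = min(pvals[i] * k / (rank + 1), 1.0)
        running_min = min(running_min, candidate)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.5]   # illustrative p-values
print(bonferroni(raw))
print(holm(raw))
print(benjamini_hochberg(raw))
```

All three return adjusted values in the original input order, so they can be compared against the unadjusted alpha directly.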

P-Value Calculator — Compute the P-Value from a Test Statistic or Score