2-way Contingency Table Analysis

Revised: 07/23/2013 -- added NNM -- the Number Needed to Mis-diagnose (thanks to Farrokh Habibzadeh)

This page computes various statistics from a 2-by-2 table. It calculates the Pearson uncorrected chi-square, the Yates-corrected chi-square, the Mantel-Haenszel chi-square, the Fisher Exact Test, and other indices relevant to various special kinds of 2-by-2 tables:

  1. analysis of risk factors for unfavorable outcomes (odds ratio, relative risk, difference in proportions, absolute and relative reduction in risk, number needed to treat)
  2. analysis of the effectiveness of a diagnostic criterion for some condition (sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios, diagnostic and error odds ratios)
  3. measures of inter-rater reliability (% correct or consistent, mis-classification rate, kappa, Forbes' NMI)
  4. other measures of association (contingency coefficient, Cramer's phi coefficient, Yule's Q)

Many of these concepts are explained in detail in an online Evidence-based Medicine Glossary. For more information about a particular index, click on the <more info> link for that index.

Confidence intervals for the estimated parameters are computed by a general method (based on "constant chi-square boundaries") given in Section 5.6 of Statistical Methods for Rates and Proportions (2nd Ed.) by Joseph L. Fleiss (John Wiley & Sons, New York, 1981). This method is also described in Section 15.6 of Numerical Recipes in C (2nd Ed.) by William H. Press et al. (Cambridge University Press, Cambridge UK, 1992).


Enter numbers into the four cells below. Make sure that the row and column totals add up correctly. Then click the Compute button.

Warning: Do not enter cell counts with a leading zero! That is, if a cell count is 34, enter it as 34, not as 034. Some browsers misinterpret numbers entered with leading zeros and produce wrong results without any warning message. For more information about this, and for other things to be aware of before using this page for the first time, read the JavaStat user interface guidelines.
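
A likely reason for the leading-zero problem, stated here as an assumption rather than something this page confirms, is that older JavaScript engines read a numeral with a leading zero as octal (base 8). The Python sketch below only illustrates the arithmetic of that misreading. Both function names are hypothetical helpers, not part of this page.

```python
def as_octal(text):
    """Value obtained if the digits are (mis)read as base 8."""
    return int(text, 8)


def normalize_count(text):
    """Hypothetical helper: strip leading zeros, then parse as base 10."""
    stripped = text.strip().lstrip("0") or "0"
    return int(stripped, 10)


print(as_octal("034"))         # 28, the silently wrong value
print(normalize_count("034"))  # 34, the intended count
```

Under this assumption, entering 034 would be analyzed as a count of 28, which is why the wrong result appears without any error.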

Observed Contingency Table

                                           Outcome Occurred   Outcome did not Occur   Totals
Risk Factor Present or Dx Test Positive           a                     b               r1
Risk Factor Absent or Dx Test Negative            c                     d               r2
Totals                                            c1                    c2              t
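
The cell labels and totals in the table above can be written out as a small Python sketch. The function name `table_totals` and the example counts are illustrative assumptions. The labels a, b, c, d, r1, r2, c1, c2 and t are the ones used on this page.

```python
def table_totals(a, b, c, d):
    """Return row, column and grand totals for the 2-by-2 table [[a, b], [c, d]]."""
    for name, value in zip("abcd", (a, b, c, d)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"cell {name} must be a non-negative integer")
    r1, r2 = a + b, c + d   # row totals
    c1, c2 = a + c, b + d   # column totals
    t = r1 + r2             # grand total
    assert t == c1 + c2     # rows and columns must add up to the same total
    return {"r1": r1, "r2": r2, "c1": c1, "c2": c2, "t": t}


print(table_totals(20, 80, 10, 90))
# {'r1': 100, 'r2': 100, 'c1': 30, 'c2': 170, 't': 200}
```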

Confidence Level: %



Chi-Square Tests

Type of Test           Chi Square   d.f.   p-value
Pearson Uncorrected
Yates Corrected
Mantel-Haenszel
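
For readers who want to check the three chi-square values by hand, here is a minimal Python sketch using the common textbook formulas for a 2-by-2 table (each test has 1 d.f.). It is an assumption that this page uses exactly these forms, in particular the (t - 1) factor for Mantel-Haenszel. The function names are illustrative.

```python
from math import erfc, sqrt


def chi_square_tests(a, b, c, d):
    """Pearson, Yates-corrected and Mantel-Haenszel chi-square for a 2-by-2 table."""
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    t = r1 + r2
    denom = r1 * r2 * c1 * c2
    if denom == 0:
        raise ValueError("every row and column total must be positive")
    diff = abs(a * d - b * c)
    return {
        "pearson": t * diff ** 2 / denom,
        # Yates: shrink |ad - bc| by t/2, but never below zero
        "yates": t * max(0.0, diff - t / 2) ** 2 / denom,
        # Mantel-Haenszel: same as Pearson with t replaced by t - 1
        "mantel_haenszel": (t - 1) * diff ** 2 / denom,
    }


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))


for name, value in chi_square_tests(20, 80, 10, 90).items():
    print(name, round(value, 4), round(p_value_1df(value), 4))
```

The p-value uses the identity that a chi-square variate with 1 d.f. is the square of a standard normal variate, so no statistics library is needed.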

Fisher Exact Test

Type of comparison (Alternative Hypothesis)          p-value

Two-tailed (to test if the Odds Ratio is significantly different from 1):
If you don't know which Fisher Exact p-value to use, use this one.
This is the p-value produced by SAS, SPSS, R, and other software.

Left-tailed (to test if the Odds Ratio is significantly less than 1):

Right-tailed (to test if the Odds Ratio is significantly greater than 1):

Two-tailed p-value calculated as described in Rosner's book:
(2 times whichever is smallest: left-tail, right-tail, or 0.5)
It tends to agree closely with Yates Chi-Square p-value.

Probability of getting exactly the observed table:
(This is not really a p-value; don't use this as a significance test.)

Verification of computational accuracy:
(This number should be very close to 1.0; the closer, the better.)
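
The Fisher Exact quantities listed above can be sketched in Python from the hypergeometric distribution of cell a with all margins fixed. Two points are assumptions, not confirmed by this page: the two-tailed value sums the probabilities of all tables no more probable than the observed one (the convention R documents for `fisher.test`), and the verification number is the sum of the probabilities of all possible tables. The function name is illustrative.

```python
from math import comb


def fisher_exact(a, b, c, d):
    """Fisher Exact p-values for the 2-by-2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    t = r1 + r2
    lo, hi = max(0, c1 - r2), min(r1, c1)   # possible values of cell a
    denom = comb(t, c1)
    prob = {x: comb(r1, x) * comb(r2, c1 - x) / denom for x in range(lo, hi + 1)}
    p_obs = prob[a]
    left = sum(p for x, p in prob.items() if x <= a)    # odds ratio < 1
    right = sum(p for x, p in prob.items() if x >= a)   # odds ratio > 1
    # small relative tolerance so equally probable tables are not lost to rounding
    two = sum(p for p in prob.values() if p <= p_obs * (1 + 1e-7))
    return {
        "two_tailed": min(1.0, two),
        "left_tailed": min(1.0, left),
        "right_tailed": min(1.0, right),
        "rosner_two_tailed": 2 * min(left, right, 0.5),
        "exact_table_probability": p_obs,
        "verification_sum": sum(prob.values()),
    }


for name, value in fisher_exact(3, 1, 1, 3).items():
    print(name, round(value, 6))
```

For the table 3, 1, 1, 3 the possible values of a are 0 through 4 with probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the right-tailed value is 17/70 and the two-tailed value is 34/70.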


Quantities Derived from the 2-by-2 Contingency Table
Value
Odds Ratio (OR) = (a/b)/(c/d);
Relative Risk (RR) = (a/r1)/(c/r2);
Kappa
Overall Fraction Correct = (a+d)/t ; (often referred to simply as "Accuracy")
Mis-classification Rate = 1 - Overall Fraction Correct;
Sensitivity = a/c1; (use exact Binomial confidence intervals instead of these)
Specificity = d/c2; (use exact Binomial confidence intervals instead of these)
Positive Predictive Value (PPV) = a/r1; (use exact Binomial confidence intervals instead of these)
Negative Predictive Value (NPV) = d/r2; (use exact Binomial confidence intervals instead of these)
Difference in Proportions (DP) = a/r1 - c/r2;
Number Needed to Treat (NNT) = 1 / absolute value of DP; which = 1 / absolute value of ARR;
Absolute Risk Reduction (ARR) = c/r2 - a/r1; which = - DP
Relative Risk Reduction (RRR) = ARR/(c/r2); <more info>
Positive Likelihood Ratio (+LR) = Sensitivity / (1 - Specificity);
Negative Likelihood Ratio (-LR) = (1 - Sensitivity) / Specificity;
Diagnostic Odds Ratio = (Sensitivity/(1-Sensitivity))/((1-Specificity)/Specificity);
Error Odds Ratio = (Sensitivity/(1-Sensitivity))/(Specificity/(1-Specificity));
Youden's J = Sensitivity + Specificity - 1;
Number Needed to Diagnose (NND) = 1 / (Sensitivity - (1 - Specificity) ) = 1 / (Youden's J); <more info>
Number Needed to Mis-diagnose (NNM) = 1 / ( 1 - Accuracy ); <more info>
Forbes' NMI Index; <more info>
Contingency Coefficient;
Adjusted Contingency Coefficient;
Tetrachoric Correlation Coefficient (approximation) = Cos( Pi / (1 + Sqrt( OR ) ) );
Phi Coefficient (= Cramer's Phi, and = Cohen's w Index, for 2x2 table);
Yule's Q = (a*d-b*c)/(a*d+b*c) = (OR - 1) / (OR + 1); <more info>
Equitable Threat Score = (a-e)/(a+b+c-e), where e = r1*c1/t; <more info>
Entropy H(r) = - ( (r1/t)log2(r1/t) + (r2/t)log2(r2/t) )
Entropy H(c) = - ( (c1/t)log2(c1/t) + (c2/t)log2(c2/t) )
Entropy H(r,c) = - ( (a/t)log2(a/t) + (b/t)log2(b/t) + (c/t)log2(c/t) + (d/t)log2(d/t) )
Information shared by descriptors r and c: B = H(r) + H(c) - H(r,c)
A = H(r,c) - H(r)
C = H(r,c) - H(c)
Similarity of descriptors r and c: S(r,c) = B / (A + B + C)
Distance between r and c: D(r,c) = (A + C) / (A + B + C)
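
Many of the formulas in the list above translate directly into code. The Python sketch below covers the quantities whose formulas are spelled out in the list, plus kappa, for which the usual Cohen's kappa definition is assumed because the page gives no formula. Function names are illustrative, and zero cells that would cause division by zero are not handled.

```python
from math import log2


def derived_quantities(a, b, c, d):
    """Selected indices from the 2-by-2 table [[a, b], [c, d]]."""
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    t = r1 + r2
    sens, spec = a / c1, d / c2
    accuracy = (a + d) / t
    dp = a / r1 - c / r2
    arr = -dp
    odds_ratio = (a / b) / (c / d)
    chance = (r1 * c1 + r2 * c2) / t ** 2   # assumed: chance agreement for Cohen's kappa
    return {
        "OR": odds_ratio, "RR": (a / r1) / (c / r2),
        "kappa": (accuracy - chance) / (1 - chance),
        "accuracy": accuracy, "sensitivity": sens, "specificity": spec,
        "PPV": a / r1, "NPV": d / r2,
        "DP": dp, "ARR": arr, "NNT": 1 / abs(dp), "RRR": arr / (c / r2),
        "+LR": sens / (1 - spec), "-LR": (1 - sens) / spec,
        "DOR": (sens / (1 - sens)) / ((1 - spec) / spec),
        "Youden J": sens + spec - 1, "NND": 1 / (sens + spec - 1),
        "NNM": 1 / (1 - accuracy),
        "Yule Q": (odds_ratio - 1) / (odds_ratio + 1),
    }


def entropy(counts):
    """Shannon entropy in bits; empty cells contribute nothing (0 * log 0 = 0)."""
    total = sum(counts)
    return -sum((n / total) * log2(n / total) for n in counts if n > 0)


def information_measures(a, b, c, d):
    """Entropy-based similarity S(r,c) and distance D(r,c) of the two descriptors."""
    h_r, h_c = entropy([a + b, c + d]), entropy([a + c, b + d])
    h_rc = entropy([a, b, c, d])
    B = h_r + h_c - h_rc            # information shared by r and c
    A, C = h_rc - h_r, h_rc - h_c
    return {"H(r)": h_r, "H(c)": h_c, "H(r,c)": h_rc,
            "S": B / (A + B + C), "D": (A + C) / (A + B + C)}


print(derived_quantities(20, 80, 10, 90)["OR"])
print(information_measures(50, 0, 0, 50))
```

Since A + B + C equals H(r,c), S and D always sum to 1. A table with every count in a single cell has H(r,c) = 0 and leaves both undefined.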

If you don't see your favorite "quantity" in this list,
drop me a line and let me know how that quantity is calculated from the four cell counts,
and I'll add it to the collection!

Or you can calculate the limits for any derived quantity yourself!  Here's how...

This is the lower limiting table...

And this is the upper limiting table...

If you use these numbers, instead of your observed numbers, in the formula for any derived quantity, you'll get the lower and upper confidence limits for that quantity.

(The row and column sums for these tables are the same as for your observed table.)
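
The do-it-yourself procedure described above can be sketched in Python as follows. The limiting tables in the example are made-up numbers that merely share the margins of the observed table. They are not output from this page, and the helper names are hypothetical. The two results are sorted because a quantity that moves in the opposite direction to cell a (such as ARR) would otherwise come out with its limits swapped.

```python
def limits_from_tables(formula, lower_table, upper_table):
    """Apply a derived-quantity formula to both limiting tables; return (low, high)."""
    values = (formula(*lower_table), formula(*upper_table))
    return min(values), max(values)


def same_margins(table_1, table_2):
    """Check that two 2-by-2 tables share their row and column sums."""
    def margins(a, b, c, d):
        return (a + b, c + d, a + c, b + d)
    return margins(*table_1) == margins(*table_2)


def odds_ratio(a, b, c, d):
    return (a / b) / (c / d)


def absolute_risk_reduction(a, b, c, d):
    return c / (c + d) - a / (a + b)


observed = (20, 80, 10, 90)
lower = (14, 86, 16, 84)   # hypothetical lower limiting table
upper = (26, 74, 4, 96)    # hypothetical upper limiting table
assert same_margins(observed, lower) and same_margins(observed, upper)

print(limits_from_tables(odds_ratio, lower, upper))
print(limits_from_tables(absolute_risk_reduction, lower, upper))
```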



Reference: Bernard Rosner, Fundamentals of Biostatistics, 6th Ed., 2006


Send e-mail to John C. Pezzullo at