of Peter Nagy

Biologists, medical doctors and students in these branches of science typically grapple with statistics, but many of them eventually reach a basic understanding of “significance” and “p-values”. The bad news is that neither of these concepts sufficiently characterizes the reliability of our statistical decisions. This tutorial is intended for practicing biologists and medical doctors who are interested in the limitations of simply reporting significance or p-values. It is not a comprehensive introduction to hypothesis testing, and a basic understanding of its principles is assumed. Practical, how-to sections are written on a gray background. At the end of the tutorial, a brief summary without any theory is provided.

Download the PDF file here:

It is often overlooked that Excel can
perform most statistical calculations ordinary biologists need. I have
created an Excel macro-enabled workbook which will do all the
statistical calculations you probably need if you are a biologist.

The capabilities of the program include:

- descriptive statistics (mean, SD, SEM, median, mode, skewness, kurtosis)
- normality tests
- calculations with the normal, binomial and Poisson distributions
- z-test, Student's t-tests, Welch test, F test
- ANOVA (1-, 2- and 3-way, repeated-measures ANOVA with one factor)
- Levene's test
- non-parametric tests: Wilcoxon test, sign test, median test, Mann-Whitney test
- chi-squared test of independence
- Kolmogorov-Smirnov test
- tests for population proportions
- Kaplan-Meier logrank test
- linear and polynomial regression with p value estimations, linear regression on ranks (Spearman)
- Deming linear regression (when observations of both the X and Y variables are associated with error)
- general purpose fitting
- false discovery rate calculation including the Benjamini-Hochberg and the Storey methods

Download the Excel workbook here:

Peter_ManyStatProbes_with_Excel.xlsm

The workbook

- requires Excel 2010 or above, and the Solver Add-in.
- automatically upgrades itself.

The Matlab program anovanFromSumStat can
perform one-way, two-way, ... n-way ANOVA on the main and interaction
effects when only summary statistics (mean, SD and size of each group)
are available.

The program runs in four different modes depending on the first argument:

- anovaArray=anovanFromSumStat('gen'): generates the array containing the mean, SD and size of each group.
- anovaArray=anovanFromSumStat('regen',anovaArray): modifies an anovaArray created with the 'gen' option.
- varargout=anovanFromSumStat('calc',anovaArray): performs ANOVA on the array created in the previous steps.
- anovanFromSumStat('ver'): displays the version of the program.

Help is available by typing 'help anovanFromSumStat' at the Matlab command prompt.

Download the Matlab P-file here:

anovanFromSumStat.p

The program is also available on MatlabCentral:

https://www.mathworks.com/matlabcentral/fileexchange/41036-n-way-anova-from-summary-statistics
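
The idea of running ANOVA from summary statistics alone can be illustrated for the simplest, one-way case. The following is a minimal Python sketch, not the code of anovanFromSumStat: the function name and argument layout are assumptions made for illustration, and it returns only the F statistic and its degrees of freedom. Obtaining a p value would additionally require the cumulative distribution function of the F distribution.

```python
def one_way_anova_from_summary(means, sds, sizes):
    """One-way ANOVA from group means, sample SDs and group sizes.

    Hypothetical helper for illustration only.
    Returns (F, df_between, df_within).
    """
    if not (len(means) == len(sds) == len(sizes)) or len(means) < 2:
        raise ValueError("need at least two groups with mean, SD and size")

    k = len(means)
    n_total = sum(sizes)

    # Grand mean is the size-weighted average of the group means.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

    # Within-group sum of squares, recovered from the sample SDs.
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Three groups of five observations each, all with SD = 2.
f_stat, df_b, df_w = one_way_anova_from_summary(
    means=[10.0, 12.0, 14.0], sds=[2.0, 2.0, 2.0], sizes=[5, 5, 5]
)
print(f_stat, df_b, df_w)
```

The sketch works because the within-group sum of squares equals the sum of (n - 1)·SD² over the groups, so the raw observations are not needed.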

The Matlab program fdrEstimation determines the false discovery rate in a single comparison (t-test).

A more detailed description of the program and the principles it is based on is available in this tutorial:

A Practical Guide to Significance Beyond p-values

Download the Matlab M-file here:

fdrEstimation.m
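
One common way to reason about the false discovery rate of a single test is to combine the significance level, the power of the test and the prior probability that a real effect exists. The Python sketch below shows that calculation for results declared significant at level alpha. It is an illustration under these assumptions and is not necessarily the calculation fdrEstimation.m performs; the function and parameter names are hypothetical.

```python
def false_discovery_rate(alpha, power, prior_real):
    """Expected fraction of 'significant' results that are false positives.

    alpha      : significance level of the test
    power      : probability of detecting a real effect
    prior_real : assumed prior probability that a real effect exists
    (Hypothetical helper for illustration only.)
    """
    for name, value in (("alpha", alpha), ("power", power), ("prior_real", prior_real)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    false_positives = alpha * (1.0 - prior_real)  # null true, test significant
    true_positives = power * prior_real           # effect real, test significant
    if false_positives + true_positives == 0.0:
        raise ValueError("no significant results are possible with these inputs")
    return false_positives / (false_positives + true_positives)


# alpha = 0.05, power = 0.8, real effects assumed in 10% of experiments
print(false_discovery_rate(0.05, 0.8, 0.1))
```

With these example inputs, roughly a third of the significant results are expected to be false discoveries even though alpha is 0.05, which is why the prior probability matters as much as the p value.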

If an investigation requires multiple statistical tests, the probability
of making false discoveries can be frighteningly high. This Matlab
program corrects for multiple testing according to the Benjamini-Hochberg
and the Storey methods. A more detailed description of the program and the
principles it is based on is available in this tutorial:

A Practical Guide to Significance Beyond p-values

Download the Matlab M-file here:

correctFDR.m
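
The Benjamini-Hochberg procedure mentioned above can be sketched in a few lines of Python. This is a minimal illustration of the standard step-up adjustment, not the code of correctFDR.m, and the function name is hypothetical. The Storey method, which additionally estimates the proportion of true null hypotheses, is not covered by this sketch.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order.

    Hypothetical helper for illustration only.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
# A test is declared a discovery if its adjusted p value is at or
# below the chosen false discovery rate, here 5%.
discoveries = [p <= 0.05 for p in adjusted]
print(adjusted, discoveries)
```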

The power of a statistical test is the probability that the test will lead
to the rejection of the null hypothesis given that it is indeed false. The
power can be calculated if the effect size is known. A more detailed
description of the program and the principles it is based on is available
in this tutorial:

A Practical Guide to Significance Beyond p-values

Download the Matlab M-file here:

determinePowerTtest.m
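
How power follows from the effect size can be illustrated with a normal approximation for a two-sided, two-sample t-test with equal group sizes. The Python sketch below is an approximation under these assumptions and is not the code of determinePowerTtest.m: an exact calculation would use the noncentral t distribution, and the approximation slightly overestimates power for small samples. The function name and the use of a standardized effect size (difference of means divided by the common SD) are choices made for illustration.

```python
from statistics import NormalDist


def approx_power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample t-test.

    effect_size : standardized difference of means (difference / SD)
    n_per_group : number of observations in each of the two groups
    (Hypothetical helper; normal approximation, for illustration only.)
    """
    if n_per_group < 2:
        raise ValueError("n_per_group must be at least 2")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)

    # Expected value of the test statistic when the effect is real.
    shift = abs(effect_size) * (n_per_group / 2.0) ** 0.5

    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


print(approx_power_two_sample(effect_size=0.5, n_per_group=64))
```

When the effect size is zero, the function returns alpha, as expected for a test in which the null hypothesis is true.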

The size of the effect expected in an investigation can often be
estimated. For this effect to be detectable in a statistical test, a
certain minimum sample size is required, which this Matlab program
determines. A more detailed description of the program and the principles
it is based on is available in this tutorial:

A Practical Guide to Significance Beyond p-values

Download the Matlab M-file here:

sampleSizeForTtest.m
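
The relationship between expected effect size and required sample size can be sketched with the usual normal-approximation formula for a two-sided, two-sample comparison with equal group sizes. The Python code below is an illustration under these assumptions and is not the code of sampleSizeForTtest.m. The function name is hypothetical, and because the normal distribution is used in place of the t distribution, the result tends to be slightly smaller than that of an exact t-based calculation.

```python
from math import ceil
from statistics import NormalDist


def approx_sample_size_two_sample(effect_size, alpha=0.05, power=0.8):
    """Approximate number of observations needed in each group.

    effect_size : standardized difference of means (difference / SD)
    alpha       : two-sided significance level
    power       : desired probability of detecting the effect
    (Hypothetical helper; normal approximation, for illustration only.)
    """
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero")
    if not 0.0 < alpha < 1.0 or not 0.0 < power < 1.0:
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)

    # n per group = 2 * ((z_alpha + z_power) / effect_size)^2, rounded up.
    n = 2.0 * ((z_alpha + z_power) / abs(effect_size)) ** 2
    return ceil(n)


print(approx_sample_size_two_sample(effect_size=0.5))
```

Halving the expected effect size roughly quadruples the required sample size, because the effect size enters the formula squared.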