There’s a worked one-way ANOVA example on Wikipedia, and I used it as a template to work through the calculations in Processing. Beyond poops and giggles, I did this because I was having difficulty getting to grips with the ANOVA example in the Papaya library, and I wanted an example with known results so that I could validate each step of the calculations. It would be great to refine the process to handle unbalanced data and to include pre-processing such as transforms, pre-tests for homoscedasticity and normality (although ANOVA is relatively robust to non-normally distributed data), automated plotting of the data and further post-hoc testing.

/* one-way ANOVA example

This sketch performs a 1-way ANOVA on a CSV file of data copied from the worked 1-way ANOVA example on Wikipedia. The whole process is based upon that worked example.

https://en.wikipedia.org/wiki/F-test#One-way_ANOVA_example

The table of example data needs to be saved into a CSV file called test1.csv and placed in the sketch folder.

You will also need to install the Papaya statistics library created by Adila Faruk.

http://adilapapaya.com/papayastatistics/

AUT School of Applied Science

June 2015

drchrispook@gmail.com

released under GPLv3 licence – https://gnu.org/licenses/quick-guide-gplv3.html

*/

import papaya.*;

Table table;

table = loadTable("test1.csv", "header");

// each column of the CSV is one group

int groups = table.getColumnCount();

println("groups: " + groups);

// each row holds one observation per group

int n = table.getRowCount();

println("rows: " + n);

float[] groupMeans = new float[groups];

// Step 1: Calculate the mean within each group

for (int i = 0; i < groups; i++) {
  float sum = 0;
  for (int e = 0; e < n; e++) {
    sum = sum + table.getFloat(e, i);
  }
  groupMeans[i] = sum / n;
  println("group " + (i + 1) + " mean = " + groupMeans[i]);
}

// Step 2: Calculate the overall mean

float overallMean = 0;

for (int i = 0; i < groups; i++) {
  overallMean = overallMean + groupMeans[i];
}

overallMean = overallMean / groups;

println("overall mean = " + overallMean);

// Step 3: Calculate the "between-group" sum of squared differences

float Sb = 0;

for (int i = 0; i < groups; i++) {
  Sb = Sb + n * sq(groupMeans[i] - overallMean);
}

println("between-group sum of squared differences (Sb) = " + Sb);

// between-group degrees of freedom is one less than the number of groups

int fb = groups - 1;

println("degrees of freedom between groups (fb) = " + fb);

// between-group mean square value (kept as a float so the division is not truncated)

float MSb = Sb / fb;

println("between-group mean square value (MSb) = " + MSb);

// Step 4: Calculate the "within-group" sum of squares

float Sw = 0;

for (int i = 0; i < groups; i++) {
  for (int e = 0; e < n; e++) {
    Sw = Sw + sq(table.getFloat(e, i) - groupMeans[i]);
  }
}

println("within-group sum of squares (Sw) = " + Sw);

// within-group degrees of freedom

int fw = groups * (n - 1);

println("within-group degrees of freedom (fw) = " + fw);

// within-group mean square value

float MSw = Sw / fw;

println("within-group mean square value (MSw) = " + MSw);

// F-ratio

float F = MSb / MSw;

println("F-ratio (F) = " + F);

// area under the F density function from F to infinity

// this is the really important bit where you need to quantify the area under the curve of the F function

// this gives you your P-value estimate

println("F" + fb + "," + fw + " = " + F + ", P = " + Probability.fcdfComplemented(F, fb, fw));

//println(Probability.finv(0.05, fb, fw));

exit();
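If you want to check the intermediate numbers without Papaya or the CSV file, the same steps can be sketched in plain Java. This is only a rough cross-check, not part of the sketch above: the class name `OneWayAnovaCheck` and the method `anova` are made up for illustration, and the data values are typed in as I recall them from the Wikipedia example, so compare them against your own test1.csv before trusting the output. It stops at the F-ratio, because the P-value needs an F distribution function such as the one Papaya provides. Unlike the sketch, it takes the overall mean over all observations and counts each group's size separately, so in principle it should also cope with groups of unequal size.

```java
public class OneWayAnovaCheck {

    /** Returns {Sb, fb, MSb, Sw, fw, MSw, F} for the given groups (hypothetical helper). */
    static double[] anova(double[][] groups) {
        int k = groups.length;          // number of groups
        int total = 0;                  // total number of observations
        double grandSum = 0;
        double[] means = new double[k];

        // Step 1: mean within each group
        for (int i = 0; i < k; i++) {
            double sum = 0;
            for (double v : groups[i]) {
                sum += v;
            }
            means[i] = sum / groups[i].length;
            grandSum += sum;
            total += groups[i].length;
        }

        // Step 2: overall mean, taken over every observation
        double grandMean = grandSum / total;

        // Steps 3 and 4: between-group and within-group sums of squares
        double sb = 0;
        double sw = 0;
        for (int i = 0; i < k; i++) {
            double d = means[i] - grandMean;
            sb += groups[i].length * d * d;
            for (double v : groups[i]) {
                sw += (v - means[i]) * (v - means[i]);
            }
        }

        int fb = k - 1;       // between-group degrees of freedom
        int fw = total - k;   // within-group degrees of freedom
        double msb = sb / fb; // floating-point division, no truncation
        double msw = sw / fw;
        return new double[] {sb, fb, msb, sw, fw, msw, msb / msw};
    }

    public static void main(String[] args) {
        // Assumed values, recalled from the Wikipedia example: verify before use.
        double[][] data = {
            {6, 8, 4, 5, 3, 4},
            {8, 12, 9, 11, 6, 8},
            {13, 9, 11, 8, 7, 12}
        };
        double[] r = anova(data);
        System.out.printf("Sb = %.1f, fb = %.0f, MSb = %.1f%n", r[0], r[1], r[2]);
        System.out.printf("Sw = %.1f, fw = %.0f, MSw = %.3f%n", r[3], r[4], r[5]);
        System.out.printf("F = %.3f%n", r[6]);
    }
}
```

With those values the between-group sum of squares comes out at 84 and the within-group sum of squares at 68, giving F of roughly 9.26 on 2 and 15 degrees of freedom, which is what the Processing sketch should print if the CSV holds the same table.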