# Working through the Wikipedia ANOVA example in Processing using the Papaya library

Wikipedia has a worked one-way ANOVA example, and I used it as a template to work out the calculations in Processing. The reason I did this, beyond poops and giggles, is that I was having difficulty getting to grips with the ANOVA example in the Papaya library, and I wanted to work through an example with known results so that I could validate each step of the calculations. It would be great to refine the process to handle unbalanced data and to add pre-processing such as transforms, pre-tests for homoscedasticity and normality (although ANOVA is relatively robust to non-normally distributed data), automated plotting of the data and further post-hoc testing.

/* one-way ANOVA example

This sketch performs a 1-way ANOVA on a CSV file of data copied from the worked 1-way ANOVA example on Wikipedia. The whole process is based upon that worked example.
https://en.wikipedia.org/wiki/F-test#One-way_ANOVA_example

The table of example data needs to be saved into a CSV file called test1.csv and placed in the sketch folder.
You will also need to install the Papaya statistics library created by Adila Faruk.

AUT School of Applied Science
June 2015
drchrispook@gmail.com

released under GPLv3 licence – https://gnu.org/licenses/quick-guide-gplv3.html
*/

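import papaya.*;

// Load the data from the sketch folder, one column per group.
// This assumes test1.csv holds only the numeric values; if your file has
// a header row, use loadTable("test1.csv", "header") instead.
Table table = loadTable("test1.csv");
int groups = table.getColumnCount();
println("groups: " + groups);
int n = table.getRowCount();
println("rows: " + n);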
float[] groupMeans = new float[groups];
// Step 1: Calculate the mean within each group
for (int i = 0; i < groups; i++) {
  float sum = 0;
  for (int e = 0; e < n; e++) {
    sum = sum + table.getFloat(e, i);
  }
  println("group " + (i + 1) + " mean = " + (sum / n));
  groupMeans[i] = sum / n;
}
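// Step 2: Calculate the overall mean
// Averaging the group means only gives the overall mean when every group
// has the same number of observations (balanced data), as it does here.
float overallMean = 0;
for (int i = 0; i < groups; i++) {
  overallMean = overallMean + groupMeans[i];
}
overallMean = overallMean / groups;
println("overall mean = " + overallMean);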
// Step 3: Calculate the "between-group" sum of squared differences
float Sb = 0;
for (int i = 0; i < groups; i++) {
  Sb = Sb + n * sq(groupMeans[i] - overallMean);
}
println("between-group sum of squared differences (Sb) = " + Sb);
// between-group degrees of freedom is one less than the number of groups
int fb = groups - 1;
println("degrees of freedom between groups (fb) = " + fb);

// between-group mean square value
// kept as a float: casting Sb to int and using integer division would truncate the result
float MSb = Sb / fb;
println("between-group mean square value (MSb) = " + MSb);
// Step 4: Calculate the "within-group" sum of squares
float Sw = 0;
for (int i = 0; i < groups; i++) {
  for (int e = 0; e < n; e++) {
    Sw = Sw + sq(table.getFloat(e, i) - groupMeans[i]);
  }
}
println("within-group sum of squares (Sw) = " + Sw);

// within-group degrees of freedom
int fw = groups * (n - 1);
println("within-group degrees of freedom (fw) = " + fw);

// within-group mean square value
float MSw = Sw / fw;
println("within-group mean square value (MSw) = " + MSw);
// F-ratio
float F = MSb / MSw;
println("F-ratio (F) = " + F);

// P-value estimate: the area under the F density function from F to infinity,
// i.e. the upper tail of the F distribution with (fb, fw) degrees of freedom.
// This is the really important bit, because that tail area is the P-value.
println("F" + fb + "," + fw + " = " + F + ", P = " + Probability.fcdfComplemented(F, fb, fw));
//println(Probability.finv(0.05, fb, fw));

exit();
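
As a cross-check that doesn't depend on Processing or Papaya, here is a minimal plain-Java sketch of the same steps. The class and method names (OneWayAnova, betweenSS, withinSS, fRatio) are made up for illustration and are not part of any library. The hard-coded data are assumed to be the three groups from the Wikipedia example, so check them against your own test1.csv. Unlike the sketch above, it uses each group's own size rather than a single n, so it should also cope with unbalanced data. It stops at the F-ratio and doesn't attempt a P-value.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Papaya or Processing. */
class OneWayAnova {

    /** Mean of one group. */
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    /** Mean of all observations pooled together (weights groups by their size). */
    static double grandMean(double[][] groups) {
        double sum = 0;
        int count = 0;
        for (double[] g : groups) {
            for (double x : g) {
                sum += x;
                count++;
            }
        }
        return sum / count;
    }

    /** Between-group sum of squares: sum of n_i * (groupMean_i - grandMean)^2. */
    static double betweenSS(double[][] groups) {
        double grand = grandMean(groups);
        double ss = 0;
        for (double[] g : groups) {
            double d = mean(g) - grand;
            ss += g.length * d * d;
        }
        return ss;
    }

    /** Within-group sum of squares: squared deviations from each group's own mean. */
    static double withinSS(double[][] groups) {
        double ss = 0;
        for (double[] g : groups) {
            double m = mean(g);
            for (double x : g) {
                ss += (x - m) * (x - m);
            }
        }
        return ss;
    }

    /** F-ratio = (Sb / (k - 1)) / (Sw / (N - k)) for k groups and N observations. */
    static double fRatio(double[][] groups) {
        int k = groups.length;
        int total = 0;
        for (double[] g : groups) {
            total += g.length;
        }
        double msb = betweenSS(groups) / (k - 1);
        double msw = withinSS(groups) / (total - k);
        return msb / msw;
    }

    public static void main(String[] args) {
        // Assumed to match the three groups in the Wikipedia worked example.
        double[][] data = {
            {6, 8, 4, 5, 3, 4},
            {8, 12, 9, 11, 6, 8},
            {13, 9, 11, 8, 7, 12}
        };
        System.out.println("Sb = " + betweenSS(data));
        System.out.println("Sw = " + withinSS(data));
        System.out.println("F  = " + fRatio(data));
    }
}
```

With those data the group means are 5, 9 and 10, so this should give Sb = 84, Sw = 68 and an F-ratio of about 9.26 on 2 and 15 degrees of freedom, which are the values to compare against the Processing output.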