working through the Wikipedia ANOVA example in Processing using the Papaya library

There’s a worked one-way ANOVA example on Wikipedia and I used it as a template to work out the calculations in Processing. The reason I did this, beyond poops and giggles, is because I was having difficulty getting to grips with the ANOVA example in the Papaya library and I wanted to work through an example with pre-determined results to validate each step of the calculations. It would be great if I could refine the process to handle unbalanced data, to include pre-processing such as transforms, pre-tests for homoscedasticity and normality (although ANOVA is relatively robust towards non-normally distributed data), automated plotting of the data and further post-hoc testing.

/* one-way ANOVA example

This sketch performs a 1-way ANOVA on a CSV file of data copied from the worked 1-way ANOVA example on Wikipedia. The whole process is based upon that worked example.
https://en.wikipedia.org/wiki/F-test#One-way_ANOVA_example

The table of example data needs to be saved into a CSV file called test1.csv and placed in the sketch folder.
You will also need to install the Papaya statistics library created by Adila Faruk.
http://adilapapaya.com/papayastatistics/

AUT School of Applied Science
June 2015
drchrispook@gmail.com

released under GPLv3 licence – https://gnu.org/licenses/quick-guide-gplv3.html
*/

import papaya.*;
Table table;
table = loadTable(“test1.csv”, “header”);
int groups = table.getColumnCount();
println(“groups: ” + groups);
int n = table.getRowCount();
println(“rows: ” + n);
float[] groupMeans = new float[groups];
// Step 1: Calculate the mean within each group
for(int i = 0; i < groups; i++) {
float sum = 0;
for(int e = 0; e < n; e++) {
sum = sum + table.getFloat(e, i);
}
println(“group ” + (i +1) + ” mean = ” + (sum / n));
groupMeans[i] = sum /n;
}
// Step 2: Calculate the overall mean
float overallMean = 0;
for(int i = 0; i < groups; i++) {
overallMean = overallMean + groupMeans[i];
}
overallMean = overallMean / groups;
println(“overall mean = ” + overallMean);
// Step 3: Calculate the “between-group” sum of squared differences:
float Sb = 0;
for(int i = 0; i < groups; i++) {
Sb = Sb + n * sq(groupMeans[i] – overallMean);
}
println(“between-group sum of squared differences (Sb) = ” + Sb);
// between-group degrees of freedom is one less than the number of groups
int fb = groups -1;
println(“degrees of freedom between groups (fb) = ” + fb);

// between-group mean square value
int MSb = int(Sb) / fb;
println(“between-group mean square value (MSb) = ” + MSb);
// Step 4: Calculate the “within-group” sum of squares
float Sw = 0;
for(int i = 0; i < groups; i++) {
for(int e = 0; e < n; e++) {
Sw = Sw + sq(table.getFloat(e, i)-groupMeans[i]);
}
}
println(“within-group sum of squares (Sw) = ” + Sw);

// within-group degrees of freedom
int fw = groups * (n-1);
println(“within-group degrees of freedom (fw) = ” + fw);

// within-group mean square value
float MSw = Sw / fw;
println(“within-group mean square value (MSw) = ” + MSw);
// F-ratio
float F = MSb / MSw;
println(“F-ratio (F) = ” + F);

// area under the F density function from F to infinity
// this is the really important bit where you need to quantify the area under the curve of the F function
// this gives you your P-value estimate
println(“F” + fb + “,” + fw + ” = ” + F + “, P = ” + Probability.fcdfComplemented(F, fb, fw));
//println(Probability.finv(0.05, fb, fw));

exit();

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.