Results

ANOVA

The output below (ANOVA - exam) is the main ANOVA summary table (note that I selected the Welch test, which is more robust to possible unequal variances). As a matter of routine, we should look at the robust (Welch) F. Assuming we're using a 0.05 criterion for significance, the observed significance value (p < .001) is less than 0.05, so we can say that there was a significant effect of teaching style on exam marks, F(2, 17.34) = 32.24. This effect was fairly large, ω² = 0.572 [0.290, 0.725]. At this stage we do not know exactly what the effect of the teaching style was (we don't know which groups differed). However, I specified contrasts to test the specific hypotheses in the question...

ANOVA - exam
                                                                                               95% CI for ω²
Homogeneity Correction  Cases      Sum of Squares  df      Mean Square  F       p       ω²     Lower  Upper
Welch                   group      1205.067        2.000   602.533      32.235  < .001  0.572  0.290  0.725
                        Residuals  774.400         17.336  44.670
Note. Type III Sum of Squares
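
As a rough check on the effect size, the sketch below recomputes ω² from the sums of squares in the table. It assumes the common formula ω² = (SS_model - df_model × MS_residual) / (SS_total + MS_residual) with the pooled residual mean square (774.400 divided by N - k = 30 - 3 = 27) rather than the 44.670 printed in the Welch-corrected row. That choice is my assumption, not something the output states; it happens to reproduce the reported value. The variable names are illustrative.

```python
# Values copied from the ANOVA table.
ss_model = 1205.067     # Sum of Squares for group
ss_residual = 774.400   # Sum of Squares for Residuals
df_model = 2            # k - 1, with k = 3 groups
df_residual = 27        # N - k, with N = 30 scores in k = 3 groups

# Assumption: use the pooled residual mean square, not the 44.670 shown
# in the Welch-corrected row of the table.
ms_residual = ss_residual / df_residual

# Omega squared: an estimate of the proportion of variance explained.
ss_total = ss_model + ss_residual
omega_sq = (ss_model - df_model * ms_residual) / (ss_total + ms_residual)

print(round(omega_sq, 3))  # 0.572
```

The confidence interval for ω² is not reproduced here because it needs more than simple arithmetic on the table values.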

Descriptives

Descriptives - exam
group        N   Mean    SD     SE     Coefficient of variation
Punish       10  50.000  4.137  1.308  0.083
Indifferent  10  56.000  7.102  2.246  0.127
Reward       10  65.400  4.300  1.360  0.066
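
To see where the Welch F and its fractional residual df come from, here is a minimal sketch that applies the standard Welch one-way ANOVA formula to the group summaries in the Descriptives table. The function name `welch_anova` and the data layout are my own choices for illustration. Because the standard deviations in the table are rounded to three decimals, the result should match the reported F = 32.235 and df = 17.336 only to within rounding error.

```python
# Group summaries copied from the Descriptives table: (n, mean, sd).
groups = {
    "Punish": (10, 50.000, 4.137),
    "Indifferent": (10, 56.000, 7.102),
    "Reward": (10, 65.400, 4.300),
}


def welch_anova(summaries):
    """Return (F, df1, df2) for Welch's one-way ANOVA from (n, mean, sd) triples."""
    k = len(summaries)
    # Each group is weighted by its precision: w_i = n_i / s_i^2.
    weights = [n / sd**2 for n, _, sd in summaries]
    total_w = sum(weights)
    # Precision-weighted grand mean.
    grand = sum(w * m for w, (_, m, _) in zip(weights, summaries)) / total_w
    # Weighted between-group variability.
    between = sum(
        w * (m - grand) ** 2 for w, (_, m, _) in zip(weights, summaries)
    ) / (k - 1)
    # Correction term that also determines the denominator df.
    lam = sum(
        (1 - w / total_w) ** 2 / (n - 1) for w, (n, _, _) in zip(weights, summaries)
    )
    f_stat = between / (1 + 2 * (k - 2) / (k**2 - 1) * lam)
    df2 = (k**2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


f_welch, df1, df2 = welch_anova(list(groups.values()))
print(f"Welch F({df1}, {df2:.3f}) = {f_welch:.3f}")
```

The indifferent group has the largest variance, so it receives the smallest weight; that down-weighting is what makes the test less sensitive to unequal variances.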

Contrast Tables

The next part of the output shows the contrast results, including the custom contrast coefficients I specified. The first contrast compares reward (coded with -1) against punishment and indifference (both coded with 0.5). The second contrast compares punishment (coded with 1) against indifference (coded with -1). Note that the codes for each contrast sum to zero, and that in contrast 2, reward has been coded with 0 because it is excluded from that contrast.


The t-test for the first contrast tells us that reward was significantly different from punishment and indifference (it's significantly different because the value in the column labelled p is less than our criterion of 0.05). The estimate is negative because reward was coded with -1, so, looking at the direction of the means, this contrast shows that the average mark after reward (65.4) was significantly higher than the average mark for punishment and indifference combined (53.0). This is a massive effect (i.e., so big that if these were real data I'd be incredibly suspicious), d = -2.32 [-3.34, -1.29].


The second contrast (together with the descriptive statistics) tells us that the marks after punishment were significantly lower than after indifference (again, significantly different because the value in the column labelled p, 0.019, is less than our criterion of 0.05). This effect is also very large, d = -1.12 [-2.09, -0.15]. As such we could conclude that reward produces significantly better exam marks than punishment and indifference, and that punishment produces significantly worse exam marks than indifference. In short, lecturers should reward their students, not punish them.

Custom Contrast - group
                      95% CI for Mean Difference                                        95% CI for Cohen's d
Comparison  Estimate  Lower         Upper         SE     df  t       p       Cohen's d  Lower   Upper
1           -12.400   -16.656       -8.144        2.074  27  -5.978  < .001  -2.315     -3.340  -1.291
2           -6.000    -10.914       -1.086        2.395  27  -2.505  0.019   -1.120     -2.090  -0.151

Custom Contrast Coefficients - group
group        Comparison 1  Comparison 2
Punish       0.500         1.000
Indifferent  0.500         -1.000
Reward       -1.000        0.000
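
The contrast estimates, standard errors and t-values can be recovered from the descriptives and the coefficients. The sketch below is illustrative only: it assumes the contrasts use the pooled residual variance on 27 df (consistent with the df column) and that Cohen's d is the estimate divided by the pooled standard deviation, which matches the tabled values but is my inference, not something the output states. The function name `contrast_stats` is hypothetical.

```python
import math

# (n, mean, sd) per group, copied from the Descriptives table.
groups = {
    "Punish": (10, 50.000, 4.137),
    "Indifferent": (10, 56.000, 7.102),
    "Reward": (10, 65.400, 4.300),
}

# Coefficients copied from the Custom Contrast Coefficients table.
contrasts = {
    1: {"Punish": 0.5, "Indifferent": 0.5, "Reward": -1.0},
    2: {"Punish": 1.0, "Indifferent": -1.0, "Reward": 0.0},
}

# Pooled residual variance across groups (assumed error term for the contrasts).
df_resid = sum(n - 1 for n, _, _ in groups.values())
ms_resid = sum((n - 1) * sd**2 for n, _, sd in groups.values()) / df_resid


def contrast_stats(weights):
    """Return (estimate, se, t, d) for one set of contrast coefficients."""
    # The estimate is the coefficient-weighted sum of the group means.
    estimate = sum(c * groups[g][1] for g, c in weights.items())
    se = math.sqrt(ms_resid * sum(c**2 / groups[g][0] for g, c in weights.items()))
    # Assumption: d standardises the estimate by the pooled standard deviation.
    d = estimate / math.sqrt(ms_resid)
    return estimate, se, estimate / se, d


results = {label: contrast_stats(w) for label, w in contrasts.items()}
for label, (estimate, se, t, d) in results.items():
    print(f"Contrast {label}: estimate={estimate:.3f}, SE={se:.3f}, "
          f"t({df_resid})={t:.3f}, d={d:.3f}")
```

Contrast 1, for example, is 0.5 × 50 + 0.5 × 56 - 65.4 = -12.4, which is why the estimate and d are negative even though reward produced the highest marks. The p-values and confidence intervals need the t distribution and are left out of this sketch.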