Statistic | hours: First class | hours: Upper second class | hours: Lower second class | hours: Third class | essay: First class | essay: Upper second class | essay: Lower second class | essay: Third class
---|---|---|---|---|---|---|---|---
Valid | 10 | 23 | 10 | 2 | 10 | 23 | 10 | 2
Mean | 8.827 | 8.669 | 7.680 | 5.628 | 72.332 | 63.631 | 57.050 | 48.960
Std. Deviation | 3.932 | 2.539 | 1.537 | 0.814 | 3.488 | 2.797 | 3.451 | 1.034
We’re interested in looking at the relationship between hours spent on an essay and the grade obtained. We could create a scatterplot of hours spent on the essay (x-axis) and essay mark (%) (y-axis). I’ve chosen to highlight the degree classification grades using different colours. The resulting scatterplot is below.
Statistic | hours | essay
---|---|---
Valid | 45 | 45
Mean | 8.349 | 63.450
Std. Deviation | 2.725 | 6.757
The Q-Q plots for both variables look fairly normal (below).
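As a rough illustration of what a normal Q-Q plot compares, the sketch below pairs each standardized observation with the quantile expected under a normal distribution. The `hours` values are hypothetical (they are not the essay data set), and the plotting position `(i - 0.5) / n` is one common convention, assumed here rather than taken from the software output.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(sample)
    ordered = sorted(sample)
    m, s = mean(sample), stdev(sample)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Plotting position keeps the probability strictly inside (0, 1).
        p = (i - 0.5) / n
        theoretical = std_normal.inv_cdf(p)
        observed = (value - m) / s  # standardize so both axes share a scale
        points.append((theoretical, observed))
    return points


# Hypothetical values, for illustration only.
hours = [5.1, 6.3, 7.0, 7.8, 8.2, 8.9, 9.4, 10.6, 12.0]
for theoretical, observed in qq_points(hours):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the data are roughly normal, the two columns track each other closely, which is what points lying near the diagonal of a Q-Q plot represent.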
Variable | Statistic | essay | hours
---|---|---|---
1. essay | n | — |
1. essay | Pearson's r | — |
1. essay | p-value | — |
1. essay | Lower 95% CI | — |
1. essay | Upper 95% CI | — |
2. hours | n | 45 | —
2. hours | Pearson's r | 0.267 | —
2. hours | p-value | 0.038 | —
2. hours | Lower 95% CI | -0.074 | —
2. hours | Upper 95% CI | 0.532 | —

Note. All tests one-tailed, for positive correlation.

Note. Confidence intervals based on 1000 bootstrap replicates.

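The results in the Pearson’s Correlations table above indicate that the relationship between time spent writing an essay (hours) and grade awarded (essay %) was not significant for a two-tailed test, Pearson’s r = 0.267, 95% bootstrap CI [-0.074, 0.532], p = 0.077. Note that the table reports the one-tailed p-value (0.038), which is half the two-tailed value, and that the confidence interval includes zero.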
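To make the quantities in this result concrete, here is a minimal sketch of Pearson's r, the t statistic it converts to, and a bootstrap confidence interval. It uses a simple percentile bootstrap, which is not the same as a bias-corrected (BCa) interval, and the `hours` and `marks` lists are hypothetical stand-ins for the real data. The function names are illustrative, not from any particular package.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (df = n - 2) used to test r against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def bootstrap_ci(x, y, replicates=1000, level=0.95, seed=1):
    """Percentile bootstrap CI for r (simpler than a BCa interval)."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    while len(stats) < replicates:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # r is undefined when a resample has no variance
        stats.append(pearson_r(xs, ys))
    stats.sort()
    alpha = 1 - level
    lower = stats[int((alpha / 2) * replicates)]
    upper = stats[int((1 - alpha / 2) * replicates) - 1]
    return lower, upper


# The values reported in the table: r = 0.267 with n = 45.
print(f"t(43) = {t_from_r(0.267, 45):.3f}")  # about 1.817

# Hypothetical data, for illustration only.
hours = [5, 7, 8, 9, 10, 12, 6, 11]
marks = [55, 62, 58, 70, 64, 72, 60, 61]
print(pearson_r(hours, marks), bootstrap_ci(hours, marks))
```

Because the resampling is random, two bootstrap runs on the same data will generally give slightly different interval limits unless the seed is fixed.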
The second part of the question asks us to repeat the analysis after the percentages have been recoded into degree classifications. Degree classifications are ordinal (not interval) data: they are ordered categories. So we shouldn’t use Pearson’s correlation coefficient; we should use Spearman’s rho and Kendall’s tau instead.
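For readers who want to see what these two rank-based coefficients compute, the following is a minimal sketch: Spearman's rho as Pearson's r applied to average ranks, and Kendall's tau-b from counts of concordant, discordant, and tied pairs. The function names and the example lists are hypothetical, and the sketch does not compute p-values.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, which adjusts the denominator for ties."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from both terms
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt((untied + ties_x) * (untied + ties_y))


# Hypothetical hours and grade codes (1 = first ... 4 = third), for illustration only.
hours = [4, 6, 7, 9, 11, 12]
grade = [4, 3, 3, 2, 2, 1]
print(spearman_rho(hours, grade), kendall_tau_b(hours, grade))
```

Tau-b is the variant suited to data with many ties, which is exactly the situation when 45 essays are sorted into only four categories.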
Variable | Statistic | hours | grade
---|---|---|---
1. hours | n | — |
1. hours | Spearman's rho | — |
1. hours | p-value | — |
1. hours | Kendall's Tau B | — |
1. hours | p-value | — |
2. grade | n | 45 | —
2. grade | Spearman's rho | -0.193 | —
2. grade | p-value | 0.204 | —
2. grade | Kendall's Tau B | -0.158 | —
2. grade | p-value | 0.178 | —

In both cases the correlation is non-significant. There was no significant relationship between the degree classification awarded for an essay and the time spent writing it, ρ = -0.193, p = 0.204, and τ = -0.158, p = 0.178. Note that the direction of the relationship has reversed. This has happened because the essay marks were recoded as 1 (first), 2 (upper second), 3 (lower second), and 4 (third), so high grades were represented by low numbers. This example illustrates one of the benefits of not taking continuous data (like percentages) and transforming them into categorical data: when you do, you lose information and often statistical power!
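The sign reversal and the loss of information can both be seen in a small sketch. The data below are hypothetical, and the classification boundaries (70, 60, 50) are an assumption based on the usual UK degree bands rather than something stated in the exercise.

```python
import math


def classify(mark):
    """Code a percentage as 1 (first) to 4 (third); boundaries are assumed."""
    if mark >= 70:
        return 1
    if mark >= 60:
        return 2
    if mark >= 50:
        return 3
    return 4


def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant, discordant, and tied pairs."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt((untied + ties_x) * (untied + ties_y))


# Hypothetical data in which more hours always go with a higher mark.
hours = [4, 6, 7, 9, 11, 12]
marks = [48, 55, 58, 63, 66, 74]
codes = [classify(m) for m in marks]   # high marks become low numbers
flipped = [5 - c for c in codes]       # recode so that 4 = first, 1 = third

tau_marks = kendall_tau_b(hours, marks)
tau_codes = kendall_tau_b(hours, codes)
tau_flipped = kendall_tau_b(hours, flipped)
print(tau_marks, tau_codes, tau_flipped)
print(len(set(marks)), "distinct marks vs", len(set(codes)), "distinct codes")
```

The correlation with the coded grades is negative only because of how the categories are numbered; reversing the coding restores the positive sign. The coded version also has fewer distinct values and more ties than the raw marks, which is the information loss the paragraph above describes.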