Results

Logistic Regression

Task 17.1: A ‘display rule’ refers to displaying an appropriate emotion in a situation. For example, if you receive a present that you don’t like, you should smile politely and say ‘Thank you, Auntie Kate, I’ve always wanted a rotting cabbage’; you do not start crying and scream ‘Why did you buy me a rotting cabbage?!’. A psychologist measured children’s understanding of display rules (with a task that they could pass or fail), their age (months), and their ability to understand others’ mental states (‘theory of mind’, measured with a false belief task that they could pass or fail). Can display rule understanding (did the child pass the test: yes/no?) be predicted from theory of mind (did the child pass the false belief, fb, task: yes/no?), age and their interaction? (display.jasp)


Notice that both of the categorical variables have been entered as coding variables: the outcome variable is coded such that 1 represents having display rule understanding and 0 represents an absence of display rule understanding. For the false-belief task a similar coding is used (1 = passed the false-belief task, 0 = failed the false-belief task).


To compare the impact of each predictor separately, go to the Model tab and add two models, so that there are four in total: Model 0 should not include any predictors, Model 1 should include only fb, Model 2 should include fb and age, and Model 3 should include fb, age, and their interaction. To input the interaction, hold Ctrl (⌘ on a Mac) while selecting both predictors, then add them to Model 3.
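If you want to check these models outside JASP, the same hierarchy can be fit in Python with statsmodels. This is a minimal sketch; it assumes the data have been exported to a hypothetical file display.csv with columns display and fb (both coded 0/1, as described above) and age (in months):

```python
# Sketch of the same model hierarchy, assuming a hypothetical export
# 'display.csv' with columns 'display', 'fb' (both 0/1) and 'age' (months).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("display.csv")

m0 = smf.logit("display ~ 1", data=df).fit(disp=0)         # constant only
m1 = smf.logit("display ~ fb", data=df).fit(disp=0)        # add false-belief understanding
m2 = smf.logit("display ~ fb + age", data=df).fit(disp=0)  # add age
m3 = smf.logit("display ~ fb * age", data=df).fit(disp=0)  # add the fb-by-age interaction

# Deviance = -2 * log-likelihood; these should match the Model Summary table.
for name, m in [("M0", m0), ("M1", m1), ("M2", m2), ("M3", m3)]:
    print(name, round(-2 * m.llf, 3), round(m.aic, 3), round(m.bic, 3))
```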


Model Summary - display
Model Deviance AIC BIC df Δχ² p McFadden R² Nagelkerke R² Tjur R² Cox & Snell R²
M₀ 96.124 98.124 100.373 69     0.000 0.000
M₁ 70.042 74.042 78.539 68 26.083 < .001 0.271 0.417 0.352 0.311
M₂ 67.757 73.757 80.502 67 2.285 0.131 0.295 0.446 0.373 0.333
M₃ 67.634 75.634 84.628 66 0.123 0.726 0.296 0.448 0.374 0.334
Note.  M₁ includes fb
Note.  M₂ includes fb, age
Note.  M₃ includes age, fb, age:fb

The deviance of the baseline model (M₀) is 96.124 (the deviance is −2 times the log-likelihood). This represents the fit of the model when only the constant is included. Initially every child is predicted to belong to the category in which most observed cases fell. In this example 39 children had display rule understanding and only 31 did not. Therefore, of the two available options it is better to predict that all children had display rule understanding, because this results in a greater number of correct predictions. Overall, this baseline model correctly classifies 55.7% of children (since 55.7% of the children actually had display rule understanding).
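As a quick sketch, continuing with the df data frame from above, the baseline accuracy is just the proportion of children in the most common category:

```python
# Sketch: the baseline rule predicts the most common category ('Yes') for
# every child, so its accuracy is the proportion of children who passed.
n_pass = int((df["display"] == 1).sum())   # 39 children passed
n_total = len(df)                          # 70 children in total
print(round(100 * n_pass / n_total, 1))    # 55.7% correctly classified
```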

Coefficients
Model   Estimate Standard Error Odds Ratio z Wald Statistic df p 95% CI Lower (odds ratio scale) 95% CI Upper (odds ratio scale)
M₀ (Intercept) 0.230 0.241 1.258 0.954 0.910 1 0.340 0.785 2.016
M₁ (Intercept) -1.344 0.458 0.261 -2.931 8.592 1 0.003 0.106 0.641
  fb (Yes) 2.761 0.605 15.812 4.567 20.857 1 < .001 4.835 51.710
M₂ (Intercept) -2.496 0.918 0.082 -2.718 7.389 1 0.007 0.014 0.498
  fb (Yes) 2.165 0.696 8.715 3.112 9.683 1 0.002 2.229 34.081
  age 0.032 0.021 1.032 1.490 2.221 1 0.136 0.990 1.077
M₃ (Intercept) -2.948 1.596 0.052 -1.847 3.410 1 0.065 0.002 1.198
  age 0.044 0.041 1.045 1.085 1.176 1 0.278 0.965 1.132
  fb (Yes) 2.858 2.105 17.422 1.357 1.843 1 0.175 0.281 1079.156
  age * fb (Yes) -0.017 0.048 0.983 -0.351 0.123 1 0.726 0.896 1.080
Note.  display level 'Yes' coded as class 1.

In M1, false-belief understanding (fb) is added to the model as a predictor. As such, a child is now classified as having display rule understanding based on whether they passed or failed the false-belief task. The output above shows summary statistics about the new model. The overall fit of the new model is assessed using the Deviance. Remember that large values of the deviance statistic indicate poorly fitting statistical models.


If fb has improved the fit of the model then the deviance of M₁ should be lower than that of the constant-only model (because lower values of deviance indicate better fit). When only the constant was included, deviance = 96.124; with fb included this value drops to 70.042. This reduction tells us that the model is better at predicting display rule understanding than it was before fb was added. We can assess the significance of the change by subtracting the deviance of the new model from the deviance of the baseline model. The model chi-square statistic works on this principle and is, therefore, the deviance with only the constant in the model minus the deviance with fb included (Δχ² = 96.124 − 70.042 = 26.083). This value has a chi-square distribution with degrees of freedom equal to the difference in the two models' residual degrees of freedom (69 − 68 = 1). In this example the change is significant at the p < .001 level, so we can say that the model with fb included predicts display rule understanding significantly better than the model with only the constant.
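To make the test concrete, here is a sketch of the same deviance (likelihood-ratio) test computed directly; the deviances are the rounded values from the Model Summary table, so the difference comes out as 26.082 rather than the table's 26.083:

```python
# Sketch of the deviance (likelihood-ratio) test comparing M0 and M1,
# using the rounded deviances from the Model Summary table.
from scipy.stats import chi2

delta_chi2 = 96.124 - 70.042           # reduction in deviance, ~26.082
df_diff = 69 - 68                      # difference in residual df
print(delta_chi2, chi2.sf(delta_chi2, df=df_diff))   # p < .001
```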


The Coefficients table tells us the estimates for the coefficients for the predictors included in the M1 model (namely, fb and the constant). The coefficient represents the change in the logit of the outcome variable associated with a one-unit change in the predictor variable. The logit of the outcome is the natural logarithm of the odds of Y occurring.
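Because the coefficients live on the log-odds scale, exponentiating them recovers the odds ratios shown in the table. A sketch for the fb coefficient in M₁, using the rounded table value (so expect small rounding differences):

```python
# Sketch: exponentiating a log-odds coefficient gives its odds ratio.
import math

b_fb = 2.761                      # rounded fb coefficient from the table
print(round(math.exp(b_fb), 3))   # ~15.815; the table reports 15.812 (rounding)
```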


The Wald statistic has a chi-square distribution and tells us whether the b coefficient for that predictor is significantly different from zero. If the coefficient is significantly different from zero then we can assume that the predictor is making a significant contribution to the prediction of the outcome (Y). For these data, false-belief understanding is a significant predictor of display rule understanding (the p-value for the Wald statistic is less than .001).
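A sketch of where these numbers come from: the z-value is the coefficient divided by its standard error, and the Wald statistic is that z squared, referred to a chi-square distribution with 1 degree of freedom (again using the rounded table values):

```python
# Sketch of the Wald test for the fb coefficient in M1:
# z = b / SE, and z**2 is compared against a chi-square(1).
from scipy.stats import chi2

b, se = 2.761, 0.605
z = b / se                   # ~4.564 (table: 4.567, from unrounded values)
wald = z ** 2                # ~20.83 (table: 20.857)
print(chi2.sf(wald, df=1))   # p < .001
```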


We can also look at the Model Summary table to see whether the other predictors (age and the interaction effect) add predictive value to the model. First, we have M₂, which reduced the deviance by a further 2.285 compared with M₁. Combining this result (Δχ² = 2.285) with its degrees of freedom (68 − 67 = 1) gives the p-value of 0.131 listed in the table. The decrease in deviance is therefore not significant, which indicates that age does not improve our model predictions to the extent that it's worth including in the model. Here we also see the BIC shining. The BIC is similar to the deviance, in that a lower number indicates better model fit, but the BIC adds a heavy penalty for model complexity (i.e., the number of predictors). That is a nice feature because in model selection we generally want a built-in Occam's razor: we want to predict as well as possible, while using as few predictors as possible. Since M₂ has an additional predictor, while not improving the predictions much, the BIC for M₂ (80.502) is actually higher than that of M₁ (78.539) - bad news for M₂! The same goes for M₃, where adding the interaction does not improve model fit significantly (Δχ²(1) = 0.123, p = .726), and where the BIC (84.628) is even higher than that of M₂. (See the sketch below for how these incremental tests are computed.)
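The incremental tests in this paragraph can be reproduced from the deviances reported in the Model Summary table, for instance:

```python
# Sketch: the incremental chi-square tests from the Model Summary table,
# computed from the reported (rounded) deviances; each comparison has 1 df.
from scipy.stats import chi2

dev = {"M1": 70.042, "M2": 67.757, "M3": 67.634}
print(chi2.sf(dev["M1"] - dev["M2"], df=1))   # adding age:    p ~ .131
print(chi2.sf(dev["M2"] - dev["M3"], df=1))   # adding age:fb: p ~ .726
```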


Assuming we are content that the model is accurate and that false-belief understanding has some substantive significance, then we could conclude that false-belief understanding is the single best predictor of display rule understanding. Furthermore, age and the interaction of age and false-belief understanding did not significantly predict display rule understanding. This conclusion is fine in itself, but to be sure that the model is a good one, it is important to examine the residuals, which brings us nicely onto the next task.



Performance Diagnostics

Confusion matrix
Observed    Predicted No    Predicted Yes    % Correct
No          23              8                74.194
Yes         6               33               84.615
Overall % Correct                            80.000
Note.  The cut-off value is set to 0.5

The confusion matrix shows the predictions of whether a child passes the display rule task, based on M₃ (the final model). If the model fitted the data perfectly, all cases would fall along the diagonal (i.e., we predict that a child passes the display rule task and they actually did, or we predict that they fail and they actually did). In this example, our model correctly classified 80% of the observations.
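A sketch of how such a confusion matrix is built, assuming the m3 fit and df data frame from the earlier sketch; cases with a fitted probability at or above the 0.5 cut-off are classified as 'Yes':

```python
# Sketch: rebuilding the confusion matrix from M3's fitted probabilities.
import numpy as np

pred = (m3.predict() >= 0.5).astype(int)   # apply the 0.5 cut-off
obs = df["display"].to_numpy()

matrix = np.zeros((2, 2), dtype=int)
for o, p in zip(obs, pred):
    matrix[o, p] += 1   # rows: observed No/Yes; columns: predicted No/Yes

print(matrix)                                         # should reproduce the table
print(round(100 * float((obs == pred).mean()), 1))    # overall % correct, ~80.0
```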

Logistic Regression: Influential Cases

Task 17.2: Are there any influential cases or outliers in the model for Task 1?

First, we need to make sure we obtain output for the best model, based on the adventure we had in the previous exercise. This was the model with only fb, so make sure that you now have a logistic regression with the baseline model (M₀) and a single alternative model (M₁) that has fb as its only predictor. To obtain information about influential cases, tick the Casewise diagnostics option, which can be found under the Statistics tab. To detect outliers, you can use a threshold for the standardized residual (anything with an absolute value greater than 3 is worth investigating) or Cook's distance (values close to, or greater than, 1 are usually problematic). Luckily, no observations cross these thresholds, so we have no reason to be concerned about outliers.
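For readers who want to check these diagnostics outside JASP, here is a sketch using statsmodels, refitting the fb-only model as a binomial GLM so that influence measures are available; note that statsmodels' standardized residuals are computed somewhat differently from JASP's, so expect small numerical differences:

```python
# Sketch of casewise diagnostics for the fb-only model, refit as a binomial
# GLM so that statsmodels' influence measures are available.
import statsmodels.api as sm
import statsmodels.formula.api as smf

glm_fb = smf.glm("display ~ fb", data=df, family=sm.families.Binomial()).fit()
influence = glm_fb.get_influence()

std_resid = influence.resid_studentized    # standardized residuals
cooks_d = influence.cooks_distance[0]      # Cook's distance per case

flagged = (abs(std_resid) > 3) | (cooks_d >= 1)   # thresholds from the text
print(df[flagged])                         # expected to be empty: no influential cases
```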

Model Summary - display
Model Deviance AIC BIC df Δχ² p McFadden R² Nagelkerke R² Tjur R² Cox & Snell R²
M₀ 96.124 98.124 100.373 69     0.000 0.000
M₁ 70.042 74.042 78.539 68 26.083 < .001 0.271 0.417 0.352 0.311
Note.  M₁ includes fb
Coefficients
Model   Estimate Standard Error Odds Ratio z Wald Statistic df p 95% CI Lower (odds ratio scale) 95% CI Upper (odds ratio scale)
M₀ (Intercept) 0.230 0.241 1.258 0.954 0.910 1 0.340 0.785 2.016
M₁ (Intercept) -1.344 0.458 0.261 -2.931 8.592 1 0.003 0.106 0.641
  fb (Yes) 2.761 0.605 15.812 4.567 20.857 1 < .001 4.835 51.710
Note.  display level 'Yes' coded as class 1.
Influential Cases
Case Number Std. Residual display Predicted Value Residual Cook's Distance Leverage
Note.  No influential cases found.