The additional restrictions on repeated measures analysis imply more extensive checking of missing data. Here we document the checks and expected results.
There are two broad classes of missing data - balanced and unbalanced. The data are balanced when there are equal numbers of observations for all treatment-time combinations. Thus, when all of a replicate or a treatment in the base design, or all of a selected assessment column, are missing, there can still be balance among the observations. In the balanced case, the analysis can proceed as a standard split-plot.
In the unbalanced cases, variance terms (the standard deviations for different error strata) necessary for mean comparisons cannot be computed from their expected mean squares. Instead, we use MINQUE estimates for variance components and report the weighted least square means.
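The balance condition described above can be checked mechanically before choosing an analysis path. The following is a minimal sketch, not ARM's implementation; the function name `classify_balance` and the `(treatment, time)` tuple layout are assumptions made for the example.

```python
from collections import Counter
from itertools import product


def classify_balance(observations):
    """Classify non-missing observations as 'balanced' or 'unbalanced'.

    `observations` is an iterable of (treatment, time) pairs, one pair per
    non-missing value. This layout is hypothetical, chosen for illustration.
    """
    counts = Counter(observations)
    treatments = {trt for trt, _ in counts}
    times = {time for _, time in counts}
    # Every combination of the treatments and times that are present
    # must have the same number of observations.
    cell_counts = {counts.get(cell, 0) for cell in product(treatments, times)}
    return "balanced" if len(cell_counts) == 1 else "unbalanced"


# Two treatments, two rating dates, three replicates each.
full = [(t, d) for t in ("A", "B") for d in (1, 2) for _ in range(3)]
print(classify_balance(full))       # all cells have 3 observations
print(classify_balance(full[:-1]))  # one observation dropped from one cell
```

Dropping an entire treatment leaves the remaining cells with equal counts, so the sketch still classifies the data as balanced, in line with the description above.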
The balanced cases are
Unbalanced cases are
In the balanced case, we may write the expected mean squares for a split-plot AOV as
Source | DF | MS | Expected Mean Squares |
---|---|---|---|
Replicate | \((r-1)\) | MSR | \(\sigma^2 + b\sigma_{plot}^2 + \sigma_{rep}^2\) |
Treatment | \((a-1)\) | MSA | \(\sigma^2 + b\sigma_{plot}^2 + \theta_{trt}^2\) |
Error Treatment | \((a-1)(r-1)\) | MSP | \(\sigma^2 + b\sigma_{plot}^2\) |
Time | \((b-1)\) | MSB | \(\sigma^2 + \theta_{time}^2\) |
Treatment x Time | \((a-1)(b-1)\) | MSAB | \(\sigma^2 + \theta_{trt \times time}^2\) |
Residual | \(a(r-1)(b-1)\) | MSE | \(\sigma^2\) |
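
As an illustration of how the error strata in this table are used, variance components can be estimated by equating the observed mean squares to their expectations. This is a sketch of the usual balanced-case calculation; the hats denote estimates and are notation introduced here.

\[ \hat{\sigma}^2 = MSE, \qquad \hat{\sigma}_{plot}^2 = \frac{MSP - MSE}{b} \]

The second estimate is negative whenever MSP is smaller than MSE.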
For the purposes of this document, we use the simplest analysis - split-plot with no degrees of freedom correction. Use this Report Set. To duplicate the graphs, you’ll want this Graph Options file.
ARM Trial - use G-All7_SDTR_Inoc_Split.dat0 from the Tutorial study list.
ARM Report
SAS File
SAS Report
We get the error term for treatments from the AOV table in the row labeled Treatment Error as 10.267020; the error for rating date and for the treatment by rating date interaction (rating date within treatment) is the residual error, 1.036376. Denote these as ERRORA and ERRORB. Denote the number of treatments as \(a\), the number of rating dates as \(b\), and the number of replicates as \(r\). We then calculate standard errors for the difference between two means by
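Treatment means:
\[ s.e. = \sqrt{\frac{2 ERRORA}{rb}} \]
Time means:
\[ s.e. = \sqrt{\frac{2 ERRORB}{ra}} \]
Interaction means, same treatment, different time:
\[ s.e. = \sqrt{\frac{2 ERRORB}{r}} \]
Interaction means, different treatment, different time:
\[ s.e. = \sqrt{\frac{2[(b-1) ERRORB + ERRORA]}{rb}} \]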
For each mean comparison \(\bar{y}_i\) vs \(\bar{y}_j\), using the appropriate standard error term, we calculate the test statistic \[ q_{i,j} = \frac{\bar{y}_i-\bar{y}_j}{SE} \]
In this example, we use the quantile for Tukey’s \(q_{1-\alpha, k, n-k}/\sqrt{2}\) for \(k\) means and \(n-k\) degrees of freedom. These values are produced as intermediate tables and not shown on the report. We calculate critical values for \(|\bar{y}_i-\bar{y}_j|\) as \(HSD = SE \times q_{1-\alpha, k, n-k}/\sqrt{2}\).
Statistic | Treatment | Time | Interaction Same Treatment Different Time | Interaction Different Treatment Different Time |
---|---|---|---|---|
\(n\) | 24 | 24 | 4 | 4 |
degrees freedom | 15 | 90 | 90 | 90 |
mean square | 10.588426 | 1.010648 | 1.010648 | 2.606944 |
variance | 1.596296 | 1.010648 | 1.010648 | 1.010648 |
standard error | 0.939345 | 0.290208 | 0.710862 | 1.141697 |
means (\(k\)) | 6 | 6 | 36 | 36 |
\(q\) | 3.248968 | 2.912031 | 3.969909 | 3.969909 |
HSD | 3.051902 | 0.845095 | 2.822055 | 4.532433 |
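
The standard error and HSD rows of this table can be reproduced from the two mean squares. The sketch below assumes \(r = 4\) replicates and \(a = b = 6\), values inferred from the \(n\), \(k\) and degrees of freedom rows rather than stated in the text, and it copies the \(q\) values from the table instead of recomputing the Tukey quantiles. Variable names are illustrative.

```python
from math import sqrt

# Assumed design sizes, inferred from the table (n = 24, k = 6, df = 15 and 90).
r, a, b = 4, 6, 6

# Mean squares taken from the "mean square" row of the table.
ERRORA = 10.588426  # treatment (whole-plot) error
ERRORB = 1.010648   # residual error

# Standard error of a difference between two means, one entry per table column.
se = {
    "treatment": sqrt(2 * ERRORA / (r * b)),
    "time": sqrt(2 * ERRORB / (r * a)),
    "same treatment, different time": sqrt(2 * ERRORB / r),
    "different treatment, different time": sqrt(
        2 * ((b - 1) * ERRORB + ERRORA) / (r * b)
    ),
}

# Tukey quantiles already divided by sqrt(2), copied from the "q" row.
q = {
    "treatment": 3.248968,
    "time": 2.912031,
    "same treatment, different time": 3.969909,
    "different treatment, different time": 3.969909,
}

# Critical difference for each comparison type.
hsd = {name: se[name] * q[name] for name in se}

for name in se:
    print(f"{name}: SE = {se[name]:.6f}, HSD = {hsd[name]:.6f}")
```

The printed values can be compared against the standard error and HSD rows above.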
ARM produces several tables for mean comparisons; these are not reported. Copies of these working tables can be found in the temporary directory during an ARM session.
Compare
to the table in
Compare
to the table in
A repeated measures analysis with missing observations represents the most likely missing data case. In this case, observations are missing at random: single observations are missing for one assessment column, from one plot, at a time. All combinations of treatment and time are represented, and all plots have at least one assessment.
This case requires MINQUE variance estimates. Briefly, to compare the same treatment at two different times (means from the same plots), we use for the standard error of the difference between means \[ s.e. = \sqrt{\frac{2 \sigma^2}{r}} \] while for comparisons between two different treatments, at the same or different times, we use
\[ s.e. = \sqrt{\frac{2 (\sigma^2 + \sigma_{plot}^2)}{r}} \] since observations of the same treatment at different times come from the same plots, while observations of different treatments must necessarily be taken from different plots, and plots are assumed to be drawn from a random population with effects \(\sim N(0,\sigma_{plot}^2)\).
When data are balanced, we can substitute expected mean squares, giving \[ s.e. = \sqrt{\frac{2 [(b-1)ERRORB + ERRORA]}{rb}} \] When observations are missing, \(r\) and \(b\) do not take integer values and we require other estimates, in this case MINQUE.
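The substitution can be checked term by term, assuming ERRORA estimates \(\sigma^2 + b\sigma_{plot}^2\) and ERRORB estimates \(\sigma^2\), as in the split-plot expected mean squares:

\[ E\left[\frac{(b-1) ERRORB + ERRORA}{b}\right] = \frac{(b-1)\sigma^2 + \sigma^2 + b\sigma_{plot}^2}{b} = \sigma^2 + \sigma_{plot}^2 \]

so \(2(\sigma^2 + \sigma_{plot}^2)/r\) is estimated by \(2[(b-1)ERRORB + ERRORA]/(rb)\).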
When we have balanced data, we can use \(\sqrt{\frac{2 MSP}{rb}}\) as the standard error estimator for the difference between two treatment means. This follows from the more general form \[ s.e. (difference) = \sqrt{s.e. (mean 1)^2 + s.e. (mean 2)^2} = \sqrt{\frac{SD_1}{r_1}+\frac{SD_2}{r_2}} \] where \(SD_1\) and \(SD_2\) are the variance estimates for the two treatments, since for balanced data \(r_1 = r_2 = r\) and, using pooled error, \(SD_1 = SD_2 = MSP/b\). Suppose treatment 1 were missing a plot. Then we have
\[ s.e. (difference) = \sqrt{s.e. (mean 1)^2 + s.e. (mean 2)^2} = \sqrt{\frac{SD_1}{r-1}+\frac{SD_2}{r}} \]
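A small numerical sketch of the unequal-replication form follows, holding the per-treatment variance fixed at MSP/b. The values MSP = 10.588426, \(r = 4\) and \(b = 6\) are assumptions borrowed from the balanced example for illustration (\(r\) and \(b\) are inferred, not stated); in a real analysis the variance estimate itself would change when a plot is missing. The helper name `se_difference` is hypothetical.

```python
from math import sqrt


def se_difference(var1, r1, var2, r2):
    """Standard error of a difference between two means with variance
    estimates var1, var2 and replicate counts r1, r2."""
    return sqrt(var1 / r1 + var2 / r2)


MSP, r, b = 10.588426, 4, 6  # assumed values, for illustration only
var = MSP / b                # pooled variance term for a treatment mean

balanced = se_difference(var, r, var, r)         # same as sqrt(2 * MSP / (r * b))
one_missing = se_difference(var, r - 1, var, r)  # treatment 1 lost one plot

print(f"balanced: {balanced:.6f}")
print(f"one plot missing: {one_missing:.6f}")
```

With the variance held fixed, losing a plot from one treatment can only increase the standard error of the difference.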
Here, we are missing all observations for one treatment in one assessment column.
Warnings
Some treatment and rating date means are not estimable due to missing treatment x rating date combinations.
SAS does not report treatment or time least square means from PROC GLM when there is a missing treatment x time combination, so we do not report these means in ARM. SAS does report least square means for the interaction, so ARM does likewise.
We confirm that a design with missing replicates is still balanced. However, in this case the replicate variance estimate is more negative than in the default case. This introduces differences in mean comparisons relative to a mixed effects model that does not allow negative variances.