The t-test and ANOVA examine whether group means differ from one another. The t-test compares two groups, while ANOVA can handle more than two groups.

The t-test and ANOVA share three assumptions: independence (the elements of one sample are not related to those of the other sample), normality (samples are randomly drawn from normally distributed populations with unknown population means; otherwise the means are no longer the best measures of central tendency and the test will not be valid), and equal variance (the population variances of the groups are equal).

ANCOVA (analysis of covariance) includes covariates, that is, interval independent variables, on the right-hand side to control for their effects. MANOVA (multivariate analysis of variance) has more than one left-hand side variable.

Analysis        LHS (interval)   RHS (categorical)   Notes
T-test          Single           Single (binary)
One-way ANOVA   Single           Single
Two-way ANOVA   Single           Two (multiple)
ANCOVA          Single           Multiple            Covariates
MANOVA          Multiple         Multiple

The following diagram summarizes the t-test and one-way ANOVA.

SAS has the UNIVARIATE, MEANS, and TTEST procedures for the t-test, while the ANOVA, GLM, and MIXED procedures conduct ANOVA.

The ANOVA procedure is designed for balanced data, but the GLM and MIXED procedures can deal with both balanced and unbalanced data. For the t-test and one-way ANOVA, it does not matter whether the data are balanced.

STATA has the .ttest and .ttesti commands for the t-test, and the .anova and .manova commands for ANOVA. Note that the STATA .glm command is not used for ANOVA.

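It is useful to read multiple observations from a single data line. Note that the double trailing @@ in the INPUT statement holds the current line, so SAS keeps reading observations from it until the line is exhausted.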

LIBNAME js 'c:\data\sas';

DATA js.data1;
INPUT group block $ response @@;
DATALINES;
1 A 34.5 1 B 54.5 1 B 25.8 3 C 54.8
2 B 54.8 3 A 15.8 2 C 14.5 2 A 15.1
...
RUN;

/* Data read ******************
1 1 A 34.5
2 1 B 54.5
3 1 B 25.8
...
*******************************/
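As a rough illustration of what the double trailing @@ does, the reading logic can be mimicked in a few lines of Python. This is only a sketch, not SAS itself: the function name read_held_lines is hypothetical, and it assumes every line holds complete three-field observations (SAS would carry an incomplete observation over to the next line).

```python
def read_held_lines(lines):
    """Split each data line into (group, block, response) observations."""
    observations = []
    for line in lines:
        tokens = line.split()
        # Consume three fields at a time while staying on the same line,
        # which is the behavior the double trailing @@ asks for.
        for start in range(0, len(tokens) - 2, 3):
            group, block, response = tokens[start:start + 3]
            observations.append((int(group), block, float(response)))
    return observations


# The two data lines shown in the DATALINES block above.
data_lines = [
    "1 A 34.5 1 B 54.5 1 B 25.8 3 C 54.8",
    "2 B 54.8 3 A 15.8 2 C 14.5 2 A 15.1",
]
observations = read_held_lines(data_lines)
for number, (group, block, response) in enumerate(observations, start=1):
    print(number, group, block, response)
```

Each printed row corresponds to one observation in the data set, numbered in the order it was read.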

The DO statement allows you to read more complicated data layouts. You may list particular values in the DO statement (e.g., DO treatment=1,5;) rather than set a range of values (e.g., DO treatment=1 TO 2;). The trailing @ in the INPUT statement may not be omitted, because it holds the data line while the loops read successive values.
This tip is especially useful when you type in data for the randomized complete block design (RCB) and the Latin square design (LSD).

DATA js.data2;
   DO block=1 TO 3;
      DO treatment=1,5;
         INPUT response @;
         OUTPUT;
      END;
   END;
DATALINES;
4.91 4.63 4.76 5.04 5.38 6.21
5.60 5.08 4.91 4.63 4.76 5.04
...
RUN;
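To see which block and treatment each value receives, the nested DO loops can be mirrored in Python. This is a minimal sketch for illustration only; read_with_do_loops is a hypothetical name, and the code handles a single data line in the same way one pass of the DATA step does.

```python
def read_with_do_loops(line, blocks=(1, 2, 3), treatments=(1, 5)):
    """Assign block and treatment labels to the values on one data line."""
    values = iter(float(token) for token in line.split())
    rows = []
    for block in blocks:                # DO block=1 TO 3;
        for treatment in treatments:    # DO treatment=1,5;
            response = next(values)     # INPUT response @;  (line is held)
            rows.append((block, treatment, response))  # OUTPUT;
    return rows


# First data line from the DATALINES block above.
rows = read_with_do_loops("4.91 4.63 4.76 5.04 5.38 6.21")
for row in rows:
    print(row)
```

The six values on the line are spread over three blocks with two treatments each, so the order of the values in the data line must follow the nesting order of the loops.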

The MU0 option specifies the hypothesized mean under the null hypothesis. The ALPHA option specifies the significance level. The T option in the MEANS procedure requests the t statistic, and the PROBT option requests its p-value.

PROC UNIVARIATE MU0=0 ALPHA=.01;
VAR response;
RUN;

. ttest response=0, level(99)

PROC UNIVARIATE MU0=10 VARDEF=DF NORMAL ALPHA=.05;
VAR response;
RUN;

. ttest response=10

PROC MEANS T PROBT;
VAR response;
RUN;

. ttest response=0

PROC MEANS MEAN STD STDERR T VARDEF=DF PROBT CLM ALPHA=.01;
VAR response;
RUN;
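The statistic these procedures report for a single sample is t = (mean - mu0) / (s / sqrt(n)) with n-1 degrees of freedom. A minimal Python sketch of that arithmetic follows; the function name and the sample values are made up for illustration, and the p-value is left out because the standard library has no t distribution.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # divisor n-1, as with VARDEF=DF
    stderr = sd / math.sqrt(n)         # standard error of the mean
    return (mean - mu0) / stderr, n - 1


sample = [9.0, 11.0, 10.0, 12.0, 8.0]  # hypothetical responses
t_stat, df = one_sample_t(sample, mu0=8)
print(t_stat, df)
```

Here mu0 plays the role of the MU0 option in PROC UNIVARIATE and of the value after the equal sign in the STATA .ttest command.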

Paired T-Test

PROC TTEST;
PAIRED pre*post;
RUN;

. ttest pre=post, level(95)

Note that the STATA .ttest command has no "paired" option: comparing two variables is treated as a paired test unless the unpaired option is specified. The SAS PAIRED statement is able to compare multiple pairs.
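A paired t-test is a one-sample t-test on the within-pair differences, tested against zero. The sketch below shows this in Python under that standard definition; paired_t and the pre/post values are hypothetical.

```python
import math
import statistics


def paired_t(pre, post):
    """Return the paired t statistic (pre minus post) and its df."""
    if len(pre) != len(post):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    stderr = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / stderr, n - 1


pre = [12.0, 15.0, 11.0, 14.0]    # hypothetical measurements
post = [10.0, 12.0, 10.0, 12.0]
t_paired, df_paired = paired_t(pre, post)
print(t_paired, df_paired)
```

Because only the differences enter the calculation, the degrees of freedom are the number of pairs minus one, not the total number of measurements minus two.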

The TTEST procedure reports two T statistics: one under the equal variance assumption (pooled) and the other for unequal variances. Check the equal variance test (folded F test) first. If the null hypothesis of equal variances is not rejected, read the T statistic and p-value of the pooled analysis. If it is rejected, read the T statistic and p-value of the Satterthwaite or Cochran/Cox approximation.

PROC TTEST COCHRAN;
CLASS male;
VAR response;
RUN;
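The quantities involved in this decision can be sketched with the textbook formulas: the folded F statistic (larger variance over smaller), the pooled t statistic, and the unequal-variance t statistic with Satterthwaite degrees of freedom. The Python below is an illustration only; two_sample_t and the data are hypothetical, and p-values and the Cochran/Cox approximation are not computed.

```python
import math
import statistics


def two_sample_t(x, y):
    """Return folded F, pooled t, and unequal-variance t with Satterthwaite df."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.mean(x), statistics.mean(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)

    # Equal variance test: larger sample variance over the smaller one.
    folded_f = max(v1, v2) / min(v1, v2)

    # Pooled analysis (equal variances assumed).
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Unequal variances with the Satterthwaite approximation of df.
    a, b = v1 / n1, v2 / n2
    t_unequal = (m1 - m2) / math.sqrt(a + b)
    df_satt = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

    return {
        "folded_f": folded_f,
        "t_pooled": t_pooled,
        "df_pooled": n1 + n2 - 2,
        "t_unequal": t_unequal,
        "df_satterthwaite": df_satt,
    }


group_a = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical responses
group_b = [2.0, 6.0, 10.0]
result = two_sample_t(group_a, group_b)
for name, value in result.items():
    print(name, value)
```

With unequal group sizes and variances, as in this made-up example, the two t statistics differ and the Satterthwaite degrees of freedom fall below the pooled n1+n2-2.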

STATA is able to conduct the t-test for two independent samples even when the data are arranged in two variables without a group variable. The unpaired option indicates that the two variables are independent, and the welch option asks STATA to produce Welch's approximation of the degrees of freedom. Note that STATA does not provide the Cochran/Cox approximation.

The FREQ statement in the TTEST procedure can handle aggregate data.

PROC TTEST H0=5 ALPHA=.01;
CLASS smoke;
VAR lung;
FREQ count;
RUN;
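Conceptually, a frequency variable makes each row count as that many identical observations. The Python sketch below illustrates the idea by computing the mean and variance from value/count pairs and checking them against the expanded data; freq_summary and the numbers are hypothetical, not output from PROC TTEST.

```python
import statistics


def freq_summary(values, counts):
    """Return n, mean, and sample variance of frequency-weighted data."""
    n = sum(counts)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    ss = sum(c * (v - mean) ** 2 for v, c in zip(values, counts))
    return n, mean, ss / (n - 1)


values = [1.0, 2.0, 3.0]   # hypothetical distinct responses
counts = [2, 3, 1]         # how often each response occurs
n, mean, variance = freq_summary(values, counts)

# Cross-check: expand the aggregate rows into one row per observation.
expanded = [v for v, c in zip(values, counts) for _ in range(c)]
print(n, mean, variance)
print(len(expanded), statistics.mean(expanded), statistics.variance(expanded))
```

Both lines report the same sample size, mean, and variance, which is why aggregate data with a count column can stand in for the full data set.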

The STATA .ttesti command enables you to conduct the t-test using aggregated descriptive statistics. The numbers listed are the number of observations, mean, and standard deviation of the first sample, followed by those of the second sample.

The one-way ANOVA design is often called the completely randomized design (CRD). SAS has the ANOVA, GLM (General Linear Model), and MIXED procedures for one-way ANOVA. Their basic usage is identical.

PROC ANOVA;
CLASS treatment;
MODEL response=treatment;
RUN;

STATA has the .anova and .oneway commands for one-way ANOVA.
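The F statistic that these procedures and commands report for a one-way layout is the between-group mean square divided by the within-group mean square. A minimal Python sketch of that decomposition follows; one_way_anova and the three groups are hypothetical, and the p-value is not computed.

```python
import statistics


def one_way_anova(groups):
    """Return SS between, SS within, their df, and the F statistic."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    group_means = [statistics.mean(group) for group in groups]

    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


# Hypothetical responses for three treatment groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
ss_between, ss_within, df_between, df_within, f_stat = one_way_anova(groups)
print(ss_between, ss_within, df_between, df_within, f_stat)
```

The groups need not have equal sizes for this calculation, which is consistent with the one-way ANOVA not depending on balanced data.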

You may add the MEANS statement to both the ANOVA and GLM procedures to compute group means and perform multiple comparison tests such as DUNCAN, TUKEY, DUNNETT, and BON (Bonferroni).

PROC GLM;
CLASS treatment;
MODEL response=treatment;
MEANS treatment /T DUNCAN;
RUN;

Randomized Complete Block (RCB): Treatments are assigned at random within blocks of adjacent subjects, each treatment once per block. The number of blocks is the number of replications. Any treatment can be adjacent to any other treatment, but not to the same treatment within the block.

Again, the ANOVA, GLM, and MIXED procedures conduct the two-way ANOVA with identical usage.

PROC GLM;
CLASS treat1 treat2;
MODEL response=treat1 treat2;
RUN;

In the randomized complete block design, you may have only one observation in each cell. In that case an interaction term cannot be separated from the error: including it uses up all the error degrees of freedom and produces meaningless results. Note that the sum of squares due to error (SSE) of the additive model is then equal to what would otherwise be the sum of squares of interaction (SSI).
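The equality of SSE and the interaction sum of squares with one observation per cell can be verified numerically. The Python sketch below does so for a small treatment-by-block table; rcb_sums_of_squares is a hypothetical helper, and the table reuses the six values from the first data line of the DO-loop example, arranged as two treatments by three blocks for illustration.

```python
import math


def rcb_sums_of_squares(table):
    """table[i][j] is the single response for treatment i in block j."""
    t, b = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (t * b)
    treat_means = [sum(row) / b for row in table]
    block_means = [sum(table[i][j] for i in range(t)) / t for j in range(b)]

    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)

    # Residual of the additive model: whatever treatment and block leave over.
    ss_error = ss_total - ss_treat - ss_block

    # Interaction sum of squares computed directly from the cell deviations.
    ss_interaction = sum(
        (table[i][j] - treat_means[i] - block_means[j] + grand) ** 2
        for i in range(t)
        for j in range(b)
    )
    return {"error": ss_error, "interaction": ss_interaction}


table = [
    [4.91, 4.76, 5.38],   # treatment 1 in blocks 1, 2, 3
    [4.63, 5.04, 6.21],   # treatment 5 in blocks 1, 2, 3
]
ss = rcb_sums_of_squares(table)
print(ss["error"], ss["interaction"])
print(math.isclose(ss["error"], ss["interaction"], abs_tol=1e-9))
```

Since the two quantities coincide, adding the interaction term to the model would leave nothing for the error term.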

You may compare group means using the MEANS or the LSMEANS (least squares means) statement. The LSMEANS statement is not available in the ANOVA procedure.

PROC ANOVA;
CLASS treatment block;
MODEL response=treatment block;
MEANS treatment block /TUKEY;
RUN;

PROC GLM;
CLASS treatment block;
MODEL response=treatment block;
LSMEANS treatment block /ADJUST=TUKEY;
RUN;

If there are subsamples, you need to use a nested scheme as follows, where sub(treatment) denotes subsamples nested within treatments.

PROC GLM;
CLASS treatment sub;
MODEL response=treatment sub(treatment);
RUN;

If there are subsamples (more than one observation in each cell) in a two-way ANOVA, you may consider the interaction effects. This is the two-way factorial design on CRD.

          Block1               Block2               Block3
Treat1    54, 67, 87           57, 67               31, 54, 87, 95
Treat2    35, 67               54, 87, 15, 75, 55   68, 17, 16, 68
Treat3    98, 45, 12, 57, 87   31, 14, 54           24, 87

The interaction is expressed by an asterisk (*). The bar (|) expands to all main effects and their interactions. Thus, the following procedures return the same result.

PROC ANOVA;
CLASS treatment block;
MODEL response=treatment | block;
RUN;

PROC GLM;
CLASS treatment block;
MODEL response=treatment block treatment*block;
RUN;
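The way the bar notation unfolds into main effects and interactions can be sketched as a small expansion rule. The Python below is an illustration of that rule, not SAS code; expand_bar is a hypothetical name.

```python
def expand_bar(factors):
    """Expand a list of factors joined by | into main effects and interactions."""
    effects = []
    for factor in factors:
        # Cross the new factor with every effect generated so far.
        crossed = [f"{effect}*{factor}" for effect in effects]
        effects = effects + [factor] + crossed
    return effects


two_way = expand_bar(["treatment", "block"])
three_way = expand_bar(["A", "B", "C"])
print(" ".join(two_way))
print(" ".join(three_way))
```

For two factors the expansion gives exactly the terms written out in the GLM model statement above; for three factors it gives all three main effects, the three two-way interactions, and the three-way interaction.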

As before, you may compare group means using the MEANS or the LSMEANS (least squares means) statement, now including the interaction cells. The LSMEANS statement is not available in the ANOVA procedure.

PROC ANOVA;
CLASS treatment block;
MODEL response=treatment | block;
MEANS treatment block treatment*block/TUKEY;
RUN;

PROC GLM;
CLASS treatment block;
MODEL response=treatment | block;
LSMEANS treatment | block /ADJUST=TUKEY;
RUN;

Two-Way Factorial Design on RCB

PROC GLM;
CLASS treat1 treat2 block;
MODEL response=treat1 treat2 block treat1*treat2;
RUN;

. anova response treatment block treatment*block

Three-Way Factorial Design on RCB

PROC GLM;
CLASS treat1 treat2 treat3 block;
MODEL response=treat1 treat2 treat3 block treat1*treat2 treat1*treat3 treat2*treat3 treat1*treat2*treat3;
RUN;

The Latin square design (LSD) has equal numbers of rows, columns, and treatments. Treatments are assigned at random within rows and columns, with each treatment appearing once per row and once per column. Each cell of the square table has only one observation.
The LSD is useful for controlling variation in two directions, rows and columns.

PROC GLM;
CLASS row column treatment;
MODEL response=row column treatment;
RUN;

. anova response row column treatment

The degrees of freedom of each main effect (row, column, and treatment) are r-1, where r is the number of rows or columns. The degrees of freedom of SSE are (r-1)(r-2). Finally, the degrees of freedom of SST are N-1 = r*r-1.
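The degrees of freedom of a Latin square must add up: three main effects with r-1 each, plus (r-1)(r-2) for error, equal r*r-1 in total. A quick Python check of that bookkeeping is sketched below; latin_square_df is a hypothetical helper and r=4 is an arbitrary example size.

```python
def latin_square_df(r):
    """Return the degrees of freedom of each source in an r x r Latin square."""
    return {
        "row": r - 1,
        "column": r - 1,
        "treatment": r - 1,
        "error": (r - 1) * (r - 2),
        "total": r * r - 1,
    }


df_table = latin_square_df(4)
parts = sum(df_table[name] for name in ("row", "column", "treatment", "error"))
print(df_table)
print(parts == df_table["total"])
```

The identity holds for any r because 3(r-1) + (r-1)(r-2) = (r-1)(r+1) = r*r-1.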

The following are examples of random effects models using the MIXED and GLM procedures.

PROC MIXED;
CLASS treat block;
MODEL response = treat /SOLUTION;
RANDOM block /SOLUTION;
RUN;

PROC MIXED COVTEST METHOD=TYPE3;
CLASS subject type; /* type is a characteristic of subject */
MODEL response = type /SOLUTION;
RANDOM subject(type) /SOLUTION;
LSMEANS type /DIFF;
RUN;

PROC GLM;
CLASS subject type; /* type is a characteristic of subject */
MODEL response = type subject(type);
RANDOM subject(type) /TEST;
RUN;

PROC MIXED COVTEST;
CLASS area block plant treat;
MODEL response = treat /SOLUTION;
RANDOM area plant area*plant block(area) /SOLUTION;
RUN;