Korea University
SAS Statement & Function
This page briefly explains how to use SAS statements and functions. It was split off from the SAS DATA step page in December 2005.

This web document may not be used for any commercial purposes. This page may contain some mistakes and errors. If you have any questions or suggestions, please leave a message on the SAS bulletin board.

Overview | DO LOOP | IF | SELECT | ARRAY | Operators | Math/Stats | String/Date | Probability & Randomization | FORMAT/LABEL | INFORMAT | RETAIN | PUT | FILE | WHERE | TITLE/OPTIONS

SAS STATEMENT OVERVIEW

A SAS program is a collection of SAS statements. SAS statements used in a DATA step are either executable or declarative.

SAS executable statements include ABORT, CALL, CONTINUE, DELETE, DESCRIBE, DISPLAY, DO, DO UNTIL, DO WHILE, ERROR, EXECUTE, FILE, IF-THEN/ELSE, INPUT, INFILE, GO TO, LEAVE, LINK, LIST, LOSTCARD, MERGE, MODIFY, OUTPUT, PUT, REDIRECT, REMOVE, REPLACE, RETURN, SELECT, SET, STOP, and UPDATE.

SAS declarative statements include ARRAY, ATTRIB, BY, CARDS, CARDS4, DATALINES, DATALINES4, DROP, END, FORMAT, INFORMAT, KEEP, LABEL, LENGTH, RENAME, RETAIN, WHERE, and WINDOW.

DO LOOP

SAS has loop structures of DO ... END, DO WHILE ... END, and DO UNTIL ... END.

  • DO; month=year*12; END;
  • DO str='Mon', 'Tue', 'Wed', 'Thu', 'Fri'; ...; END;
  • DO num=6, 2, 5; ...; END;
  • DO var=x1, x2, x3; ...; END;
  • DO i=1 TO 10; ...; i=i+2; END;
  • DO i=1 TO 10 BY 2; ...; END;
  • DO i=j TO k+1; ...; END;
DO WHILE(n>=10);
   ...;
END;

DO i=1 TO 10 WHILE(x<=5);
   ...;
END;

DO i=1 TO 10 BY 1 WHILE(week='Mon');
   ...;
END;

DO UNTIL(n>=10);
   ...;
END;

DO i=2, 4, 6, 7 UNTIL(x<=5);
   ...;
END;
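
DO WHILE and DO UNTIL differ in when the condition is checked: WHILE is evaluated at the top of the loop and UNTIL at the bottom, so the body of an UNTIL loop always runs at least once. A minimal sketch of this difference follows; the variable names n_while and n_until are made up for illustration.

```sas
DATA _NULL_;
   n_while = 0;
   DO WHILE(n_while > 0);    /* false at the top: the body never runs */
      n_while = n_while + 1;
   END;

   n_until = 0;
   DO UNTIL(n_until >= 0);   /* checked at the bottom: the body runs once */
      n_until = n_until + 1;
   END;

   PUT n_while= n_until=;    /* writes n_while=0 n_until=1 to the log */
RUN;
```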

IF STATEMENT
  • IF male=1; /* subsetting IF: keep only observations that meet the condition */
  • IF score; /* keep observations whose score is neither 0 nor missing */
  • IF male=1 THEN DELETE;
  • IF male=1 AND grade='F' THEN DELETE;
IF male=1 THEN gpa=gpa+1;
ELSE gpa=gpa+.5;

IF grade='A' THEN gpa=4.0;
ELSE IF grade='B' THEN gpa=3.5;

IF score LT 90 THEN DO;
   ...;
   END;
ELSE DO;
   ...;
   END;

SELECT STATEMENT
...
SELECT (condition);
   WHEN (1) x=x;
   WHEN (2) x=x*2;
   OTHERWISE x=x-1;
END;
...

...
SELECT (str);
   WHEN ('Sun') wage=wage*1.5;
   WHEN ('Sat') wage=wage*1.3;
   OTHERWISE DO; wage=wage+1; bonus=0; END;
END;
...
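
The fragments above can be placed in a complete DATA step as in the following sketch. The data set name wages and the sample values are assumptions for illustration. Note the OTHERWISE with a null statement: if no WHEN matches and there is no OTHERWISE, SAS stops the step with an error.

```sas
DATA wages;
   INPUT day $ wage;
   SELECT (day);
      WHEN ('Sun') wage = wage*1.5;
      WHEN ('Sat') wage = wage*1.3;
      OTHERWISE;             /* null statement: other days stay unchanged */
   END;
   DATALINES;
Sun 100
Sat 100
Mon 100
;
RUN;

PROC PRINT DATA=wages;
RUN;
```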

ARRAY STATEMENT

There are various ways of defining Arrays.

  • ARRAY score{3} korean math english;
  • ARRAY score{*} korean math english;
  • ARRAY score{3} x1-x3;
  • ARRAY score{3}; /* creates score1-score3 */
  • ARRAY score{3} korean math english (100 100 100);
  • ARRAY score{3} korean math english (100 2*100);
  • ARRAY names{3} $ first last middle ('f', 'l', 'm'); /* $ declares character elements */
  • ARRAY score{2:3} korean math;
  • ARRAY score{2,3} x1-x6; /* 2 by 3 */
  • ARRAY score{12} (2*1:6);

Referring to Array Elements

  • names{1}; names{2}
  • score{1}; score(1); score[1];
  • score{i};
  • score{i+1};

Array Functions

  • DIM(array)
  • HBOUND(array); LBOUND(array)
...
ARRAY score{*} score1-score5;
DO i=1 TO DIM(score); /* DO OVER works only with implicitly subscripted arrays */
   score{i}=score{i}*1.5;
END;

total=SUM(OF score{*});

DO i=1 TO DIM(score);
   IF score{i}<=50 THEN grade='F';
END;
...
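
As a self-contained illustration of DIM and SUM(OF array{*}), the following sketch counts failing scores in one record. The data set name scores, the variable n_fail, and the data values are hypothetical.

```sas
DATA scores;
   INPUT score1-score5;
   ARRAY score{*} score1-score5;
   n_fail = 0;
   DO i = 1 TO DIM(score);           /* DIM returns 5 here */
      IF score{i} <= 50 THEN n_fail = n_fail + 1;
   END;
   total = SUM(OF score{*});         /* sum of all array elements */
   DROP i;
   DATALINES;
90 45 70 50 85
;
RUN;
```

For this record, n_fail is 2 (the scores 45 and 50) and total is 340.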

SAS OPERATORS

SAS has arithmetic, relational, logical, and concatenation operators. It does not have an operator for the modulus but provides the MOD function instead.

  • Arithmetic: +, -, *, /, ** (exponentiation), >< (minimum), <> (maximum)
  • Relational: < or LT, > or GT, <= or LE, >= or GE, = or EQ, ^= or NE
  • Logical: & or AND, | or OR, ^ or NOT
  • Concatenation: ||
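
The following sketch exercises a few of these operators together with the MOD function. The variable names are arbitrary.

```sas
DATA _NULL_;
   r    = MOD(17, 5);                     /* remainder of 17/5: 2 */
   p    = 2**10;                          /* exponentiation: 1024 */
   lo   = 3 >< 8;                         /* minimum: 3 */
   hi   = 3 <> 8;                         /* maximum: 8 (the log notes <> is read as MAX) */
   full = 'SAS' || ' ' || 'DATA step';    /* concatenation */
   PUT r= p= lo= hi= full=;
RUN;
```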

MATHEMATICAL/STATISTICAL FUNCTIONS

Mathematical Functions

  • ABS(), MOD(number, modulo), SIGN(), SQRT()
  • EXP(); LOG(); LOG2(); LOG10()
  • CONSTANT(); CONSTANT('E'); CONSTANT('EULER'); CONSTANT('PI')
  • FACT(); DIGAMMA(), ERF(), ERFC(), GAMMA(), LGAMMA()
  • Time-series: DIF(); DIF2(); LAG(); LAG3()
  • Trigonometric: SIN(); COS(); TAN(); ARSIN(); ARCOS(); ATAN()
  • Truncation: CEIL(); FLOOR(); INT(); ROUND();
  • TRUNC(float, precision)

Statistical Functions

  • SUM(); MEAN(); MEDIAN(); SUM(x, y, z);MEAN(OF a1-a10);
  • MAX(), MIN(), RANGE(); IQR(); PCTL(75, OF x1-x100);
  • SMALLEST(rank, number); LARGEST(rank, number);
  • KURTOSIS(); SKEWNESS();
  • N(number of nonmissing arguments); NMISS(number of missing arguments)
  • STD(); STDERR(); VAR(); USS(uncorrected sum of squares); CV()

Conversion

  • INPUT(str, informat); INPUT('12/21/2005', MMDDYY10.)
  • INPUTC(str, char_informat); INPUTN(str, num_informat);
  • PUT(number, format); PUT(x, 5.2); PUTC(); PUTN()
  • FIPNAME(); FIPSTATE(); STFIPS(); STNAME()
  • ZIPFIPS(); ZIPNAME(); ZIPSTATE()
  • ATTRC(DSID, 'ENGINE'); ATTRC(DSID, 'CHARSET')
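
INPUT converts a character value to a number (or date) using an informat, and PUT converts a value to a character string using a format. A minimal sketch, with arbitrary variable names:

```sas
DATA _NULL_;
   d   = INPUT('12/21/2005', MMDDYY10.);  /* character -> SAS date value */
   s   = PUT(d, DATE9.);                  /* date value -> '21DEC2005' */
   num = INPUT('3.14', 8.);               /* character -> numeric 3.14 */
   txt = PUT(3.14159, 5.2);               /* numeric -> ' 3.14' (width 5) */
   PUT d= s= num= txt=;
RUN;
```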

STRING/DATE FUNCTIONS

SAS has various functions to handle string variables.

  • TRIM(); LTRIM(); RTRIM(); STRIP();
  • SUBSTR(str, start, length); SUBSTRN(str, start, length)
  • REPEAT(str, repetition)
  • UPCASE(); LOWCASE(); PROPCASE();
  • COMPBL(str); COMPRESS(str, str_to_be_removed)
  • LENGTH(); LENGTHN(); LENGTHC();
  • LEFT(); RIGHT();
  • ANYALNUM(); ANYALPHA(); ANYDIGIT(); ANYPUNCT(); ANYSPACE();
  • NOTALNUM(); NOTALPHA(); NOTDIGIT(); NOTUPPER()
  • FIND(str, str_key); FIND(str, str_key, start); FINDC();
  • REVERSE(str); COMPARE(str1, str2); COUNT(str, str_key)
  • MISSING(); RANK(); VERIFY(str, str_key)

Date and time related functions include:

  • YEAR(); QTR(date); MONTH(date); WEEK(date); WEEKDAY(date);
  • DATE(); DAY(date); DATEPART(date_time)
  • HOUR(time); MINUTE(time); SECOND(time); TIMEPART(date_time)
  • MDY(month, day, year); DHMS(date, hour, minute, second); HMS(hour, minute, second);
  • DATETIME(); TIME()
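
The following sketch builds a date with MDY and takes it apart again. The date and the variable names are chosen only for illustration.

```sas
DATA _NULL_;
   d  = MDY(12, 21, 2005);      /* SAS date value for December 21, 2005 */
   y  = YEAR(d);                /* 2005 */
   q  = QTR(d);                 /* 4 */
   m  = MONTH(d);               /* 12 */
   dd = DAY(d);                 /* 21 */
   wd = WEEKDAY(d);             /* 1=Sunday, ..., 7=Saturday; 4 for a Wednesday */
   dt = DHMS(d, 14, 30, 0);     /* datetime value for 2:30 PM on that date */
   same = (DATEPART(dt) = d);   /* 1: DATEPART recovers the date */
   PUT d=DATE9. y= q= m= dd= wd= same=;
RUN;
```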

PROBABILITY/RANDOMIZATION FUNCTIONS

SAS has various probability distribution functions.

  • PROBNORM(standard normal); PROBIT(quantile from standard normal)
  • POISSON(); PROBCHI(chi squared); PROBF(f); PROBT(T)
  • PROBBNML(binomial); PROBNEGB(negative binomial)
  • PROBBETA(beta); PROBGAM(gamma); PROBHYPR(hypergeometric)

Random Number Generators

  • RAND(); RANNOR(normal); RANUNI(uniform); RANPOI(poisson)
  • RANBIN(binomial); RANEXP(exponential); RANGAM(gamma);
  • RANCAU(cauchy); RANTBL(tabled); RANTRI(triangular)
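
A short sketch of a distribution function, its inverse, and the RAND generator follows. The seed 12345 passed to CALL STREAMINIT is an arbitrary choice, and the variable names are made up.

```sas
DATA _NULL_;
   p = PROBNORM(1.96);         /* P(Z <= 1.96), about 0.975 */
   z = PROBIT(0.975);          /* inverse of PROBNORM, about 1.96 */
   CALL STREAMINIT(12345);     /* fix the seed so RAND is reproducible */
   u = RAND('UNIFORM');        /* uniform on (0, 1) */
   x = RAND('NORMAL', 0, 1);   /* standard normal */
   PUT p= z= u= x=;
RUN;
```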

FORMAT & LABEL STATEMENTS

In order to label variables, include LABEL statements in a DATA step.

DATA js.egov_labled1;
SET js.egov;
LABEL male = 'Respondents'' Gender' zip = "Zip Code";
RUN;

In order to label values of variables, define value labels using the FORMAT procedure and include FORMAT statements in a DATA step.

PROC FORMAT;
VALUE male_label 1 = "Male" 0 = "Female";
VALUE $field_label 'Math' = "Mathematics" 'Stat'="Statistics";
RUN;

DATA js.egov_labled2;
SET js.egov;
FORMAT male male_label. field $field_label.;
RUN;
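
A format changes only how a value is displayed, not the stored value. The following sketch shows this with the PUT function and named output; the format name sexfmt and the variable txt are hypothetical.

```sas
PROC FORMAT;
   VALUE sexfmt 1 = 'Male' 0 = 'Female';
RUN;

DATA _NULL_;
   male = 1;
   txt  = PUT(male, sexfmt.);   /* character copy of the formatted value */
   PUT male= male=sexfmt. txt=; /* raw value, formatted value, character copy */
RUN;
```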

INFORMAT STATEMENT

An INFORMAT statement in a DATA step permanently associates an informat with a variable. When reading a string variable, INFORMAT can read up to 76 characters. In the following DATA step, the underscores in the name cannot be replaced by spaces, because list (free-format) input uses spaces to separate data values.

DATA js.grade;
INFORMAT id 7.0 name $35.;
INPUT id name;
DATALINES;
1234567 Hun_Myoung_Park
;
RUN;

RETAIN STATEMENT

The RETAIN statement causes variables created by INPUT or assignment statements to retain their values from one iteration of the DATA step to the next.

  • RETAIN; RETAIN _ALL_; /* all variables */
  • RETAIN var1 1 var2 1 var3 1; /* Initialize numeric variables */
  • RETAIN var1-var3 1; /* Initialize numeric variables */
  • RETAIN state birth 'IN'; /* Initialize character variables */
  • RETAIN var1-var3 (1); /* initial value only for var1 */
  • RETAIN var1-var3 (1 2 3); RETAIN var1-var3 (1:3);

The following example computes the sum of 1 through 100.

DATA _NULL_;
RETAIN sumx 0;
DO i=1 TO 100;
   sumx=sumx+i;
END;
FILE PRINT;
PUT 'The sum of 1 through 100 is ' sumx;
RUN;

The following example computes the sum of variable x.

DATA _NULL_;
SET survey END=last; RETAIN sumx 0;
sumx = sumx + x;
FILE PRINT;
IF last THEN PUT 'The sum of the variable x is ' sumx; /* print only after the last observation */
RUN;

The following example draws a random sample of 25 observations from a data set of 100 observations. This approach is, however, not recommended in a strict econometric sense. Instead, use the SURVEYSELECT procedure.

DATA sample;
SET survey;
RETAIN n 100 k 25;
prob=k/n;
IF UNIFORM(0) < prob THEN DO;
   OUTPUT;
   k=k-1;
END;
n = n-1;
RUN;
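
For comparison, a minimal sketch of the same task with the SURVEYSELECT procedure follows. It assumes SAS/STAT is installed; the synthetic survey data set and the seed value are made up for illustration.

```sas
/* Hypothetical input: 100 observations */
DATA survey;
   DO id = 1 TO 100;
      x = id*2;
      OUTPUT;
   END;
RUN;

/* Simple random sample of 25 observations without replacement */
PROC SURVEYSELECT DATA=survey OUT=sample METHOD=SRS SAMPSIZE=25 SEED=12345;
RUN;
```

METHOD=SRS requests simple random sampling, and a fixed SEED makes the selection reproducible.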

PUT STATEMENT

The PUT statement writes lines to the SAS log, the output window, or an external file specified in the most recent FILE statement.

Like the INPUT statement, PUT supports list output, formatted output, and named output. In addition, you can use column pointer controls (e.g., @n, @var, @exp, +n, +var, +exp, @@) and line pointer controls (e.g., #n, #var, #exp, /).

PUT 'Jeeshim and KUCC625''s Webpage';
PUT name id phone; /* list output */
PUT @15 name $10.; /* in the "$10." format at the 15th column */
PUT name $CHAR10. score 4.1 date mmddyy10.;
PUT math 3. +5 english 3. +5 average 5.2;
PUT name $15. / id 18-19 #2 phone $12.; /* next line */
PUT name $15. +3 id 18-19 #2 phone $12.; /* phone on the 2nd line */
PUT name= id= phone=; /* named output */
PUT name $15. ' was born on ' date mmddyy8.;
PUT name $15. OVERPRINT 15*'_'; /* Underline*/
PUT arr_var{*}; /* Write an array*/

FILE STATEMENT

The FILE statement specifies the current output file for PUT statements. FILE PRINT; sends the output to the output window, and FILE LOG; sends it to the log window, which is also where PUT writes when no FILE statement is given. You may also specify a file in which the output is stored in the ASCII text format (FILE 'c:\temp\orchid.txt';).

FILE PRINT;
FILE LOG DSD; /* Delimiter sensitive data */
FILE 'c:\temp\output.txt';
FILENAME a 'c:\temp\output.txt';...; FILE a;

Useful options include DELIMITER='delimiter'; NOFOOTNOTES; LINESIZE=n; LRECL=n; PAGESIZE=n; NOPRINT; and NOTITLES.

The following example collects specific statistics from a data set "prelim" containing SAS REG procedure outputs. The F value, R-squared, and three coefficients are extracted. Each SAS output has 29 lines in the same format.

DATA _NULL_;
   SET prelim;
   FILE 'c:\temp\reg_data.txt' NOTITLES NOPRINT;
   IF MOD(_N_+17,29)=0 THEN PUT F @@;
   IF MOD(_N_+12,29)=0 THEN PUT R @@;
   IF MOD(_N_+2,29)=0 THEN PUT B @@;
   IF MOD(_N_+1,29)=0 THEN PUT B @@;
   IF MOD(_N_,29)=0 THEN PUT B;
RUN;

Note that _NULL_ tells SAS not to create a data set, and the trailing @@ holds the output line so that all the statistics appear on the same line (record).

WHERE STATEMENT

The WHERE statement selects observations from SAS data sets when executing DATA and procedure steps. WHERE is not allowed, however, when reading external files or in-stream data with the DATALINES statement.

You may use arithmetic, relational, and logical operators when specifying conditions. WHERE also has the following operators of its own.

  • BETWEEN-AND: WHERE income BETWEEN 50 AND 60;
  • ? or CONTAINS: WHERE city CONTAINS 'ton';
  • IS NULL or IS MISSING: WHERE male IS MISSING;
  • LIKE: WHERE name LIKE 'A%'; /* names beginning with A; '_' matches any single character */
  • =*: WHERE name =* 'Park'; /* Sounds like 'Park' */
  • SAME-AND
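
The following sketch applies two of these operators to an existing SAS data set. The data set people and its values are hypothetical. Because WHERE cannot filter in-stream data, the data are read first and subset in a second step.

```sas
DATA people;
   INPUT name $ city $ income;
   DATALINES;
Anna Boston 55
Park Seoul 70
Alan Dayton 48
;
RUN;

DATA subset;
   SET people;
   /* keep incomes from 50 to 60, or names beginning with A */
   WHERE income BETWEEN 50 AND 60 OR name LIKE 'A%';
RUN;
```

Here subset keeps Anna and Alan and drops Park.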

TITLE/OPTIONS STATEMENT

The TITLE statement defines up to 10 title lines (TITLE or TITLE1 through TITLE10). A TITLE statement without text cancels that title line and all higher-numbered ones.

TITLE 'First Line';
TITLE2 'Second Line';
...
TITLE10 'Tenth Line';

The OPTIONS statement changes values of SAS system options. NOLABEL and NONUMBER suppress labels of variables and page numbers, respectively. FIRSTOBS and OBS respectively specify the first and last observations to be used in a procedure. When used together with WHERE, OBS=n selects the first n observations that meet the conditions specified in WHERE.

OPTIONS NODATE LINESIZE=80 PAGESIZE=55 NOCENTER NOLABEL MISSING='M';
OPTIONS PAGENO=1 NONUMBER;
OPTIONS FIRSTOBS=2 OBS=12;
OPTIONS LEFTMARGIN=1 RIGHTMARGIN=1 TOPMARGIN=5 BOTTOMMARGIN=5;