Indiana University   Hangeul KB Help Sitemap Calc Links

Import & Export

This page explains import and export of ASCII text, spreadsheet, and database files in SAS.

SAS icon

OVERVIEW

In UNIX, PROC IMPORT and PROC EXPORT must be executed under the X window. Otherwise, you have to add "noterminal" to a command (e.g., "sas -noterminal file_name.sas").

In SAS, there are various data sources as shown in the following figure. In general, SAS reads data using the INFILE statement and PROC IMPORT. You may use the PUT statement in a Data Step or PROC EXPORT to export data sets into external files. INFILE should be used in a DATA step, while PROC IMPORT and PROC EXPORT are independent procedures.

Data Source in SAS

The INFILE statement reads data directly using the DATALINES (CARDS) statement, imports various ASCII text files, and imports data sets through network (i.e., FTP and HTTP). INFILE and DATALINES also read data in a matrix form. PROC IMPORT reads ASCII text files, database (ACCESS, dBASE), and spreadsheet (Lotus 1-2-3, Excel).

If you have simple data, read them using INFILE and DATALINES; otherwise, use PROC IMPORT. Personally, I prefer the IMPORT Wizard to PROC IMPORT due to its user-friendly interface and flexibility. If you have a data set generated in other software packages (e.g., Excel, dBASE III, Paradox, Stata, and SPSS), use a data conversion utility like Stat/Transfer.

SAS icon

INFILE STATEMENT

The INFILE statement identifies external files to be read in the INPUT statement. It reads an ASCII text file that is typically delimited by a space (default), Tab, comma, or other delimiters. Let us take the following example first.

LIBNAME js 'c:\sas';
FILENAME egov 'c:\sas\egov.txt';

DATA js.seoul;
   INFILE egov LRECL=250 FIRSTOBS=7 OBS=700;
   INPUT name $ male training corrupt negative;
RUN;

SAS icon

INFILE OPTIONS

DATA js.miss;
INFILE DATALINES MISSOVER;
INPUT id korean english;

DATALINES;
01 87 65
02 86
03 95
04
RUN;

The above example reads four observations with missing. Note that "english" of the second observation is missing since consecutive blanks are interpreted as a delimiter. If MISSOVER option is omitted, SAS reads only two observations with data massed up; the "english" of the second observations is set "03". If STOPOVER option is used instead of MISSOVER, SAS will stop at the second data line since there is not enough data values for variables specified.

SAS icon

IMPORT & EXPORT PROCEDURES

Since the version 8.0, SAS provides the IMPORT procedure to read various types of external files, such as ACESS, EXCEL, and dBASE III. In previous versions, the DBF and DIF procedures deal with dBASE III and spreadsheet files, respectively.

The IMPORT procedure creates data sets from various types of external files. So an additional DATA step is not necessary unless the data set needs to be manipulated. In other words, the IMPORT procedure do not require the INPUT and DATALINES (or CARDS) statements.

There are two ways of using the IMPORT procedure. One is to write a IMPORT statement in a standard SAS program, the other is to use the IMPORT/EXPORT Wizard which is available under the FILE menu. Personally, I prefer the IMPORT Wizard to the procedure, since the former provides a user-friendly interface, high flexibility, and other useful features.

Is the IMPORT statement always better than the INFILE? It depends. For example, when you need to read a huge ACCESS file (say larger than 1GB), but only several variables are needed for analyses, the INFILE statement is recommended.

SAS icon

IMPORT & EXPORT WIZARD

The Import/Export Wizard provides extremely flexible ways of reading external data. The wizard works in the graphic user interface (GUI) environment such as X-window and Microsoft Windows. If your data file is messy and ill-organized, the wizard will be a good solution.

The Import/Export Wizard actually works in the same way that the Import and Export procedures in a SAS program do; The Wizard can generate SAS programs for Import and Export procedures.

SAS Import Wizard

In order to use the Wizard,

  1. Click "File" from the top menu bar and choose "Import Data..."
  2. Choose a file type from the list. You may choose "User-defined formats" for messy data.
  3. Locate an external file to be imported.
  4. Provide a library and SAS data set name.
  5. If you want SAS to create a SAS program file that is equivalent to what the Wizard does, specify the file name.
SAS Import Wizard
SAS icon

READ DIRECT DATA ENTRY

You can directly input data in a Data Step. INFILE DATALINES tells SAS to construct a data set from data lines after the DATALINES statement. The following example reads a numeric variable id, a string variable depart, and a numeric variable price. Note that data items are delimited (separated) by a space. In this case of the space-delimited format, INFILE DATALINES can be omitted.

DATA fruit;
INFILE DATALINES;
INPUT id depart $ employee;

DATALINES;
1 sales 16
2 finance 25
3 research 10
RUN;

SAS icon

IMPORT & EXPORT ASCII TEXT FILES

The above example implies that ASCII text data (egov.txt) are delimited with space (or blank). If a ASCII file is delimited by other than a space, you need to explicitly specify delimiter using DELIMITER (or DLM) option with using various characters or combinations of characters as a delimiter. For example, DLM='09'x for Tab, DLM='&' for ampersand , DLM='^' for caret, DLM=';' for semicolon, DLM='END', and so forth.

DATA js.seoul;
   INFILE egov.txt DELIMITER=',';
   INPUT name $ male training corrupt negative;
RUN;

The INFILE is also able to read the data stream of DATALINES (or CARDS). You may specify DATALINES instead of an external file name. Note that each data element in the following example is separated by caret (^).

DATA js.fruit;
INFILE DATALINES DELIMITER='^';
INPUT id fruit $ sales;

DATALINES;
1^Grape^100
2^Pear^77
RUN;

What if a comma is used as a delimiter and there is a missing value in the second data line? The DSD option is the answer. DSD reads a value as missing between two consecutive delimiters (,,). Without this option, the above program reads only first observation.

DATA fruit;
INFILE DATALINES DLM=',' DSD;
INPUT id fruit $ sales;

DATALINES;
1,Grape,100
2,,77
RUN;

If you want to export a data set into an external file, use PUT statement. PUT allows you to control the output format flexibly, but it is a bit difficult, especially for beginners, to use this statement correctly.

Now, let us use IMPORT and EXPORT procedures to handle ASCII text files. You need to specify delimiter of the ASCII text file in the DBMS option.

PROC IMPORT DATAFILE="c:\sas\ego.csv" OUT=jeeshim.egov DBMS=CSV REPLACE;
   GETNAMES=YES;
   DATAROW=2;
RUN;

The following example exports a data set to a space delimited ASCII text file. The DATA and the OUTFILE respectively specify a data set to be exported and an external file to which data are exported.

PROC EXPORT
   DATA=jeeshim.egov
   OUTFILE="c:\sas\ego.txt"
   DBMS=DLM REPLACE;
RUN;

SAS icon

IMPORT & EXPORT EXCEL FILES

PROC IMPORT imports spreadsheet files such as Excel and Lotus 1-2-3.

PROC IMPORT DATAFILE="c:\sas\assets.xls" OUT=js.pc DBMS=EXCEL2000 REPLACE;
   SHEET="computer";
   GETNAMES=YES;
RUN;

The following example exports a data set to an Excel file. Notice that there is only one semicolon at the end of EXPORT statement.

PROC EXPORT
   DATA=js.egov
   OUTFILE="c:\sas\egov.xls"
   DBMS=EXCEL2000 REPLACE;
RUN;

SAS icon

IMPORT & EXPORT dBASE III FILES

A dBase III file has ony one table that has a well defined data structure. You may omit the "DBMS=DBF" option since SAS can recognized file format with the extension, "dbf."

PROC IMPORT
   DATAFILE="c:\sas\comuter.dbf"
   OUT=js.pc
   DBMS=DBF REPLACE;
RUN;

The following example exports a data set to a dBase III file.

PROC EXPORT
   DATA=js.pc
   OUTFILE="c:\sas\pc.dbf"
   DBMS=DBF REPLACE;
RUN;
SAS icon

IMPORT & EXPORT ACCESS FILES

Unlike a dBASE III or FoxPro file, an Access file can has more than one table with security features. Thus, you have to provide database, table, account (identification), and password to access a database file.

PROC IMPORT TABLE="comuter" OUT=js.pc DBMS=ACCESS REPLACE;
   UID=""; PWD=""; WGDB="";
   DATABASE="c:\sas\asset.mdb";
RUN;

The EXPORT procedure can export a SAS data set to a Access table. Note that the OUTTABLE specifies a table name to be generated in the ACCESS file.

PROC EXPORT
   DATA=js.egov
   OUTTABLE="egov2000"
   DBMS=ACCESS2000 REPLACE;
   DATABASE="c:\sas\egov.mdb";
RUN;

SAS icon

USING NETWORK RESOURCES

SAS can access data through FTP, HTTP, and TCP/IP. This network functionality provides high flexibility and convenience in the information era.

The following example uses FTP to access the data resource. Note that FTP, USER, PASS, and HOST are reserved words for protocol, account name, password, and host computer name, respectively. The PROMPT specifies to prompt for the user login password, if necessary.

FILENAME myftp FTP 'gov.txt' USER='kucc625' PASS='xxxxx' PROMPT
   HOST='mdss.iu.edu' CD='/sas/egov';

DATA js.egov;
   INFILE myftp;
   INPUT year country $ domain index;
RUN;

The following examples use URL to access data files available on a website.

FILENAME myurl URL 'http://masil.org/archives/airline.txt';

DATA js.egov;
   INFILE myurl;
   INPUT airline year output cost fuel load;
RUN;

FILENAME myurl URL 'http://www.masil.org/archives/smoking.txt';

DATA js.smoking;
   INFILE myurl FIRSTOBS=34;
   INPUT state $ cigar bladder lung kidney leukemia area;
RUN;

In order to use the TCP/IP connection, you have to use the SOCKET instead of URL.

SAS icon

USING INFORMATS

If data include comma in numbers, dollar ($), and percent (%), you have to use informats such as COMMAn.n, DOLLARn.n, and PERCENTn.n in the formatted input style.

DATA sales;
INFILE DATALINES;
INPUT year quan COMMA9.0 sales DOLLAR10.0 rate PERCENT3.;

DATALINES;
2000 184,871 $2,875,879 80%
2001 875,877 $5,987,972 89%
RUN;

If data are in date informats, you may use MMDDYY10. and MMDDYY8. formats. Note that the period (.) in the third observation indicates a missing value.

DATA book;
INFILE DATALINES;
INPUT start MMDDYY10. end MMDDYY8.;

DATALINES;
08/12/1999 03/01/04
03/02/1987 02/25/91
03/02/1991 .
RUN;
SAS icon

READING MATRICES

SAS can also read such matrix forms as correlation coefficients (TYPE=CORR), covariance (TYPE=COV), and parameter estimators (TYPE=EST).

DATA corr_mat (TYPE=CORR);
INFILE DATALINES MISSOVER;
matrix ='Correlation';
INPUT _NAME_ $ var1 var2 var3;

DATALINES;
var1 1.00000
var2 0.25757 1.00000
var3 0.57844 0.54865 1.00000
RUN;

SAS icon

USING CLIPBOARD

SAS can read text data from and write text data to the clipboard.

You may import a worksheet of an Excel file. First highlight the part of worksheet in Excel and copy it into the Windows Clipboard. Suppose you choose 5 variables.

FILENAME clipboard CLIPBRD;

DATA excel;
   INFILE clipboard
   INPUT x1-x5;
RUN;
SAS icon

ACCESSING THROUGH ODBC

SAS can access data through the ODBC connection. You need to define a DSN for database or spreadsheet files in advance ("access_dsn").

  1. Run the "ODBC Data Source Administrator" at the Control Panel
  2. Click "System DSN" and then choose "Addd" to create a new DSN
  3. Select a proper driver (mdb, xls, etc.) and then provide DSN name and description
  4. Specify database, spreadsheet, or database server (local or SQL server name)
  5. Click "Advanced" to enter login ID and password, if necessary

The DSN is connected by the CONNECT statement in the SQL procedure. A series of SQL statements such as CREATE and SELECT follow the CONNECT statement.

PROC SQL;;
CONNECT TO ODBC AS db_con
   (DATASRC="access_dsn" USER=kucc625 PASSWORD=xxxx);
SELECT * FROM CONNECTION TO ODBC(SELECT * FROM egov2004);
...;
QUIT;

SAS icon

DATA CONVERSION UTILITIES

SAS cannot directly read customized data formats such as STATA *.dta. Accordingly, you have to export a data set in the software into a general file format (e.g., *.csv) and then import it into a SAS data set. Or you may use the COPY procedure to create a transport file that can be recognized by other software.

However, these tasks often become cumbersome and burdensome to most researchers. The most efficient solution in this case is to use professional data transferring utilities such as Stat/Transfer and DBMS/COPY that transfer data from one software to another.

These utilities support a variety of file formats such as SAS, Stata, Gauss, Rats, ACCESS, dBASE, FoxPro, Paradox, Lotus 1-2-3, Quattro, Excel, Sigmaplot, Minitab, and SPSS.

SAS icon

REFERENCES