PING
0.9
Statistical data handling and processing in production environment
|
Import (convert) a file from any format accepted by PROC IMPORT
into a SAS dataset.
ifn
: file name to import;fmt
: (option) format for import, i.e. extension of the input file; it can be any format (e.g., csv) accepted by the DBMS key in PROC import
;idir
: (option) input directory where the file is stored; note that it may be also passed directly in ifn
; default: empty, the location depends on ifn
;olib
: (option) output library where the dataset will be stored; by default, olib=WORK
is selected as output library;guessingrows
: flag set to the number of rows (observations) to be read so as to guess the type of a column (variable); guessingrows
must be an integer >0, or _MAX_
when all the rows need to be read so as to define the type of the variable that is imported (in practice, _MAX_
corresponds to the integer value 2147483647); default: guessingrows
is not set and the first row is used to guess the type of the variable;getnames
: boolean flag (yes/no
) set to import the variable names; default: getnames=yes
.odsn
: (option) name of the output dataset; otherwise, odsn
is automatically built from the basename of the input file name ifn
;_ods_
: (option) name (string) of the macro variable storing the name of the output dataset; useful when odsn
has not been set.Run macro %_example_file_import
for examples.
REPLACE
option is not specified, the PROC IMPORT
procedure does not overwrite an existing data set.G_PING_DEBUG=1
), the import process is aborted; still it can checked that the output dataset will be created with the correct name and location using the option _ods_
. Consider using this option for checking before actually importing.guessingrows=MAX
. For instance, let us consider the following CSV table: var1,var2,var3,var4,var5,var6 ,,,,, 1,,X,,1,1 X,1,,,, X,1,,,, ,,,,, ,,1,1,,the variables var5
and var6
(while equal) are imported as numerical and alphanumerical respectively, though they should bot be imported as numerical. This issue occurs when the file has been created on a Windows PC and SAS is running on Unix/Linux. One difference between both operating systems is the newline char. While Windows uses Carriage Return and Line Feed, the *nix-systems use only one of the chars. The problem can be solved in different manners:
PROC IMPORT
, INFILE
statement together with the TERMSTR
option, which shall take the value:TERMSTR=CRLF
(Carriage Return Line Feed) to read Windows formatted files (default on Windows platforms),TERMSTR=LF
(Line Feed) to read UNIX formatted files (default on UNIX systems),TERMSTR=CR
(Carriage Return) to read MAC formatted files;Instead, we will use the macro %handle_crlf
implemented by V.Nguyen which reads like:
which automatically detect line break options with termstr as CRLF
or LF
(also default) when importing data. This assumes SAS is running on a UNIX server with access to the head
and awk
commands. Original source code (no license, no disclaimer) is available at http://blog.nguyenvq.com/blog/2015/10/09/automatically-specify-line-break-options-with-termstr-as-crlf-or-lf-in-sas-when-importing-data/.