SAS
Get started with SAS and master its basic usage
SAS certification
SAS code for common statistical procedures
The Little SAS Book (4th edition), Chinese edition: code and data files
Common Statistical Methods for Clinical Research with SAS Examples, Second Edition
Getting started with SAS software
DATA uspresidents;
INPUT President $ Party $ Number;
DATALINES;
Adams F 2
Lincoln R 16
Grant R 18
Kennedy D 35
;
PROC PRINT DATA = uspresidents;
RUN;
baseball.xls (download: right-click and choose Save As)
PROC IMPORT OUT= WORK.baseball
DATAFILE= "C:\MyRawData\Baseball.xls"
DBMS=EXCEL REPLACE;
RANGE="Sheet1$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
DATA toads;
INFILE 'c:\MyRawData\ToadJump.dat';
INPUT ToadName $ Weight Jump1 Jump2 Jump3;
RUN;
* Print the data to make sure the file was read correctly;
PROC PRINT DATA = toads;
TITLE 'SAS Data Set Toads';
RUN;
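list input
Reads raw data values that are separated by spaces.
column input
Reads raw data arranged in fixed columns. Compared with list input, column input does not require spaces between values and allows character values that contain embedded spaces.
The raw data records in OnionRing.dat are read as follows: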
DATA sales;
INFILE 'c:\MyRawData\OnionRing.dat';
INPUT VisitingTeam $ 1-20 ConcessionSales 21-24 BleacherSales 25-28
OurHits 29-31 TheirHits 32-34 OurRuns 35-37 TheirRuns 38-40;
RUN;
* Print the data to make sure the file was read correctly;
PROC PRINT DATA = sales;
TITLE 'SAS Data Set Sales';
RUN;
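To make the column input behavior concrete, here is a minimal sketch that uses inline DATALINES instead of an external file. The data set name columndemo, the variable names, the column positions, and the two records are all hypothetical and are not taken from OnionRing.dat.

```sas
/* Hypothetical records: the team name contains a space, and the two
   numeric fields are packed together with no delimiter between them. */
DATA columndemo;
   /* Team occupies columns 1-15, Sales columns 16-19, Hits columns 20-21 */
   INPUT Team $ 1-15 Sales 16-19 Hits 20-21;
   DATALINES;
Plains Peanuts 121012
Gilroy Garlics  985 7
;
RUN;

PROC PRINT DATA = columndemo;
   TITLE 'Column Input Sketch';
RUN;
```

With list input (INPUT Team $ Sales Hits;) the same records would not read correctly, because the embedded space in the team name would be treated as a delimiter.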
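The following data come from a produce survey in which each gardener was asked to estimate the weight of their tomatoes, zucchini, peas, and grapes. The code below reads the data from the raw file Garden.dat and modifies it.
Garden.dat (download)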
DATA homegarden;
INFILE 'c:\MyRawData\Garden.dat';
INPUT Name $ 1-7 Tomato Zucchini Peas Grapes;
Zone = 14;
Type = 'home';
Zucchini = Zucchini * 10;
Total = Tomato + Zucchini + Peas + Grapes;
PerTom = (Tomato / Total) * 100;
RUN;
PROC PRINT DATA = homegarden;
TITLE 'Home Gardening Survey';
RUN;
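One point worth checking when a total is computed with the + operator, as Total is above: if any operand is missing, the result is missing, whereas the SUM function ignores missing arguments. The sketch below illustrates the difference with hypothetical inline records. The data set name totalsdemo and the variables TotalPlus and TotalSum are assumptions made for illustration.

```sas
/* Hypothetical records; the second one has a missing value for Peas */
DATA totalsdemo;
   INPUT Name $ Tomato Zucchini Peas Grapes;
   /* The + operator propagates missing values */
   TotalPlus = Tomato + Zucchini + Peas + Grapes;
   /* The SUM function adds only the nonmissing arguments */
   TotalSum = SUM(Tomato, Zucchini, Peas, Grapes);
   DATALINES;
Gregor 10 2 40 0
Susan 20 0 . 20
;
RUN;

PROC PRINT DATA = totalsdemo;
   TITLE 'Plus Operator versus SUM Function';
RUN;
```

For the second record TotalPlus is missing while TotalSum is 40. Which behavior is appropriate depends on whether a missing estimate should invalidate the total.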
DATA contest;
INFILE 'c:\MyRawData\Pumpkin.dat';
INPUT Name $16. Age 3. +1 Type $1. +1 Date MMDDYY10.
(Scr1 Scr2 Scr3 Scr4 Scr5) (4.1);
AvgScore = MEAN(Scr1, Scr2, Scr3, Scr4, Scr5);
DayEntered = DAY(Date);
Type = UPCASE(Type);
RUN;
PROC PRINT DATA = contest;
TITLE 'Pumpkin Carving Contest';
RUN;
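The basic form of the conditional IF-THEN statement is:
IF condition THEN action;
For example:
IF Model = 'Mustang' THEN Make = 'Ford';
UsedCars.dat (download)
The program below uses IF-THEN statements to fill in missing values and to create a new variable, Status.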
DATA sportscars;
INFILE 'c:\MyRawData\UsedCars.dat';
INPUT Model $ Year Make $ Seats Color $;
IF Year < 1975 THEN Status = 'classic';
IF Model = 'Corvette' OR Model = 'Camaro' THEN Make = 'Chevy';
IF Model = 'Miata' THEN DO;
Make = 'Mazda';
Seats = 2;
END;
RUN;
PROC PRINT DATA = sportscars;
TITLE "Eddy's Excellent Emporium of Used Sports Cars";
RUN;
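As a variation on the OR condition used above, the IN operator can test one variable against a list of values. The following is a small sketch on hypothetical inline records, not the contents of UsedCars.dat. The data set name carsdemo is an assumption.

```sas
/* Hypothetical records; a period marks a missing Make */
DATA carsdemo;
   INPUT Model $ Year Make $;
   /* Equivalent to: Model = 'Corvette' OR Model = 'Camaro' */
   IF Model IN ('Corvette', 'Camaro') THEN Make = 'Chevy';
   DATALINES;
Corvette 1955 .
Camaro 2000 .
Mustang 1966 Ford
;
RUN;

PROC PRINT DATA = carsdemo;
   TITLE 'IF-THEN with the IN Operator';
RUN;
```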
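Home.dat (download)
The program below creates a new variable, CostGroup, which assigns each record to one of four categories (high, medium, low, or missing) according to the value of Cost: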
DATA homeimprovements;
INFILE 'c:\MyRawData\Home.dat';
INPUT Owner $ 1-7 Description $ 9-33 Cost;
IF Cost = . THEN CostGroup = 'missing';
ELSE IF Cost < 2000 THEN CostGroup = 'low';
ELSE IF Cost < 10000 THEN CostGroup = 'medium';
ELSE CostGroup = 'high';
RUN;
PROC PRINT DATA = homeimprovements;
TITLE 'Home Improvement Cost Groups';
RUN;
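The order of the conditions matters here. SAS treats a numeric missing value as smaller than any number, so a missing Cost would satisfy Cost < 2000 if it were not tested first. The sketch below shows both variants side by side on hypothetical inline values. The data set name costdemo and the variables Checked and Unchecked are assumptions made for illustration.

```sas
DATA costdemo;
   INPUT Cost;
   /* Set the length explicitly so that no value is truncated */
   LENGTH Checked Unchecked $ 7;
   /* Missing is tested first, as in the program above */
   IF Cost = . THEN Checked = 'missing';
   ELSE IF Cost < 2000 THEN Checked = 'low';
   ELSE IF Cost < 10000 THEN Checked = 'medium';
   ELSE Checked = 'high';
   /* Without the missing test, a missing Cost falls into the low group */
   IF Cost < 2000 THEN Unchecked = 'low';
   ELSE IF Cost < 10000 THEN Unchecked = 'medium';
   ELSE Unchecked = 'high';
   DATALINES;
.
1500
5000
25000
;
RUN;

PROC PRINT DATA = costdemo;
   TITLE 'Effect of Testing for Missing Values First';
RUN;
```

The LENGTH statement is a precaution: without it, the length of a character variable is fixed by the first value assigned to it in the program.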
* Choose only comedies;
DATA comedy;
INFILE 'c:\MyRawData\Shakespeare.dat';
INPUT Title $ 1-26 Year Type $;
IF Type = 'comedy';
RUN;
PROC PRINT DATA = comedy;
TITLE 'Shakespearean Comedies';
RUN;
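The subsetting IF above keeps only the observations for which the condition is true. The same selection can be written with a DELETE statement, which drops the observations that match its condition. This is a minimal sketch on hypothetical inline records. The data set name comedydemo and the column positions are assumptions.

```sas
/* Title is read from columns 1-18; Type is read with list input */
DATA comedydemo;
   INPUT Title $ 1-18 Type $;
   /* Drop everything that is not a comedy */
   IF Type NE 'comedy' THEN DELETE;
   DATALINES;
Comedy of Errors   comedy
Hamlet             tragedy
Macbeth            tragedy
;
RUN;

PROC PRINT DATA = comedydemo;
   TITLE 'Subsetting with DELETE';
RUN;
```

The subsetting IF is usually the shorter choice when the rows to keep are easy to describe, and DELETE when the rows to exclude are.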
DATA librarycards;
INFILE 'c:\MyRawData\Library.dat' TRUNCOVER;
INPUT Name $11. +1 BirthDate MMDDYY10. +1 IssueDate ANYDTDTE10.
DueDate DATE11.;
DaysOverDue = TODAY() - DueDate;
Age = INT(YRDIF(BirthDate, TODAY(), 'ACTUAL'));
IF IssueDate > '01JAN2008'D THEN NewCard = 'yes';
RUN;
PROC PRINT DATA = librarycards;
FORMAT Issuedate MMDDYY8. DueDate WEEKDATE17.;
TITLE 'SAS Dates without and with Formats';
RUN;
* Using RETAIN and sum statements to find most runs and total runs;
DATA gamestats;
INFILE 'c:\MyRawData\Games.dat';
INPUT Month 1 Day 3-4 Team $ 6-25 Hits 27-28 Runs 30-31;
RETAIN MaxRuns;
MaxRuns = MAX(MaxRuns, Runs);
RunsToDate + Runs;
RUN;
PROC PRINT DATA = gamestats;
TITLE "Season's Record to Date";
RUN;
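The sum statement (RunsToDate + Runs;) retains its variable across iterations, initializes it to zero, and ignores missing values. The sketch below sets it beside an explicit equivalent built from RETAIN and the SUM function. The inline values, the data set name runsdemo, and the variables SumStmtTotal and ExplicitTotal are hypothetical.

```sas
DATA runsdemo;
   INPUT Runs;
   /* MaxRuns keeps its value between iterations and starts out missing */
   RETAIN MaxRuns;
   /* ExplicitTotal is retained and starts at zero */
   RETAIN ExplicitTotal 0;
   /* MAX ignores missing arguments, so the first Runs value becomes the maximum */
   MaxRuns = MAX(MaxRuns, Runs);
   /* Sum statement: implied RETAIN, initial value 0, missing Runs ignored */
   SumStmtTotal + Runs;
   /* Explicit equivalent of the sum statement */
   ExplicitTotal = SUM(ExplicitTotal, Runs);
   DATALINES;
4
10
.
7
;
RUN;

PROC PRINT DATA = runsdemo;
   TITLE 'Sum Statement and Its Explicit Equivalent';
RUN;
```

SumStmtTotal and ExplicitTotal hold the same running total on every observation, including the one where Runs is missing.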