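Getting started with SAS: understand SAS and learn its basic usage
SAS certification
SAS code for common statistical procedures
The Little SAS Book (4th edition), Chinese edition: code and data files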
Common Statistical Methods for Clinical Research with SAS Examples, Second Edition
Introduction to SAS software
DATA uspresidents;
INPUT President $ Party $ Number;
DATALINES;
Adams F 2
Lincoln R 16
Grant R 18
Kennedy D 35
;
proc print;
RUN;
PROC IMPORT OUT= WORK.baseball
DATAFILE= "C:\MyRawData\Baseball.xls"
DBMS=EXCEL REPLACE;
RANGE="Sheet1$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
DATA toads;
INFILE 'c:\MyRawData\ToadJump.dat';
INPUT ToadName $ Weight Jump1 Jump2 Jump3;
RUN;
* Print the data to make sure the file was read correctly;
PROC PRINT DATA = toads;
TITLE 'SAS Data Set Toads';
RUN;
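List input reads raw data values separated by spaces; column input reads raw data arranged in fixed columns. Compared with list input, column input has these advantages: values need not be separated by spaces, missing values can be left blank, character values can contain embedded spaces, and unwanted variables can be skipped.
The program below reads the raw data records in OnionRing.dat with column input: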
DATA sales;
INFILE 'c:\MyRawData\OnionRing.dat';
INPUT VisitingTeam $ 1-20 ConcessionSales 21-24 BleacherSales 25-28
OurHits 29-31 TheirHits 32-34 OurRuns 35-37 TheirRuns 38-40;
RUN;
* Print the data to make sure the file was read correctly;
PROC PRINT DATA = sales;
TITLE 'SAS Data Set Sales';
RUN;
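To make the contrast between list input and column input concrete, here is a minimal sketch that uses inline DATALINES instead of a raw file. The data set names (listdemo, coldemo), the variable names, and the values are hypothetical and are not taken from OnionRing.dat; the column positions are assumptions chosen for this illustration only.

```sas
* List input: values are separated by spaces and a period marks a missing value;
DATA listdemo;
   INPUT Team $ Sales Hits;
   DATALINES;
Owls 250 11
Hawks . 9
;
RUN;

* Column input: Team is in columns 1-12, Sales in 14-17, Hits in 19-20;
* The team names contain a space and the second Sales field is left blank;
DATA coldemo;
   INPUT Team $ 1-12 Sales 14-17 Hits 19-20;
   DATALINES;
Plains Owls   250 11
Gray Hawks         9
;
RUN;

PROC PRINT DATA = listdemo;
   TITLE 'List Input Sketch';
RUN;
PROC PRINT DATA = coldemo;
   TITLE 'Column Input Sketch';
RUN;
```

In coldemo the team names keep their embedded spaces and the blank Sales field in the second record is read as a missing value, which list input could not do without a period placeholder.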
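The following is produce-weight data: each gardener was asked to estimate the weight of their tomatoes, zucchini, peas, and grapes. The code below reads the data from the raw file Garden.dat and modifies it: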
DATA homegarden;
INFILE 'c:\MyRawData\Garden.dat';
INPUT Name $ 1-7 Tomato Zucchini Peas Grapes;
Zone = 14;
Type = 'home';
Zucchini = Zucchini * 10;
Total = Tomato + Zucchini + Peas + Grapes;
PerTom = (Tomato / Total) * 100;
RUN;
PROC PRINT DATA = homegarden;
TITLE 'Home Gardening Survey';
RUN;
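One detail of assignment statements like Total above is worth a small illustration: an arithmetic expression with + returns a missing value when any operand is missing, while the SUM function ignores missing arguments. The sketch below uses hypothetical inline data (not the contents of Garden.dat) and a hypothetical extra variable, TotalSum, to show the difference.

```sas
* Hypothetical data: the last record has a missing value for Peas;
DATA totals;
   INPUT Name $ Tomato Zucchini Peas Grapes;
   * The plus operator propagates missing values;
   Total = Tomato + Zucchini + Peas + Grapes;
   * The SUM function skips missing arguments;
   TotalSum = SUM(Tomato, Zucchini, Peas, Grapes);
   DATALINES;
Ana 10 2 40 0
Ben 20 0 . 20
;
RUN;

PROC PRINT DATA = totals;
   TITLE 'Plus Operator versus SUM Function';
RUN;
```

For the second record Total is missing, whereas TotalSum is 40. Which behavior is appropriate depends on whether a missing estimate should invalidate the total.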
DATA contest;
INFILE 'c:\MyRawData\Pumpkin.dat';
INPUT Name $16. Age 3. +1 Type $1. +1 Date MMDDYY10.
(Scr1 Scr2 Scr3 Scr4 Scr5) (4.1);
AvgScore = MEAN(Scr1, Scr2, Scr3, Scr4, Scr5);
DayEntered = DAY(Date);
Type = UPCASE(Type);
RUN;
PROC PRINT DATA = contest;
TITLE 'Pumpkin Carving Contest';
RUN;
The basic form of the conditional IF-THEN statement is: IF condition THEN action; for example: IF Model = 'Mustang' THEN Make = 'Ford';
The program below uses IF-THEN statements to fill in missing values and to create a new variable, Status:
DATA sportscars;
INFILE 'c:\MyRawData\UsedCars.dat';
INPUT Model $ Year Make $ Seats Color $;
IF Year < 1975 THEN Status = 'classic';
IF Model = 'Corvette' OR Model = 'Camaro' THEN Make = 'Chevy';
IF Model = 'Miata' THEN DO;
Make = 'Mazda';
Seats = 2;
END;
RUN;
PROC PRINT DATA = sportscars;
TITLE "Eddy's Excellent Emporium of Used Sports Cars";
RUN;
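The program below creates a new variable, CostGroup, that assigns each record to one of four categories based on the value of Cost: high, medium, low, or missing: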
DATA homeimprovements;
INFILE 'c:\MyRawData\Home.dat';
INPUT Owner $ 1-7 Description $ 9-33 Cost;
IF Cost = . THEN CostGroup = 'missing';
ELSE IF Cost < 2000 THEN CostGroup = 'low';
ELSE IF Cost < 10000 THEN CostGroup = 'medium';
ELSE CostGroup = 'high';
RUN;
PROC PRINT DATA = homeimprovements;
TITLE 'Home Improvement Cost Groups';
RUN;
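The order of the tests in the IF-THEN/ELSE chain above matters, because SAS treats a numeric missing value as smaller than any number. The following sketch, with hypothetical inline data and hypothetical variables Naive and Checked, shows what would happen if the test for missing were left out. The LENGTH statement is added here because a character variable otherwise takes its length from the first value assigned to it, which would truncate 'medium' to three characters in the naive chain.

```sas
* Hypothetical data: one ordinary cost, one missing cost, one large cost;
DATA costcheck;
   INPUT Cost;
   LENGTH Naive Checked $ 7;
   * Naive chain: a missing Cost satisfies Cost < 2000 and is labelled low;
   IF Cost < 2000 THEN Naive = 'low';
   ELSE IF Cost < 10000 THEN Naive = 'medium';
   ELSE Naive = 'high';
   * Checked chain: test for missing first, as in the program above;
   IF Cost = . THEN Checked = 'missing';
   ELSE IF Cost < 2000 THEN Checked = 'low';
   ELSE IF Cost < 10000 THEN Checked = 'medium';
   ELSE Checked = 'high';
   DATALINES;
1500
.
12000
;
RUN;

PROC PRINT DATA = costcheck;
   TITLE 'Effect of Testing for Missing Values First';
RUN;
```

For the second record Naive is 'low' while Checked is 'missing'; the other two records get the same label from both chains.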
* Choose only comedies;
DATA comedy;
INFILE 'c:\MyRawData\Shakespeare.dat';
INPUT Title $ 1-26 Year Type $;
IF Type = 'comedy';
RUN;
PROC PRINT DATA = comedy;
TITLE 'Shakespearean Comedies';
RUN;
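The statement IF Type = 'comedy'; in the program above is a subsetting IF: observations that fail the condition are dropped from the output data set. A roughly equivalent way to write it is an IF-THEN with DELETE. The sketch below uses a small hypothetical data set (plays) with shortened titles rather than Shakespeare.dat, and builds the subset both ways so the results can be compared.

```sas
* Hypothetical inline data standing in for the raw file;
DATA plays;
   INPUT Title $ Type $;
   DATALINES;
Hamlet tragedy
Tempest romance
Othello tragedy
Twelfth comedy
;
RUN;

* Subsetting IF keeps only the observations that meet the condition;
DATA comedy1;
   SET plays;
   IF Type = 'comedy';
RUN;

* DELETE removes the observations that do not meet the condition;
DATA comedy2;
   SET plays;
   IF Type NE 'comedy' THEN DELETE;
RUN;

PROC PRINT DATA = comedy1;
   TITLE 'Subsetting IF';
RUN;
PROC PRINT DATA = comedy2;
   TITLE 'IF-THEN DELETE';
RUN;
```

Both data sets contain the single comedy observation. The subsetting IF is usually shorter when stating what to keep, and DELETE when stating what to exclude.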
DATA librarycards;
INFILE 'c:\MyRawData\Library.dat' TRUNCOVER;
INPUT Name $11. +1 BirthDate MMDDYY10. +1 IssueDate ANYDTDTE10.
DueDate DATE11.;
DaysOverDue = TODAY() - DueDate;
Age = INT(YRDIF(BirthDate, TODAY(), 'ACTUAL'));
IF IssueDate > '01JAN2008'D THEN NewCard = 'yes';
RUN;
PROC PRINT DATA = librarycards;
FORMAT Issuedate MMDDYY8. DueDate WEEKDATE17.;
TITLE 'SAS Dates without and with Formats';
RUN;
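The date arithmetic above works because SAS stores a date as a number: the count of days since January 1, 1960. A date literal such as '01JAN2008'D is just a convenient way to write that number, and a format only changes how it is displayed. The sketch below is a minimal illustration with hypothetical variable names (Day0, Card, Shown).

```sas
* Hypothetical example of one date value shown with and without a format;
DATA datedemo;
   * The reference date is stored as 0;
   Day0 = '01JAN1960'D;
   * Stored as the number of days between the reference date and this date;
   Card = '01JAN2008'D;
   * Same stored value as Card but a format is attached in PROC PRINT;
   Shown = Card;
RUN;

PROC PRINT DATA = datedemo;
   FORMAT Shown WEEKDATE17.;
   TITLE 'One Date Value without and with a Format';
RUN;
```

Day0 prints as 0 and Card as 17532, while Shown prints as a weekday and calendar date even though it holds the same number as Card.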
* Using RETAIN and sum statements to find most runs and total runs;
DATA gamestats;
INFILE 'c:\MyRawData\Games.dat';
INPUT Month 1 Day 3-4 Team $ 6-25 Hits 27-28 Runs 30-31;
RETAIN MaxRuns;
MaxRuns = MAX(MaxRuns, Runs);
RunsToDate + Runs;
RUN;
PROC PRINT DATA = gamestats;
TITLE "Season's Record to Date";
RUN;
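To see what RETAIN and the sum statement do from one observation to the next, the following sketch repeats the same two statements on three hypothetical inline records instead of Games.dat. The data set name (running) and the values are made up for illustration.

```sas
* Hypothetical data: runs scored in three games;
DATA running;
   INPUT Runs;
   * RETAIN keeps MaxRuns from the previous observation instead of resetting it;
   RETAIN MaxRuns;
   * MAX ignores the initial missing value of MaxRuns;
   MaxRuns = MAX(MaxRuns, Runs);
   * The sum statement starts RunsToDate at 0 and retains it automatically;
   RunsToDate + Runs;
   DATALINES;
4
9
2
;
RUN;

PROC PRINT DATA = running;
   TITLE 'Running Maximum and Running Total';
RUN;
```

MaxRuns takes the values 4, 9, 9 and RunsToDate the values 4, 13, 15 across the three observations.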