Introduction to R

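R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. Its lexical scoping semantics derive from Scheme and its syntax from S, and R is generally regarded as a dialect of the S language (John Chambers and colleagues, Bell Labs, mid-1970s). R is sometimes called "GNU S": a free and effective language and environment for statistical computing and graphics. It provides a wide range of statistical and graphical techniques, including linear and nonlinear models, statistical tests, time series analysis, classification, and clustering.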

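About the S language

  • S is a programming language for statistics, developed at Bell Labs by John Chambers, Rick Becker, and Allan Wilks between 1975 and 1976.

  • S was developed with the aim of building a complete graphics system.

  • R and S-PLUS are its successors.

  • Interview with John Chambers (SRT subtitles)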


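About R

R was originally developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand (hence the name R), and is now developed by the "R Development Core Team". R is a GNU project based on the S language, so it can be regarded as an implementation of S, and much code written for S runs unaltered under R. R's syntax comes from S, while its lexical scoping semantics come from Scheme.

The R source code can be freely downloaded and used, and precompiled binary versions are also available. R runs on many platforms, including UNIX (FreeBSD and Linux among them), Windows, and macOS. R is operated mainly from the command line, although several graphical user interfaces have also been developed.

Milestones in the history of R: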

  • 1991: Created in New Zealand by Ross Ihaka and Robert Gentleman. Their experience developing R is documented in a 1996 JCGS paper.

  • 1993: First announcement of R to the public.

  • 1995: Martin Mächler convinces Ross and Robert to use the GNU General Public License to make R free software.

  • 1996: Public mailing lists are created (R-help and R-devel).

  • 1997: The R Core Group is formed (containing some people associated with S-PLUS). The core group controls the source code for R.

  • 2000: R version 1.0.0 is released.


Features of R

  • R is free!!!

  • Syntax is very similar to S, making it easy for S-PLUS users to switch over. Semantics are superficially similar to S, but in reality are quite different (more on that later).

  • Runs on almost any standard computing platform/OS (even on the PlayStation 3)

  • Frequent releases (annual + bugfix releases); active development.

  • Quite lean, as far as software goes; functionality is divided into modular packages

  • Graphics capabilities very sophisticated and better than most stat packages.

  • Useful for interactive work, but contains a powerful programming language for developing new tools (user -> programmer)

  • Very active and vibrant user community; R-help and R-devel mailing lists and Stack Overflow

  • R's joke


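About free software

With free software, you are granted four essential freedoms:

  • Freedom 0: The freedom to run the program as you wish, at any time and place, for any purpose.
  • Freedom 1: The freedom to study the program and change it so that it does your computing as you wish. Access to the source code is a precondition for this.
  • Freedom 2: The freedom to redistribute copies of the program.
  • Freedom 3: The freedom to distribute copies of your modified versions, so that the whole community can benefit from your improvements. Access to the source code is a precondition for this.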

http://www.fsf.org


Drawbacks of R

  • Essentially based on 40-year-old technology.

  • Little built-in support for dynamic or 3-D graphics (but things have improved greatly since the “old days”).

  • Many packages are available, but many of them come with no guarantee of quality.

  • Memory management is relatively weak.

  • Not ideal for all possible situations (but this is a drawback of all software packages).


Design of the R System

The R system is divided into 2 conceptual parts:

  1. The “base” R system that you download from CRAN

  2. Everything else.

R functionality is divided into a number of packages.

  • The “base” R system contains, among other things, the base package which is required to run R and contains the most fundamental functions.

  • The other packages contained in the “base” system include utils, stats, datasets, graphics, grDevices, grid, methods, tools, parallel, compiler, splines, tcltk, stats4.

  • There are also “Recommended” packages: boot, class, cluster, codetools, foreign, KernSmooth, lattice, mgcv, nlme, rpart, survival, MASS, spatial, nnet, Matrix.
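
The split between base and recommended packages can be checked from within R itself. The following is a minimal sketch, assuming a standard R installation; the helper name `pkgs_with_priority` is hypothetical and introduced only for illustration, while `installed.packages()` and `search()` are standard functions. The exact lists depend on the R version and on how R was installed, and some minimal installations omit the recommended packages.

```r
# Hypothetical helper (not part of R): return the names of installed
# packages that carry a given priority, e.g. "base" or "recommended".
pkgs_with_priority <- function(priority) {
  unname(installed.packages(priority = priority)[, "Package"])
}

# Packages that make up the "base" R system
base_pkgs <- pkgs_with_priority("base")
print(base_pkgs)

# "Recommended" packages shipped with most R distributions
# (may be empty on a minimal installation)
recommended_pkgs <- pkgs_with_priority("recommended")
print(recommended_pkgs)

# Packages attached in the current session, in search order
print(search())
```

Only some of the base packages are attached when R starts; the others, such as `parallel` or `splines`, need an explicit `library()` call before their functions can be used without a prefix.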


Some Useful Books on S/R

These are old, but all are classics:

Standard texts

  • Chambers (2008). Software for Data Analysis, Springer. (your textbook)

  • Chambers (1998). Programming with Data, Springer.

  • Venables & Ripley (2002). Modern Applied Statistics with S, Springer.

  • Venables & Ripley (2000). S Programming, Springer.

  • Pinheiro & Bates (2000). Mixed-Effects Models in S and S-PLUS, Springer.

  • Murrell (2005). R Graphics, Chapman & Hall/CRC Press.

Other resources

How to learn R?

There is no fixed answer; the discussion on Zhihu is worth a look.
