Course Quiz 1: Reference Answer

The data are air-pollutant records from 332 monitoring stations in one country. Each record has three variables:

Date: the date of the observation, in YYYY-MM-DD (year-month-day) format
sulfate: the level of sulfate PM in the air on that date (measured in micrograms per cubic meter)
nitrate: the level of nitrate PM in the air on that date (measured in micrograms per cubic meter)

The folder contains 332 CSV files; each file is named after the ID of the corresponding monitoring station.

Requirements:

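Write a function with three arguments, data_mean(dir, pollutant, id = 1:332), that returns the mean of pollutant over the specified monitoring stations, ignoring missing values. dir is the directory containing the data, pollutant is either "sulfate" or "nitrate", and id is a vector of station IDs.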

Hint: functions that may be useful:

paste
formatC
read.csv
is.na
sapply
unlist
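
As a rough illustration of how some of these hints fit together, the sketch below builds file names and filters missing values. It assumes the station files use three-digit, zero-padded names (the same padding the reference solution's pollutants function applies); the sample IDs and the vector x are made up for illustration.

```r
# Zero-pad station IDs to three digits, then append the extension.
ids <- c(1, 23, 332)
file_names <- paste(formatC(ids, width = 3, flag = "0"), ".csv", sep = "")
print(file_names)
# [1] "001.csv" "023.csv" "332.csv"

# mean() returns NA when any value is missing, so filter with is.na() first.
x <- c(2.5, NA, 3.5)            # made-up values for illustration
mean_with_na <- mean(x)         # NA
mean_clean <- mean(x[!is.na(x)])  # 3
```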

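# Return the non-missing values of one pollutant for a single station.
pollutants <- function(directory, pollutant, id) {
    # Station files are named with zero-padded IDs, e.g. 1 -> "001.csv".
    fileName <- paste(formatC(id, width = 3, flag = "0"), ".csv", sep = "")
    filePath <- file.path(directory, fileName)
    frame <- read.csv(filePath)
    pollutantData <- frame[[pollutant]]
    # Drop NA entries so they do not affect the mean.
    pollutantData[!is.na(pollutantData)]
}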


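# Mean of a pollutant, pooled over all valid observations from the given stations.
data_mean <- function(directory, pollutant, id = 1:332) {
    # Collect the valid readings of each station.
    pollutantLists <- sapply(id, function(i) pollutants(directory, pollutant, i))
    # Flatten into one vector; named allValues so it does not shadow pollutants().
    allValues <- unlist(pollutantLists)
    mean(allValues)
}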
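
One design point worth noting: data_mean pools every valid observation before averaging, so stations with more readings carry more weight. The sketch below contrasts that with a mean of per-station means. The two stations and their readings are hypothetical, not taken from the data set.

```r
# Made-up readings for two hypothetical stations (not from the data set).
station_a <- c(1, 2, 3, 4)   # four valid observations
station_b <- c(10)           # one valid observation

# Pooled mean, as data_mean computes it: every observation counts equally.
pooled <- mean(unlist(list(station_a, station_b)))

# Mean of per-station means: every station counts equally.
mean_of_means <- mean(c(mean(station_a), mean(station_b)))

print(pooled)         # 4
print(mean_of_means)  # 6.25
```

The two results differ whenever stations contribute unequal numbers of valid observations.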

data_mean("pollution_data", "sulfate", 1:10)

## [1] 4.064

data_mean("pollution_data", "nitrate", 67:72)

## [1] 1.186

data_mean("pollution_data", "sulfate", 23)

## [1] 0.8311
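
For comparison, here is a possible alternative sketch, not the reference answer. The name data_mean_alt is hypothetical; it swaps in sprintf for formatC, binds the station files into one data frame, and relies on mean(na.rm = TRUE) instead of filtering with is.na. The demo directory and its two tiny CSV files are made up so the example can run without the real data.

```r
# Hypothetical alternative: bind all station files, then average once.
data_mean_alt <- function(directory, pollutant, id = 1:332) {
    # Assumes three-digit, zero-padded file names such as "001.csv".
    paths <- file.path(directory, sprintf("%03d.csv", id))
    frames <- lapply(paths, read.csv)
    combined <- do.call(rbind, frames)
    mean(combined[[pollutant]], na.rm = TRUE)
}

# Build a throwaway directory with two tiny made-up station files.
demo_dir <- file.path(tempdir(), "pollution_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = c("2010-01-01", "2010-01-02"),
                     sulfate = c(1, NA), nitrate = c(0.5, 1.5)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(Date = "2010-01-01", sulfate = 3, nitrate = NA_real_),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

demo_sulfate <- data_mean_alt(demo_dir, "sulfate", 1:2)  # 2
demo_nitrate <- data_mean_alt(demo_dir, "nitrate", 1:2)  # 1
```

Like data_mean, this version pools all valid observations before averaging, so the two should agree on the same input.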
