The data are air-pollutant records from 332 monitoring stations in one country, with three variables:
Date: the date of the observation in YYYY-MM-DD format (year-month-day)
sulfate: the level of sulfate PM in the air on that date (measured in micrograms per cubic meter)
nitrate: the level of nitrate PM in the air on that date (measured in micrograms per cubic meter)
The folder contains 332 CSV files; each file is named after the ID of the corresponding monitoring station.
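The solution code further down zero-pads each station ID to three digits when it builds the file name, so station 1 is read from 001.csv. The mapping can be sketched as follows, assuming the files really follow that naming; the helper name station_file is hypothetical and not part of the exercise:

```r
# Map station IDs to file names, assuming three-digit zero-padded names
# such as "001.csv" (the same padding the solution code applies).
# `station_file` is a hypothetical helper used only for illustration.
station_file <- function(id) {
  sprintf("%03d.csv", as.integer(id))
}

station_file(c(1, 23, 332))
## [1] "001.csv" "023.csv" "332.csv"
```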
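Requirements:
Write a function that takes three arguments,
data_mean(dir, pollutant, 1:332)
which returns the mean of pollutant across the specified monitoring stations. dir is the directory containing the data, and pollutant is either sulfate or nitrate.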
Hint: functions you may find useful:
# Read one station's file and return its non-missing values for `pollutant`.
pollutants <- function(directory, pollutant, id) {
  # Station IDs are zero-padded to three digits, e.g. 1 -> "001.csv".
  fileName <- paste(formatC(id, width = 3, flag = "0"), ".csv", sep = "")
  filePath <- file.path(directory, fileName)
  frame <- read.csv(filePath)
  pollutantData <- frame[[pollutant]]
  pollutantData[!is.na(pollutantData)]
}
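# Pool the non-missing values from every requested station and average them.
data_mean <- function(directory, pollutant, id = 1:332) {
  # lapply always returns a list, whereas sapply may simplify to a matrix.
  pollutantLists <- lapply(id, function(i) pollutants(directory, pollutant, i))
  # Named `values` so the local variable does not shadow the pollutants() helper.
  values <- unlist(pollutantLists)
  mean(values)
}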
data_mean("pollution_data", "sulfate", 1:10)
## [1] 4.064
data_mean("pollution_data", "nitrate", 67:72)
## [1] 1.186
data_mean("pollution_data", "sulfate", 23)
## [1] 0.8311
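Note that data_mean pools every non-missing reading from the requested stations before averaging, so stations with more records weigh more; this is not the same as averaging the per-station means. The sketch below illustrates the difference on a tiny synthetic directory, which is invented for the demonstration and is not the course data. The names read_pollutant, pooled_mean and demo_dir are hypothetical, and the files are assumed to use three-digit zero-padded names.

```r
# Helper and pooled-mean function, repeated here so the sketch runs on its own.
read_pollutant <- function(directory, pollutant, id) {
  path <- file.path(directory, sprintf("%03d.csv", as.integer(id)))
  values <- read.csv(path)[[pollutant]]
  values[!is.na(values)]  # drop missing readings
}

pooled_mean <- function(directory, pollutant, id) {
  mean(unlist(lapply(id, function(i) read_pollutant(directory, pollutant, i))))
}

# Synthetic data (NOT the course data): two stations with unequal record counts.
demo_dir <- file.path(tempdir(), "demo_pollution")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = c("2020-01-01", "2020-01-02", "2020-01-03"),
                     sulfate = c(1, 2, NA),
                     nitrate = c(0.5, NA, 0.7)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(Date = "2020-01-01", sulfate = 9, nitrate = 0.3),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

# Pooled mean over all non-missing sulfate readings: (1 + 2 + 9) / 3
pooled <- pooled_mean(demo_dir, "sulfate", 1:2)

# Mean of the per-station means: (1.5 + 9) / 2
per_station <- sapply(1:2, function(i) mean(read_pollutant(demo_dir, "sulfate", i)))
mean_of_means <- mean(per_station)

pooled
## [1] 4
mean_of_means
## [1] 5.25
```

Which of the two quantities is wanted depends on the question being asked; the exercise's reference outputs correspond to the pooled version.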