impute_dt
Description
Impute the columns of a data.frame with their mean, median or mode.
Usage
impute_dt(.data, ..., .func = "mode")
Arguments
| .data | A data.frame |
|---|---|
| ... | Columns to select |
| .func | Character: "mode" (default), "mean" or "median". A user-defined function can also be supplied. |
Examples
library(tidyfst)  # assumed package: exports impute_dt() and the %>% pipe

# Small Titanic-style data set with missing values in every column
Pclass <- c(3, 1, 3, 1, 3, 2, 2, 3, NA, NA)
Sex <- c('male', 'male', 'female', 'female', 'female',
         'female', NA, 'male', 'female', NA)
Age <- c(22, 38, 26, 35, NA,
         45, 25, 39, 28, 40)
SibSp <- c(0, 1, 3, 1, 2, 3, 2, 2, NA, 0)
Fare <- c(7.25, 71.3, 7.92, NA, 8.05, 8.46, 51.9, 60, 32, 15)
Embarked <- c('S', NA, 'S', 'Q', 'Q', 'S', 'C', 'S', 'C', 'S')
data <- data.frame('Pclass' = Pclass,
                   'Sex' = Sex, 'Age' = Age, 'SibSp' = SibSp,
                   'Fare' = Fare, 'Embarked' = Embarked)

data
data %>% impute_dt() # default uses "mode" as `.func`
data %>% impute_dt(is.numeric, .func = "mean")
data %>% impute_dt(is.numeric, .func = "median")

# Custom imputation: fill missing values with half the observed range
my_fun = function(x){
  x[is.na(x)] = (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))/2
  x
}
data %>% impute_dt(is.numeric, .func = my_fun)
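To make the three built-in `.func` options concrete, here is a minimal base R sketch of what imputing a single column by mode, mean or median could look like. It is not the package's implementation: `stat_mode()` and `impute_vec()` are hypothetical helpers introduced only for illustration, and the tie-breaking rule (the first value encountered wins) is an assumption.

```r
# Hypothetical helper: most frequent non-missing value (ties go to the first seen).
stat_mode <- function(x) {
  ux <- unique(x[!is.na(x)])
  # match() maps each element to its position in ux; tabulate() ignores NA.
  ux[which.max(tabulate(match(x, ux)))]
}

# Hypothetical helper: replace NA in one vector with a summary of the observed values.
impute_vec <- function(x, how = c("mode", "mean", "median")) {
  how <- match.arg(how)
  fill <- switch(how,
    mode   = stat_mode(x),
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE)
  )
  x[is.na(x)] <- fill
  x
}

impute_vec(c(3, 1, 3, NA))               # mode works for any vector type
impute_vec(c("S", NA, "S", "Q"))         # including character columns
impute_vec(c(22, 38, NA), how = "mean")  # mean and median need numeric input
```

A custom function passed as `.func`, like `my_fun` in the examples, follows the same shape: it takes one column and returns it with the missing values filled.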
